Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection

Created by
  • Haebom

Authors

Ruijun Feng, Hammond Pearce, Pietro Liguori, Yulei Sui

Outline

This paper presents CGP-Tuning, a code graph-enhanced, structure-aware soft prompt tuning method for software vulnerability detection. Existing fine-tuning techniques tend to miss the structural information in source code. To address this, CGP-Tuning introduces type-aware embeddings that capture the rich semantic information in the code graph (e.g., control and data flow), together with an efficient cross-modal alignment module that integrates graph-text interactions at linear computational cost. Experiments on the DiverseVul dataset with three state-of-the-art open-source code LLMs (CodeLlama, CodeGemma, and Qwen2.5-Coder) show that CGP-Tuning delivers model-independent performance improvements while maintaining practical inference speed. It outperforms the existing state-of-the-art graph-enhanced soft prompt tuning techniques by 4% on average and untuned zero-shot prompting by 15%.
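
The summary above does not describe how the two components are built, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that each node and edge type in the code graph gets its own trainable vector, and that a fixed number of soft prompt vectors act as attention queries over the graph and then over the text. Under that assumption the number of attention scores grows as P × (N + M) for P prompt vectors, N graph nodes, and M text tokens, rather than N × M for full pairwise graph-text attention. The type names, dimensions, and function names are all invented for illustration.

```python
import math
import random

# Hypothetical type vocabularies; the real node/edge types are not given in this summary.
NODE_TYPES = ["function", "call", "identifier", "literal"]
EDGE_TYPES = ["control_flow", "data_flow"]
DIM = 8          # embedding width (illustrative)
NUM_PROMPTS = 4  # number of soft prompt vectors (illustrative)

rng = random.Random(0)


def rand_vec(dim=DIM):
    """Stand-in for a trainable parameter vector."""
    return [rng.uniform(-0.1, 0.1) for _ in range(dim)]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


# One vector per type: the "type-aware" part of this sketch.
node_type_emb = {t: rand_vec() for t in NODE_TYPES}
edge_type_emb = {t: rand_vec() for t in EDGE_TYPES}


def embed_graph(nodes, edges):
    """One vector per node: its type embedding plus the mean of the
    type embeddings of its incident edges (assumed aggregation)."""
    incident = [[] for _ in nodes]
    for src, dst, etype in edges:
        incident[src].append(etype)
        incident[dst].append(etype)
    out = []
    for ntype, etypes in zip(nodes, incident):
        vec = list(node_type_emb[ntype])
        for etype in etypes:
            vec = add(vec, [x / len(etypes) for x in edge_type_emb[etype]])
        out.append(vec)
    return out


def attend(queries, keys):
    """Softmax dot-product attention; computes len(queries) * len(keys) scores."""
    result = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(DIM) for k in keys]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        result.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) / total for d in range(DIM)]
        )
    return result


def build_soft_prompt(seeds, graph_vecs, text_vecs):
    """Fixed-size prompt attends to the graph, then to the text:
    P * (N + M) scores in total, linear in N and M."""
    graph_aware = [add(p, g) for p, g in zip(seeds, attend(seeds, graph_vecs))]
    return [add(p, t) for p, t in zip(graph_aware, attend(graph_aware, text_vecs))]


# Toy code graph: four typed nodes, three typed edges (src, dst, edge_type).
nodes = ["function", "call", "identifier", "literal"]
edges = [(0, 1, "control_flow"), (2, 1, "data_flow"), (3, 1, "data_flow")]

graph_vecs = embed_graph(nodes, edges)
text_vecs = [rand_vec() for _ in range(6)]  # stand-in for LLM token embeddings
prompt_seeds = [rand_vec() for _ in range(NUM_PROMPTS)]

soft_prompt = build_soft_prompt(prompt_seeds, graph_vecs, text_vecs)
print(len(soft_prompt), len(soft_prompt[0]))  # 4 8
```

In a soft prompt tuning setup, vectors like `soft_prompt` would be prepended to the frozen LLM's input embeddings and only the small set of prompt-side parameters would be trained. The sketch leaves training out and uses random values in place of learned ones.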

Takeaways, Limitations

Takeaways:
The paper demonstrates that the structural information in code graphs can be used effectively to improve software vulnerability detection.
It presents a method that improves performance while keeping computational cost linear, using type-aware embeddings and an efficient cross-modal alignment module.
The method is model-independent and applies to various open-source code LLMs.
It improves on untuned zero-shot prompting by 15%.
Limitations:
The method was evaluated only on the DiverseVul dataset and three code LLMs, so its generalization to other datasets and LLMs requires further study.
The complexity of generating and processing code graphs is not discussed.
Further research is needed on scalability issues that may arise when the method is applied to large real-world software systems.