This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
Created by
Haebom
Author
Kishan Maharaj, Vitobha Munigala, Srikanth G. Tamilselvam, Prince Kumar, Sayandeep Sen, Palani Kodeswaran, Abhijit Mishra, Pushpak Bhattacharyya
Outline
This paper introduces CodeSumEval, a dataset of roughly 10K samples, and the Entity Tracing Framework (ETF) to address hallucination in code summaries generated by large language models (LLMs). CodeSumEval is dedicated to detecting hallucinations in code summaries. ETF uses static program analysis to identify code entities, then uses LLMs to map those entities and their intents to the generated summary and verify them. In experiments, ETF achieves an F1 score of 73%, which suggests it is effective both for assessing the correctness of a code summary and for localizing errors within it.
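The entity-tracing idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python source code, uses the standard `ast` module as a stand-in for static program analysis, and assumes that entity mentions in a summary are wrapped in backticks. The function names (`extract_code_entities`, `extract_summary_entities`, `trace_entities`) and the sample data are hypothetical. It only checks whether a mentioned entity exists in the code; the intent verification that the paper assigns to an LLM is not covered.

```python
import ast
import re


def extract_code_entities(source: str) -> set:
    """Collect function, class, argument, variable, and attribute names from the AST."""
    entities = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            entities.add(node.name)
        elif isinstance(node, ast.arg):
            entities.add(node.arg)
        elif isinstance(node, ast.Name):
            entities.add(node.id)
        elif isinstance(node, ast.Attribute):
            entities.add(node.attr)
    return entities


def extract_summary_entities(summary: str) -> set:
    """Assumption: identifiers wrapped in backticks are the summary's entity mentions."""
    return set(re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", summary))


def trace_entities(source: str, summary: str) -> dict:
    """Split the mentioned entities into those found in the code and those not found."""
    code_entities = extract_code_entities(source)
    mentioned = extract_summary_entities(summary)
    return {
        "grounded": sorted(mentioned & code_entities),
        "untraced": sorted(mentioned - code_entities),
    }


# Hypothetical example: the summary mentions `discount`, which the code never defines.
CODE = '''
def total_price(items, tax_rate):
    subtotal = sum(item.price for item in items)
    return subtotal * (1 + tax_rate)
'''
SUMMARY = (
    "`total_price` sums each item's `price`, applies `tax_rate`, "
    "and subtracts `discount`."
)

result = trace_entities(CODE, SUMMARY)
print(result)
```

An entity that lands in `untraced` is a candidate hallucination, and because each mention is checked separately, the result also points to where in the summary the problem sits.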
Takeaways, Limitations
• Takeaways:
◦ Presents a new dataset and framework for addressing hallucination in code summarization.
◦ Proposes a novel approach that combines static program analysis with LLMs.
◦ Demonstrates ETF's effectiveness experimentally, with an F1 score of 73%.
◦ Can assess the accuracy of code summaries and localize errors within them.
• Limitations:
◦ The CodeSumEval dataset (~10K samples) may be relatively small.
◦ ETF's performance may depend on the programming language, code style, or LLM used.
◦ ETF may not detect every type of hallucination.
◦ Generalization to real-world environments needs further validation.