Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

“I know myself better, but not really greatly”: How Well Can LLMs Detect and Explain LLM-Generated Texts?

Created by
  • Haebom

Authors

Jiazhou Ji, Jie Guo, Weidong Qiu, Zheng Huang, Yang Xu, Xinru Lu, Xiaoyu Jiang, Ruizhe Li, Shujun Li

Outline

This paper addresses the problem of distinguishing human-written from LLM-generated text, motivated by the risks posed by the misuse of large language models (LLMs). It investigates the detection and explanation capabilities of current LLMs in two settings: binary classification (human vs. LLM-generated) and ternary classification, which adds an "undetermined" class for texts a model cannot confidently attribute. Evaluating six open- and closed-source LLMs of varying sizes, the authors find that self-detection, where an LLM identifies its own outputs, consistently outperforms cross-detection, where an LLM identifies the outputs of other LLMs, though performance remains unsatisfactory in both settings. Introducing the ternary classification framework improves detection accuracy and explanation quality across all models. Through comprehensive quantitative and qualitative analyses on human-annotated datasets, the authors identify the main explanation failure modes: reliance on incorrect features, hallucinations, and faulty reasoning. The paper thus highlights the limitations of current LLMs in self-detection and self-explanation, and emphasizes the need for further research to address overfitting and improve generalization.
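
For concreteness, below is a minimal sketch of how the two evaluation settings could be posed as classification prompts. This is an illustration only: the `query_llm` stub, the label strings, and the prompt wording are assumptions made for this example, not the prompts used in the paper.

```python
# Sketch of the binary vs. ternary detection settings described above.
# All names here (query_llm, label strings, prompt wording) are hypothetical.

BINARY_LABELS = ("human", "llm")
TERNARY_LABELS = ("human", "llm", "undetermined")


def query_llm(prompt: str) -> str:
    """Placeholder for a real chat-completion call to the model under evaluation."""
    raise NotImplementedError("wire up a real LLM client here")


def build_prompt(text: str, ternary: bool) -> str:
    """Ask the model to attribute the text and explain its reasoning."""
    labels = TERNARY_LABELS if ternary else BINARY_LABELS
    options = ", ".join(f'"{label}"' for label in labels)
    return (
        "Decide who wrote the following text and explain your reasoning.\n"
        f"Answer with exactly one of: {options}.\n\n"
        f"Text:\n{text}\n\n"
        "Label:"
    )


def parse_label(response: str, ternary: bool) -> str:
    """Extract the first recognized label from the model's response."""
    labels = TERNARY_LABELS if ternary else BINARY_LABELS
    lowered = response.lower()
    for label in labels:
        if label in lowered:
            return label
    # In the ternary setting, an unparseable answer can fall back to abstention.
    return "undetermined" if ternary else "unparsed"


def classify(text: str, ternary: bool = True) -> str:
    """Run one detection query in either the binary or the ternary setting."""
    response = query_llm(build_prompt(text, ternary))
    return parse_label(response, ternary)
```

The "undetermined" option gives the model an explicit way to abstain rather than forcing a human/LLM guess, which, per the summary above, is what improves both detection accuracy and explanation quality.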

Takeaways, Limitations

Takeaways: The ternary classification framework improves LLMs' detection accuracy and explanation quality. Self-detection outperforms cross-detection.
Limitations: Current LLMs' self-detection and self-explanation capabilities are still insufficient. Explanation failures stem from reliance on incorrect features, hallucinations, and faulty reasoning. Further work is needed to address overfitting and improve generalization.