This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
Created by
Haebom
Author
Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai
Outline
This paper proposes RevPRAG, a detection technique for RAG poisoning attacks, a vulnerability of Retrieval-Augmented Generation (RAG) systems. In a RAG poisoning attack, malicious text is injected into the knowledge database so that the LLM generates an attacker-chosen response to a targeted query. RevPRAG is an automated detection pipeline that analyzes an LLM's activation patterns to distinguish normal responses from poisoned ones. Across multiple benchmark datasets and RAG architectures, it achieved a true positive rate of 98% with a false positive rate close to 1%. The technique can help strengthen the security of RAG systems that rely on publicly accessible knowledge databases.
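The core idea of probing activations to separate clean from poisoned responses can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: real activations would be extracted from an LLM's hidden states, whereas here they are simulated with synthetic vectors, and a simple logistic-regression probe stands in for RevPRAG's learned detector.

```python
import numpy as np

# Hypothetical sketch of activation-based poisoning detection.
# Assumption: poisoned responses shift activations in a consistent
# direction; we simulate this rather than query a real LLM.
rng = np.random.default_rng(0)
DIM = 64  # assumed activation dimensionality

clean = rng.normal(0.0, 1.0, size=(200, DIM))
poisoned = rng.normal(0.0, 1.0, size=(200, DIM)) + 2.0  # shifted mean

X = np.vstack([clean, poisoned])
y = np.concatenate([np.zeros(200), np.ones(200)])  # 1 = poisoned

# Logistic-regression probe trained with plain gradient descent.
w = np.zeros(DIM)
b = 0.0
lr = 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(poisoned)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

preds = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
accuracy = np.mean(preds == y)
```

In practice, such a probe would be trained on activations collected from known clean and poisoned retrieval contexts, and applied at inference time to flag suspicious responses before they reach the user.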
Takeaways, Limitations
• Takeaways:
◦ Presents an effective RAG poisoning detection technique based on analysis of LLM activation patterns.
◦ Demonstrates practical applicability with a high true positive rate (98%) and a low false positive rate (~1%).
◦ Contributes to improving the security of RAG systems.
• Limitations:
◦ Results are reported for specific LLMs and datasets, so generalizability to other LLMs and datasets needs verification.
◦ Detection performance against new types of RAG poisoning attacks remains to be verified.
◦ Further research is needed on efficiency and scalability in real-world deployments.