RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
Created by: Haebom
Authors: Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai
Outline
This paper presents RevPRAG, a novel technique for detecting RAG poisoning attacks, a vulnerability of Retrieval-Augmented Generation (RAG) systems in which an attacker injects malicious text into the knowledge database to force the LLM to generate attacker-desired responses. RevPRAG detects such attacks by analyzing the LLM's activation patterns, which differ between normal and poisoned responses. Across various benchmark datasets and RAG architectures, RevPRAG achieves a true positive rate of 98% with a false positive rate close to 1%.
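To make the activation-analysis idea concrete, here is a minimal sketch of one way to pool an LLM's hidden-state activations over a generated response and train a simple probe that separates clean from poisoned outputs. The model name, probed layer, mean-pooling, toy examples, and logistic-regression classifier are all illustrative assumptions, not RevPRAG's actual architecture.

```python
# Minimal sketch of activation-based poisoning detection (illustrative only).
# Assumptions: the model, probed layer, pooling, and logistic-regression
# probe are placeholders, not the paper's actual RevPRAG design.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # assumption: any causal LM exposing hidden states works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def response_activation(prompt: str, response: str, layer: int = -1) -> torch.Tensor:
    """Mean-pool one layer's hidden states over the response tokens."""
    inputs = tokenizer(prompt + response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    hidden = out.hidden_states[layer][0]            # (seq_len, hidden_dim)
    n_prompt = len(tokenizer(prompt)["input_ids"])  # approximate boundary
    return hidden[n_prompt:].mean(dim=0)

# Hypothetical labeled examples: (RAG prompt with retrieved context, response).
clean = [("Q: Capital of France? Context: Paris is the capital.", " Paris.")]
poisoned = [("Q: Capital of France? Context: Lyon is the capital.", " Lyon.")]

X = torch.stack([response_activation(p, r) for p, r in clean + poisoned]).numpy()
y = [0] * len(clean) + [1] * len(poisoned)          # 1 = poisoned

probe = LogisticRegression(max_iter=1000).fit(X, y)  # the detector
print(probe.predict(X))
```

A real pipeline would collect activation vectors for many known clean and poisoned RAG responses before training the probe, then flag new responses whose activations the probe classifies as poisoned.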
Takeaways, Limitations
•
Takeaways:
◦
Demonstrates that analyzing LLM activation patterns is an effective way to detect RAG poisoning attacks.
◦
Introduces the RevPRAG technique, which achieves high detection accuracy (a 98% true positive rate with a false positive rate close to 1%).
◦
Suggests a new direction for strengthening the security of RAG systems.
•
Limitations:
◦
The evaluation covers specific LLMs and RAG architectures, so generalizability to other models and architectures requires further research.
◦
RevPRAG's performance may degrade as attacks grow more varied and complex.
◦
Further research is needed to verify its effectiveness in real-world settings.