Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis

Created by
  • Haebom

Author

Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, Jun Dai

Outline

This paper proposes RevPRAG, a novel technique for detecting poisoning attacks against Retrieval-Augmented Generation (RAG) systems. In a RAG poisoning attack, an adversary injects malicious text into the knowledge database so that the LLM generates an attacker-chosen response. RevPRAG is an automated detection pipeline that analyzes the LLM's internal activation patterns to distinguish responses produced from clean retrievals from those induced by poisoned ones. Across multiple benchmark datasets and RAG architectures, it achieved a true positive rate of 98% with a false positive rate close to 1%. The technique can help strengthen the security of RAG systems that rely on publicly accessible knowledge databases.
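The core idea, distinguishing clean from poisoned responses by probing LLM activations, can be illustrated with a minimal sketch. This is not the authors' code: real hidden-state activations are replaced by synthetic feature vectors (poisoned ones shifted along a fixed direction), and the detector is a plain logistic-regression probe trained from scratch.

```python
# Hypothetical sketch of activation-based poisoning detection.
# Synthetic vectors stand in for LLM hidden-state activations; the
# assumed "poisoning" effect is a shift along a fixed direction.
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # stand-in for an LLM hidden-state dimension

# Simulated activations: poisoned responses are shifted in feature space.
shift = rng.normal(size=DIM)
clean = rng.normal(size=(500, DIM))
poisoned = rng.normal(size=(500, DIM)) + 1.5 * shift

X = np.vstack([clean, poisoned])
y = np.concatenate([np.zeros(500), np.ones(500)])  # 1 = poisoned

def train_probe(X, y, lr=0.1, epochs=300):
    """Logistic-regression probe trained by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(poisoned)
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

w, b = train_probe(X, y)
pred = (X @ w + b) > 0
tpr = pred[y == 1].mean()   # true positive rate on poisoned responses
fpr = pred[y == 0].mean()   # false positive rate on clean responses
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```

In the actual pipeline, the feature vectors would come from the LLM's hidden states while it generates a response, and the probe would be trained on labeled clean/poisoned examples; the sketch only shows why a linear probe can separate the two populations when poisoning leaves a consistent activation signature.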

Takeaways, Limitations

Takeaways:
Presents an effective RAG poisoning detection technique based on analysis of LLM activation patterns.
Demonstrates practical applicability with a high true positive rate and a low false positive rate.
Contributes to improving the security of RAG systems.
Limitations:
Results were evaluated on specific LLMs and datasets; generalizability to other LLMs and datasets remains to be verified.
Detection performance against new types of RAG poisoning attacks still needs to be validated.
Further research is needed on efficiency and scalability in real-world environments.