Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

A Few Words Can Distort Graphs: Knowledge Poisoning Attacks on Graph-based Retrieval-Augmented Generation of Large Language Models

Created by
  • Haebom

Author

Jiayi Wen, Tianxin Chen, Zhirun Zheng, Cheng Huang

Outline

This paper presents two Knowledge Poisoning Attacks (KPAs) that exploit vulnerabilities in Graph-based Retrieval-Augmented Generation (GraphRAG). GraphRAG transforms raw text into a structured knowledge graph to improve the accuracy and explainability of LLMs, and the paper shows that this knowledge-extraction step can be maliciously manipulated through the raw text itself. The two proposed attacks are the Targeted KPA (TKPA) and the Universal KPA (UKPA). TKPA uses graph-theoretic analysis to identify vulnerable nodes in the generated graph and rewrites the corresponding descriptions with LLMs, precisely controlling specific question-answering (QA) results. UKPA exploits linguistic cues, such as pronouns and dependency relations, to alter globally influential words, thereby destroying the structural integrity of the generated graph. Experimental results demonstrate that even small text modifications can significantly degrade GraphRAG's QA accuracy, and that existing defense techniques fail to detect these attacks.
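To make the TKPA pipeline more concrete, here is a minimal sketch of its node-selection step. This is not the authors' code: it assumes betweenness centrality as the graph-theoretic vulnerability score and networkx as the graph library, and the subsequent LLM rewriting step (as well as UKPA's text-side perturbation) is not shown.

```python
# Hypothetical sketch of TKPA's node-selection step (not the paper's implementation):
# rank knowledge-graph nodes by a graph-theoretic vulnerability score,
# then flag the top-k node descriptions for LLM-based rewriting.
import networkx as nx

def select_vulnerable_nodes(graph: nx.Graph, k: int = 5) -> list:
    """Return the k nodes whose perturbation most affects retrieval paths.

    Betweenness centrality is used here as a stand-in vulnerability score;
    the paper's exact graph-theoretic criterion may differ.
    """
    scores = nx.betweenness_centrality(graph)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Example: a toy knowledge graph of entities extracted from raw text.
kg = nx.Graph()
kg.add_edges_from([
    ("Marie Curie", "radium"),
    ("Marie Curie", "Nobel Prize"),
    ("radium", "radioactivity"),
    ("Nobel Prize", "physics"),
])
targets = select_vulnerable_nodes(kg, k=2)
print(targets)  # the descriptions of these nodes would be rewritten by an LLM
```

The intuition is that high-centrality nodes sit on many retrieval paths, so poisoning their descriptions lets a few edited words steer many downstream QA results.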

Takeaways, Limitations

Takeaways: This paper demonstrates security vulnerabilities in LLM-based knowledge-graph construction pipelines such as GraphRAG by presenting two novel knowledge poisoning attacks and showing their effectiveness. It exposes the limitations of existing defense techniques and emphasizes the need for research to strengthen GraphRAG's security. TKPA and UKPA achieve high attack success rates, and even small text modifications can significantly degrade performance.
Limitations: The attacks are evaluated against specific GraphRAG implementations, so their generalizability to other implementations or LLM architectures requires further research. Their effectiveness in real-world environments also remains to be verified. The paper points to stronger defenses as a direction for future work but does not discuss concrete defense strategies.