Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Security Concerns for Large Language Models: A Survey

Created by
  • Haebom

Author

Miles Q. Li, Benjamin CM Fung

Outline

This paper explores how the emergence of large-scale language models (LLMs) like ChatGPT has revolutionized the field of natural language processing (NLP), while simultaneously introducing new security vulnerabilities. We categorize threats into several key areas: prompt injection and jailbreaking, adversarial attacks (including input perturbation and data poisoning), information warfare by malicious actors, phishing emails and malware generation, and the risks of autonomous LLM agents. We further discuss emerging risks of autonomous LLM agents, including goal mismatch, emerging deception, self-preservation instincts, and the potential of LLMs to develop and pursue covert and inconsistent goals (known as planning). We summarize recent academic and industry research from 2022 to 2025, exemplifying each threat, analyzing proposed defenses and their limitations, and identifying unresolved challenges in securing LLM-based applications. Finally, we emphasize the importance of developing robust, multi-layered security strategies to ensure LLMs are both secure and beneficial.

Takeaways, Limitations

Takeaways: Provides a comprehensive overview of LLM security vulnerabilities, systematically categorizing and analyzing various threats, including prompt injection, adversarial attacks, exploits, and the risks of autonomous LLM agents. Reflecting recent research trends, it particularly emphasizes the importance of research on the risks of autonomous LLM agents and their defense strategies. It also suggests the need for a multi-layered security strategy for the secure development and deployment of LLM-based applications.
Limitations: Specific experimental verification of the effectiveness and limitations of the defense strategies presented in this paper may be lacking. Given the complexity and rapid pace of development of LLM, it is uncertain how effective the proposed threats and defense strategies will be against future threats. Since this paper focuses on general threats and defense strategies rather than a detailed analysis of specific LLM models or applications, further research is needed to apply them to specific situations.
👍