Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Created by
  • Haebom

Author

Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Shicheng Xu, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Kai Wang, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Xiaolong Jin, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu

Outline

This paper provides a comprehensive analysis of the safety and security issues of large language models (LLMs). Unlike previous studies that focus on specific stages of the LLM life cycle (e.g., deployment or fine-tuning), it is the first to present a "full-stack" safety concept covering the entire life cycle: data preparation, pre-training, post-training, deployment, and final commercialization. Drawing on an analysis of more than 800 papers, the authors provide a comprehensive understanding of LLM safety and suggest promising research directions, including data generation, alignment techniques, model editing, and LLM-based agent systems.

Takeaways, Limitations

Takeaways:
Presents a comprehensive safety assessment framework spanning the entire LLM life cycle
Systematically organizes safety issues through an extensive review of over 800 papers
Identifies promising research directions, including data generation, alignment techniques, model editing, and LLM-based agent systems
Provides a roadmap and perspective for LLM safety research
Limitations:
Further research is needed on the practical application and effectiveness of the "full-stack" safety concept presented in this paper.
Given the rapidly evolving nature of LLM technology, further verification is needed to determine whether the paper's findings will remain valid over the long term.
Further research is needed on how well the safety analysis generalizes to diverse LLM architectures and application areas.