Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation

Created by
  • Haebom

Authors

Jiongchi Yu, Xiaofei Xie, Qiang Hu, Yuhan Ma, Ziming Zhao

Outline

This paper proposes Chimera, a large language model (LLM)-based multi-agent framework that addresses the shortage of data for insider threat detection (ITD). Chimera automatically simulates benign and malicious insider activities in a range of corporate environments and collects the resulting logs into a new dataset, ChimeraLog. It models each employee as an agent with role-specific behavior and incorporates group meetings, pairwise interactions, and autonomous scheduling modules to capture realistic organizational dynamics. ChimeraLog covers 15 types of insider attacks and was created by simulating activities in three sensitive domains: technology companies, financial firms, and healthcare institutions. Human studies and quantitative analysis validated its diversity, realism, and the presence of explainable threat patterns. Existing ITD methods achieved an average F1 score of 0.83 on ChimeraLog, significantly lower than the 0.99 they reach on the CERT dataset, which indicates that ChimeraLog is considerably harder and therefore useful for advancing ITD research.
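To make the F1 comparison concrete, here is a minimal sketch of how an F1 score is computed from detection counts. The function name and the counts are hypothetical and chosen only so the results match the two reported averages; they are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp: malicious events correctly flagged
    fp: benign events wrongly flagged
    fn: malicious events missed
    """
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (assumptions for illustration, not from the paper).
cert_like = f1_score(tp=99, fp=1, fn=1)       # precision = recall = 0.99
chimera_like = f1_score(tp=83, fp=17, fn=17)  # precision = recall = 0.83

print(round(cert_like, 2), round(chimera_like, 2))
```

With these illustrative counts, a drop from 0.99 to 0.83 corresponds to a detector that goes from missing 1 in 100 malicious events to missing 17 in 100, with a matching rise in false alarms.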

Takeaways, Limitations

Takeaways:
LLM-based multi-agent simulation is shown to be a viable way to generate realistic insider threat datasets.
The ChimeraLog dataset is more difficult than existing datasets, which makes it useful for advancing ITD research.
The simulated scenarios reflect diverse corporate environments and types of insider attacks.
The explainable threat patterns in the data can help improve the interpretability of ITD models.
Limitations:
The creation process and parameters of the ChimeraLog dataset are not described in detail.
Further verification is needed to confirm how closely the simulated data matches real business environments.
The data may be biased toward specific corporate environments.
The dataset is unlikely to cover all types of insider attacks.