Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Benchmarking LLM-Assisted Blue Teaming via Standardized Threat Hunting

Created by
  • Haebom

Authors

Yuqiao Meng, Luoxi Tang, Feiyang Yu, Xi Li, Guanhua Yan, Ping Yang, Zhaohan Xi

Outline

As cyber threats grow, so does the need for threat analysis that uses large language models (LLMs). This paper presents CyberTeam, a benchmark for enhancing the blue-team defense capabilities of LLMs. CyberTeam models realistic threat hunting workflows, identifies the dependencies between analysis tasks, and builds an operational module for each task, so that an LLM can carry out threat analysis step by step. It integrates 30 tasks and nine operational modules to support standardized threat analysis, and it has been used to evaluate leading LLMs and state-of-the-art cybersecurity agents. While CyberTeam's standardized design yields improvements, the evaluation also reveals the limitations of open-ended reasoning in real-world threat hunting.
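The idea of running analysis tasks in dependency order, with one module per task, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the task names, the dependency graph, and the `run_module` placeholder are hypothetical and are not taken from the CyberTeam benchmark itself. The sketch only shows how a dependency graph can fix the order in which tasks run, and how each task can receive the results of its prerequisites.

```python
from graphlib import TopologicalSorter

# Hypothetical task names and dependencies, for illustration only;
# they are not taken from the CyberTeam benchmark.
DEPENDENCIES = {
    "extract_indicators": set(),
    "map_attack_technique": {"extract_indicators"},
    "attribute_threat_actor": {"extract_indicators", "map_attack_technique"},
    "recommend_mitigation": {"map_attack_technique"},
}


def run_module(task, context):
    """Placeholder for an operational module; a real one might prompt an LLM."""
    return f"{task} done using {sorted(context)}"


def run_workflow(dependencies):
    """Run each task after all of its prerequisites, passing their results along."""
    results = {}
    # static_order() yields every task after the tasks it depends on.
    order = list(TopologicalSorter(dependencies).static_order())
    for task in order:
        context = {dep: results[dep] for dep in dependencies[task]}
        results[task] = run_module(task, context)
    return order, results


order, results = run_workflow(DEPENDENCIES)
for task in order:
    print(results[task])
```

Using a topological order means a task never runs before the tasks it depends on, which is one simple way to make a multi-step analysis reproducible across different models.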

Takeaways, Limitations

Takeaways:
Provides a standardized blue-team benchmark for LLMs.
Enhances LLMs' threat analysis capabilities through a step-by-step, modular approach.
Verifies LLM performance improvements in realistic threat hunting scenarios.
Limitations:
Exposes the limitations of open-ended reasoning in real-world threat hunting.
Lacks specifics on the extent of the performance improvements achieved with CyberTeam.