
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems

Created by
  • Haebom

Author

Jing Fang, Saihao Yan, Xueyu Yin, Yinbo Yu, Chunwei Tian, Jiajia Liu

Outline

In this paper, we propose BLAST, a novel backdoor attack technique for cooperative multi-agent deep reinforcement learning (c-MADRL) systems. To overcome the limitations of existing backdoor attacks (non-stealthy instant trigger patterns, backdoor training/activation that requires additional networks, and the need to backdoor every agent), BLAST compromises an entire multi-agent team by implanting a backdoor into only a single agent. Stealth is achieved by using adversarial spatiotemporal behavior patterns as the backdoor trigger, and the backdoor agent's reward function is unilaterally manipulated to produce the 'leverage attack' effect, whereby one poisoned agent degrades the whole team. Experiments with the VDN, QMIX, and MAPPO algorithms in the SMAC and Pursuit environments, as well as against existing defense mechanisms, confirm BLAST's high attack success rate and low degradation of normal-task performance.
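The mechanics of a behavior-pattern trigger plus reward manipulation can be illustrated with a minimal sketch. This is not the paper's actual formulation: the `BackdoorRewardWrapper` class, the `TRIGGER_PATTERN` action sequence, and the reward-inversion rule are all hypothetical simplifications of the idea that, once an adversarial spatiotemporal pattern appears in the backdoor agent's recent trajectory, its training reward is manipulated so the single poisoned agent learns to sabotage the cooperating team.

```python
from collections import deque

# Hypothetical illustration (not BLAST's exact method): a reward-poisoning
# wrapper for a single backdoor agent in c-MADRL training. The trigger is a
# short spatiotemporal action pattern; once it appears in the agent's recent
# action history, the wrapper inverts the shared team reward, so cooperative
# success is punished and one agent drags down the team (the "leverage" effect).

TRIGGER_PATTERN = [2, 2, 3]  # assumed adversarial action sequence (illustrative)

class BackdoorRewardWrapper:
    def __init__(self, trigger=TRIGGER_PATTERN):
        self.trigger = list(trigger)
        # Keep only the last len(trigger) actions for pattern matching.
        self.history = deque(maxlen=len(self.trigger))
        self.activated = False

    def step(self, action, team_reward):
        """Record the backdoor agent's action; return a possibly poisoned reward."""
        self.history.append(action)
        if list(self.history) == self.trigger:
            self.activated = True
        # After activation, the reward is inverted for this agent only.
        return -team_reward if self.activated else team_reward
```

For example, with the action stream `[0, 2, 2, 3, 1]` and a constant team reward of `1.0`, the wrapper returns clean rewards until the pattern `[2, 2, 3]` completes, then inverted rewards thereafter.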

Takeaways, Limitations

Takeaways:
Presents a novel method for attacking an entire multi-agent system through a backdoor in a single agent.
Increases stealth by using spatiotemporal behavior patterns, rather than instant patterns, as backdoor triggers.
Experimentally demonstrates effective attacks against existing c-MADRL algorithms and defense mechanisms.
Highlights the severity of security vulnerabilities in cooperative multi-agent systems.
Limitations:
The proposed attack may be limited to specific environments and algorithms.
Evaluation against a wider range of defense mechanisms is needed.
Further research is needed on applicability and generalizability in real-world settings.
The feasibility and ethical implications of unilaterally manipulating reward functions need to be considered.