Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

POEX: Towards Policy Executable Jailbreak Attacks Against the LLM-based Robots

Created by
  • Haebom

Author

Xuancun Lu, Zhengxian Huang, Xinfeng Li, Chi Zhang, Xiaoyu Ji, Wenyuan Xu

Outline

This paper investigates the security vulnerabilities of robotic systems based on large language models (LLMs). In these systems the LLM transforms robot commands into executable policies, so its vulnerability to jailbreak attacks extends a serious security risk from the digital domain into the physical one. We investigate whether existing LLM jailbreak attacks apply to robotic systems and propose a novel attack technique, POlicy EXecutable (POEX). POEX uses hidden-layer gradient optimization and a multi-agent evaluator to derive executable harmful policies, and its effectiveness is verified on real-world robotic systems and in simulation. Finally, we propose prompt-based and model-based defense techniques to mitigate jailbreak attacks.

Takeaways, Limitations

Takeaways:
We empirically demonstrate the feasibility of a jailbreak attack on an LLM-based robotic system.
We explain why existing LLM jailbreak attacks are not directly applicable to robotic systems.
We propose POEX, a novel jailbreak attack technique specialized for robotic systems, and verify its effectiveness.
We present prompt-based and model-based defense techniques against jailbreak attacks.
We emphasize the urgent need for security measures to ensure the safe deployment of LLM-based robots.
Limitations:
The effectiveness of POEX has been validated for specific robotic systems and LLMs, and its generalizability to other systems or LLMs requires further study.
Further analysis is needed on the practical effectiveness and limitations of the proposed defense techniques.
A comprehensive study of the various types of jailbreak attacks and defense techniques is needed.