Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models

Created by
  • Haebom

Author

Maozhen Zhang, Mengnan Zhao, Wei Wang, Bo Wang

Outline

This paper presents BadPromptFL, the first backdoor attack on prompt-based federated learning (PromptFL) in multimodal contrastive learning models. In BadPromptFL, compromised clients jointly optimize local backdoor triggers and prompt embeddings so that poisoned prompts are injected into the global aggregation process. These poisoned prompts then propagate to benign clients, enabling universal backdoor activation at inference time without modifying any model parameters. By leveraging the contextual learning behavior of CLIP-style architectures, BadPromptFL achieves high attack success rates (e.g., >90%) while remaining stealthy and requiring only limited client participation. Extensive experiments across datasets and aggregation protocols demonstrate the attack's effectiveness, stealth, and generalizability, raising serious concerns about the robustness of prompt-based federated learning in real-world deployments.
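The sketch below illustrates the attack flow described above; it is a minimal toy reconstruction, not the authors' implementation. Toy linear layers stand in for the frozen CLIP image/text encoders, and the function names (`benign_client_update`, `compromised_client_update`), the single target class, and the plain FedAvg aggregation of prompt embeddings are all assumptions made for illustration.

```python
# Hedged sketch of the attack pattern: a compromised client jointly optimizes
# a backdoor trigger and its local prompt, and the server's prompt-only
# aggregation blends the poisoned prompt into the global prompt.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
EMB, PROMPT_LEN, NUM_CLASSES = 32, 4, 5

# Frozen stand-ins for CLIP-style encoders (never updated, mirroring PromptFL,
# where only the prompt embeddings are learned).
image_encoder = torch.nn.Linear(3 * 8 * 8, EMB).requires_grad_(False)
text_head = torch.nn.Linear(PROMPT_LEN * EMB, NUM_CLASSES * EMB).requires_grad_(False)

def logits(images, prompt):
    # Cosine similarity between image embeddings and per-class text embeddings
    # derived from the shared prompt tokens.
    img = F.normalize(image_encoder(images.flatten(1)), dim=-1)
    txt = F.normalize(text_head(prompt.reshape(-1)).reshape(NUM_CLASSES, EMB), dim=-1)
    return img @ txt.t()

def benign_client_update(global_prompt, images, labels, steps=20, lr=0.1):
    prompt = global_prompt.clone().requires_grad_(True)
    opt = torch.optim.SGD([prompt], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(logits(images, prompt), labels).backward()
        opt.step()
    return prompt.detach()

def compromised_client_update(global_prompt, images, labels, target=0, steps=20, lr=0.1):
    # Jointly optimize the local prompt AND an additive trigger: triggered
    # inputs are pulled toward the attacker's target class while clean
    # behaviour is preserved, keeping the poisoned prompt stealthy.
    prompt = global_prompt.clone().requires_grad_(True)
    trigger = torch.zeros(3, 8, 8, requires_grad=True)
    opt = torch.optim.SGD([prompt, trigger], lr=lr)
    target_labels = torch.full_like(labels, target)
    for _ in range(steps):
        opt.zero_grad()
        clean_loss = F.cross_entropy(logits(images, prompt), labels)
        backdoor_loss = F.cross_entropy(logits(images + trigger, prompt), target_labels)
        (clean_loss + backdoor_loss).backward()
        opt.step()
        trigger.data.clamp_(-0.1, 0.1)  # keep the trigger perturbation small
    return prompt.detach(), trigger.detach()

# One federated round: the server aggregates only prompt embeddings (FedAvg),
# so the poisoned prompt reaches benign clients without touching model weights.
global_prompt = torch.randn(PROMPT_LEN, EMB) * 0.02
data = [(torch.rand(16, 3, 8, 8), torch.randint(0, NUM_CLASSES, (16,))) for _ in range(3)]

updates = [benign_client_update(global_prompt, x, y) for x, y in data[:2]]
poisoned_prompt, trigger = compromised_client_update(global_prompt, *data[2])
updates.append(poisoned_prompt)
global_prompt = torch.stack(updates).mean(0)

x_test, _ = data[0]
print("clean predictions:    ", logits(x_test, global_prompt).argmax(-1).tolist())
print("triggered predictions:", logits(x_test + trigger, global_prompt).argmax(-1).tolist())
```

Under these toy assumptions, the triggered inputs are steered toward the attacker's target class after aggregation, while clean predictions are largely unaffected, which is the stealth property the paper emphasizes.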

Takeaways, Limitations

Takeaways: The paper reveals a security vulnerability in prompt-based federated learning and proposes a new backdoor attack, BadPromptFL, pointing to the need for research on securing such systems in real-world deployments. It also demonstrates that an attack exploiting the contextual learning characteristics of CLIP-style architectures can be highly effective.
Limitations: Defenses against the proposed attack are not yet studied. Because the experiments are limited to specific datasets and settings, further work is needed to establish whether the attack generalizes to other multimodal models, federated learning configurations, and deployment environments.