Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

When and Where do Data Poisons Attack Textual Inversion?

Created by
  • Haebom

Author

Jeremy Styborski, Mingzhi Lyu, Jiayou Lu, Nupur Kapur, Adams Kong

Outline

This paper systematically analyzes poisoning attacks on the textual inversion (TI) technique of diffusion models (DMs). First, the authors present Semantic Sensitivity Maps, a novel method for visualizing the impact of poisoning attacks on text embeddings. Next, they experimentally demonstrate that DMs exhibit nonuniform learning behavior across timesteps, focusing particularly on low-noise samples; poisoning attacks exploit this bias by injecting adversarial signals primarily at low timesteps. Finally, they observe that adversarial signals disrupt learning in the image regions relevant to the target concept, thereby compromising the TI process. Based on these insights, the authors propose Safe-Zone Training (SZT), a novel defense mechanism comprising three components: 1. attenuation of high-frequency poisoning signals via JPEG compression; 2. restriction of training to high timesteps, avoiding the adversarial signals concentrated at low timesteps; and 3. loss masking to restrict learning to relevant regions. Through extensive experiments on various poisoning attacks, they show that SZT significantly improves the robustness of TI against all poisoning attacks and achieves better generation quality than previously published defenses.
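The three SZT components can be sketched as simple preprocessing and loss functions. This is a minimal illustration, not the authors' implementation: the function names and the specific thresholds (JPEG quality 75, timestep cutoff 400) are assumptions chosen for the sketch.

```python
# Illustrative sketch of the three SZT components described above.
# Assumptions: 8-bit RGB images as NumPy arrays, a 1000-step diffusion
# schedule, and hypothetical thresholds (quality=75, t_min=400).
import io

import numpy as np
from PIL import Image


def jpeg_attenuate(img_array, quality=75):
    """Component 1: attenuate high-frequency poison signals via JPEG compression."""
    img = Image.fromarray(img_array.astype(np.uint8))
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf), dtype=np.uint8)


def sample_safe_timestep(rng, t_min=400, t_max=1000):
    """Component 2: sample only high (noisy) timesteps, skipping the
    low-timestep region where adversarial signals concentrate."""
    return int(rng.integers(t_min, t_max))


def masked_mse(pred, target, mask):
    """Component 3: loss masking -- compute the diffusion loss only over
    regions relevant to the learned concept (mask == 1)."""
    diff = (pred - target) ** 2 * mask
    return diff.sum() / np.maximum(mask.sum(), 1.0)
```

In a TI training loop, each training image would first pass through `jpeg_attenuate`, the diffusion timestep would be drawn with `sample_safe_timestep`, and the denoising objective would use `masked_mse` with a concept-relevance mask in place of the standard unmasked MSE.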

Takeaways, Limitations

Takeaways:
We present a novel method (Semantic Sensitivity Maps) to systematically analyze and visualize the impact of poisoning attacks on the TI of DMs.
We elucidate the timestep-dependent non-uniform learning behavior of DMs and reveal how poisoning attacks exploit it.
We propose SZT, an effective defense mechanism against poisoning attacks, and experimentally verify its effectiveness.
SZT achieves better generation quality than previously published defenses.
Limitations:
Further research is needed on the generalization performance of SZT.
The applicability and effectiveness of SZT across different types of DMs and TI methods remain to be verified.
The resistance of SZT to new types of poisoning attacks needs to be evaluated.