Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder

Created by
  • Haebom

Authors

Wonwoong Cho, Yan-Ying Chen, Matthew Klenk, David I. Inouye, Yanxia Zhang

Outline

This paper proposes Att-Adapter, a novel plug-and-play module for simultaneously and precisely controlling multiple attributes in a pre-trained diffusion model. Att-Adapter learns a single control adapter from a set of sample images that may be unpaired and contain multiple visual attributes. It uses a decoupled cross-attention module to harmonize multiple domain attributes with the text condition, and a conditional variational autoencoder (CVAE) to mitigate overfitting and to accommodate the diversity of the visual world. Evaluations on two public datasets show that Att-Adapter outperforms all LoRA-based baselines in continuous attribute control, with a wider control range and better disentanglement between attributes. It also requires no paired synthetic data for training and extends easily to multiple attributes.
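The two mechanisms named above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the additive form of decoupled cross-attention (a text branch plus a scaled attribute branch sharing the same queries), a standard-normal prior for the CVAE's KL term, and a single attribute token built directly from the sampled latent. All function names, the `attr_scale` parameter, and the toy values are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over plain lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def decoupled_cross_attention(queries, text_kv, attr_kv, attr_scale=1.0):
    """Assumed additive form: text branch + attr_scale * attribute branch.

    Each branch has its own keys and values but shares the queries.
    """
    text_out = attention(queries, *text_kv)
    attr_out = attention(queries, *attr_kv)
    return [[t + attr_scale * a for t, a in zip(t_row, a_row)]
            for t_row, a_row in zip(text_out, attr_out)]


def sample_attribute_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)); the N(0, I) prior is a simplification."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Toy usage: two image queries, two text tokens, one sampled attribute token.
rng = random.Random(0)
queries = [[1.0, 0.0], [0.0, 1.0]]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
z = sample_attribute_latent(mu=[0.5, -0.5], log_var=[-2.0, -2.0], rng=rng)
attr_kv = ([z], [z])  # hypothetical: the latent serves as both key and value
mixed = decoupled_cross_attention(queries, text_kv, attr_kv, attr_scale=0.5)
```

Setting `attr_scale` to 0 recovers the text-only output, which is one reason an additive branch suits a plug-and-play adapter: the pre-trained model's behavior is untouched when the adapter is switched off.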

Takeaways, Limitations

Takeaways:
Presents a novel method for precisely controlling multiple continuous attributes in pre-trained diffusion models.
Trains on unpaired data, which improves data efficiency.
Outperforms LoRA-based methods and StyleGAN-based methods.
Extends easily to multiple attributes within a single model.
Offers a wide control range and improved separation between attributes.
Limitations:
The paper does not explicitly state specific limitations; they may emerge through further experiments or analysis.
Generalization performance for specific domains or attributes requires further study.
Using a CVAE may not always be efficient.