Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder

Created by
  • Haebom

Authors

Wonwoong Cho, Yan-Ying Chen, Matthew Klenk, David I. Inouye, Yanxia Zhang

Outline

In this paper, we propose Att-Adapter (Attribute Adapter), a novel method for precisely and simultaneously controlling multiple continuous attributes (e.g., degree of eye openness, car width) in a pre-trained text-to-image diffusion model. Att-Adapter learns a single control adapter from a set of unpaired sample images and uses a decoupled cross-attention module to harmonize multiple domain attributes with the text condition. To mitigate overfitting, it additionally introduces a Conditional Variational Autoencoder (CVAE) that captures the diverse visual characteristics of the natural world. Experimental results show that Att-Adapter outperforms existing LoRA-based methods, with a wider control range and improved attribute disentanglement. In addition, it can be trained without paired synthetic data and scales well to multiple attributes.
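To make the decoupled cross-attention idea concrete, here is a minimal sketch in which the latent features attend to the text condition and the attribute condition in two separate streams before the outputs are combined. This is an illustration under our own assumptions, not the authors' implementation: all names (DecoupledCrossAttention, attr_dim, and so on) are hypothetical, the attention is single-head for brevity, and in practice such a module would sit inside the cross-attention layers of the frozen diffusion U-Net.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledCrossAttention(nn.Module):
    """Sketch of decoupled cross-attention: latent queries attend to the
    text embeddings and the attribute embeddings in two separate streams,
    and the two attention outputs are summed. Single-head for brevity."""

    def __init__(self, dim: int, text_dim: int, attr_dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        # Projections for the (frozen) text condition stream.
        self.to_k_text = nn.Linear(text_dim, dim, bias=False)
        self.to_v_text = nn.Linear(text_dim, dim, bias=False)
        # New, trainable projections for the attribute condition stream.
        self.to_k_attr = nn.Linear(attr_dim, dim, bias=False)
        self.to_v_attr = nn.Linear(attr_dim, dim, bias=False)

    def forward(self, latents, text_emb, attr_emb):
        # latents: (B, N, dim); text_emb: (B, T, text_dim);
        # attr_emb: (B, A, attr_dim)
        q = self.to_q(latents)
        out_text = F.scaled_dot_product_attention(
            q, self.to_k_text(text_emb), self.to_v_text(text_emb))
        out_attr = F.scaled_dot_product_attention(
            q, self.to_k_attr(attr_emb), self.to_v_attr(attr_emb))
        # Harmonize the two condition streams.
        return out_text + out_attr
```

Plausibly, attr_emb here would be derived from the continuous attribute values (for example via the CVAE mentioned above), while text_emb comes from the frozen text encoder; only the attribute-side projections would need to be trained.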

Takeaways, Limitations

Takeaways:
We present a novel method for precisely controlling multiple continuous attributes in pre-trained diffusion models.
It is trainable on unpaired data and scales well to multiple attributes.
It outperforms both LoRA-based and StyleGAN-based methods.
It provides a wider control range and improved attribute disentanglement.
Limitations:
The paper does not explicitly state its limitations; they would need to be identified through additional experiments or analysis. For example, generalization to specific types of attributes or datasets, as well as computational cost, may be areas requiring further research.