Daily Arxiv

This page curates AI-related papers published worldwide.
All summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing

Created by
  • Haebom

Authors

Federico Girella, Davide Talon, Ziyue Liu, Zanxi Ruan, Yiming Wang, Marco Cristani

Outline

This paper presents LOTS (LOcalized Text and Sketch for fashion image generation), a method that generates fashion images from paired sketch and text inputs, reflecting the complex creative process of fashion design. LOTS combines a global description with localized sketch and text information and produces a complete fashion image through a diffusion-based, step-wise merging strategy. A modular pair-centric representation encodes each sketch and its text into a shared latent space while keeping the local features of each pair independent. Attention-based guidance then integrates the local and global conditions during the multi-step denoising process of the diffusion model. The authors also introduce a new fashion dataset, Sketchy, and report that LOTS outperforms existing methods in both quantitative and qualitative evaluations.
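
The two ideas in the outline above (one fused token per sketch-text pair, then attention that merges local and global conditions at each denoising step) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the random projection matrices stand in for real sketch and text encoders, the feature sizes are arbitrary, the blending update is only a placeholder for a real denoising step, and the names encode_pair and attention_guidance are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # width of the shared latent space (arbitrary, for illustration)

def encode_pair(sketch_feat, text_feat, w_sketch, w_text):
    """Project one sketch/text pair into the shared space and fuse it into one token.

    Each pair is encoded on its own, so local features of different pairs stay independent.
    """
    return np.tanh(sketch_feat @ w_sketch + text_feat @ w_text)

def attention_guidance(latents, cond_tokens):
    """Single-head cross-attention: image latents (queries) attend over condition tokens."""
    scores = latents @ cond_tokens.T / np.sqrt(latents.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over conditions
    return weights @ cond_tokens, weights

# Stand-in encoders: random linear projections (assumption, not real encoders).
w_sketch = rng.normal(size=(32, D)) * 0.1
w_text = rng.normal(size=(24, D)) * 0.1

# Three localized sketch/text pairs, e.g. one per garment part (toy features).
pairs = [(rng.normal(size=32), rng.normal(size=24)) for _ in range(3)]
local_tokens = np.stack([encode_pair(s, t, w_sketch, w_text) for s, t in pairs])

# One token for the global description, placed alongside the local pair tokens.
global_token = rng.normal(size=(1, D))
cond = np.concatenate([global_token, local_tokens], axis=0)  # shape (4, D)

# Toy "denoising" loop: each step blends attended condition info into the latents.
latents = rng.normal(size=(64, D))  # e.g. an 8x8 latent grid, flattened
for step in range(4):
    attended, weights = attention_guidance(latents, cond)
    latents = 0.9 * latents + 0.1 * attended  # placeholder for a real denoising update
```

Each row of weights shows how strongly one latent position draws on the global token versus each local pair, which is the sense in which attention can route localized conditions to different image regions.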

Takeaways, Limitations

Takeaways:
Effectively combines sketch and text information to improve the accuracy and detail of generated fashion images.
Introduces a new way to control design details by leveraging localized information.
Releases a new fashion dataset, Sketchy, to support future research.
Outperforms existing methods, which could contribute to the advancement of fashion design.
Limitations:
The size and diversity of the Sketchy dataset could be improved in the future.
LOTS may not perfectly reflect every aspect of complex fashion design.
The differences between the proposed generation workflow and the real-world fashion design process need to be clearly defined.