
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

Created by
  • Haebom

Authors

Quang-Binh Nguyen, Minh Luu, Quang Nguyen, Anh Tran, Khoi Nguyen

Outline

This paper addresses content-style decomposition (CSD): separating the content and the style of a single image. Existing personalization methods that perform this decomposition are tailored to diffusion models. The authors instead propose CSD-VAR, which performs CSD with visual autoregressive modeling (VAR) and exploits its scale-wise generation process to improve the separation. CSD-VAR introduces three key innovations. First, a scale-aware alternating optimization strategy aligns the content and style representations with their respective scales. Second, an SVD-based rectification method mitigates content leakage into the style representation. Third, an augmented key-value (K-V) memory improves preservation of content identity. The paper also introduces CSD-100, a new benchmark dataset for CSD tasks. Experimental results show that CSD-VAR achieves better content preservation and style fidelity than existing methods.
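
The SVD-based step described above can be illustrated with a small sketch. This is only one plausible reading, not the authors' implementation: it assumes that a set of content-related embedding vectors is available, estimates a low-rank "content subspace" from them with an SVD, and removes the part of a style embedding that lies in that subspace. The function name `remove_content_subspace`, the `rank` parameter, and the random toy vectors are all hypothetical and do not come from the paper.

```python
import numpy as np


def remove_content_subspace(style_vec, content_vecs, rank=2):
    """Hypothetical helper: strip from style_vec its component inside the
    subspace spanned by the top-`rank` right singular vectors of content_vecs."""
    # Rows of content_vecs are assumed to be content-related embeddings.
    _, _, vt = np.linalg.svd(content_vecs, full_matrices=False)
    basis = vt[:rank]                        # (rank, dim), orthonormal rows
    leaked = basis.T @ (basis @ style_vec)   # part of style_vec in the content subspace
    return style_vec - leaked                # orthogonal to every row of `basis`


# Toy data only: random stand-ins for real embeddings.
rng = np.random.default_rng(0)
dim = 8
content_vecs = rng.normal(size=(5, dim))
style_vec = rng.normal(size=dim)

rectified = remove_content_subspace(style_vec, content_vecs, rank=2)
```

In this sketch `rank` controls how many content directions are removed: a larger value suppresses more of the content subspace but also discards more of the original style vector.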

Takeaways, Limitations

Takeaways:
Shows that CSD can be performed with VAR and reports better performance than existing diffusion model-based methods.
Proposes three new techniques: scale-aware alternating optimization, SVD-based rectification, and an augmented K-V memory.
Provides CSD-100, a new benchmark dataset for CSD tasks.
Limitations:
The size and diversity of the CSD-100 dataset need further review.
Additional experiments are needed to evaluate how well the proposed method generalizes.
Performance on other types of images and styles still needs to be evaluated.