Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation

Created by
  • Haebom

Authors

Chaehun Shin, Jooyoung Choi, Johan Barthelemy, Jungbeom Lee, Sungroh Yoon

Outline

This paper presents Subject Fidelity Optimization (SFO), a novel comparative learning framework for zero-shot subject-driven generation. SFO addresses the limitations of supervised fine-tuning methods that rely solely on positive targets by introducing additional synthetic negative targets and training the model, through pairwise comparison, to prefer positives over negatives. To obtain the negatives, the authors propose Condition-Degradation Negative Sampling (CDNS), which generates them cost-effectively by deliberately degrading the conditioning inputs, so that the negatives fall short of the positives in subject fidelity and text alignment. The authors also reweight the diffusion timesteps so that fine-tuning focuses on the intermediate steps where subject details emerge. SFO with CDNS is shown to significantly outperform strong existing baselines in both subject fidelity and text alignment on subject-driven generation benchmarks.
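
The two ideas in the outline above, preferring positives over negatives and emphasizing intermediate timesteps, can be illustrated with a small sketch. This is not the paper's objective: the logistic pairwise form, the Gaussian-shaped timestep weight, and every name and default here (`pairwise_preference_loss`, `intermediate_timestep_weight`, `beta`, `center`, `width`) are assumptions chosen for illustration. The inputs `loss_pos` and `loss_neg` stand for the per-sample diffusion losses of the same model on a positive and a negative target.

```python
import math


def pairwise_preference_loss(loss_pos, loss_neg, beta=1.0):
    """Hypothetical pairwise loss: -log(sigmoid(beta * (loss_neg - loss_pos))).

    It is small when the model fits the positive target better than the
    negative one (loss_pos < loss_neg) and large in the opposite case.
    """
    margin = beta * (loss_neg - loss_pos)
    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def intermediate_timestep_weight(t, center=0.5, width=0.2):
    """Hypothetical bell-shaped weight over a normalized timestep t in [0, 1].

    It peaks at `center`, so intermediate steps contribute more than very
    early or very late ones. The shape is assumed, not taken from the paper.
    """
    return math.exp(-0.5 * ((t - center) / width) ** 2)


def weighted_pairwise_loss(loss_pos, loss_neg, t, beta=1.0):
    """Combine the two pieces: timestep weight times pairwise loss."""
    return intermediate_timestep_weight(t) * pairwise_preference_loss(
        loss_pos, loss_neg, beta
    )


if __name__ == "__main__":
    # Same pair of losses evaluated at an intermediate and a late timestep.
    print(weighted_pairwise_loss(0.2, 1.0, t=0.5))
    print(weighted_pairwise_loss(0.2, 1.0, t=0.95))
```

In an actual fine-tuning loop the two losses would come from the diffusion model's denoising error on the positive and negative images at a shared timestep and noise sample; plain floats are used here only to keep the sketch self-contained.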

Takeaways, Limitations

Takeaways:
Presents a novel approach to improving subject fidelity in zero-shot subject-driven generation.
Proposes CDNS, a method for efficiently generating synthetic negative targets.
Presents an effective fine-tuning strategy that improves both subject fidelity and text alignment.
Outperforms strong existing baselines on subject-driven generation benchmarks.
Limitations:
The performance of CDNS may depend on the choice of degradation technique.
Model complexity and computational cost may increase.
Further research is needed on generalization performance in real-world environments.