This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Nikolas Adaloglou, Tim Kaiser, Damir Iagudin, Markus Kollmann
Outline
This paper presents a guidance technique for improving sample quality in diffusion models. Using two-dimensional toy examples, the authors show that guidance is most beneficial when the auxiliary model makes generalization errors that are similar to, but stronger than, those of the primary model. Building on this insight, they propose Masked Sliding Window Guidance (M-SWG), a novel, training-free method. M-SWG guides the primary model with itself by selectively restricting its receptive field, which upweights long-range spatial dependencies. It requires no access to model weights from earlier training iterations, no additional training, and no class conditioning. It achieves a higher Inception Score (IS) than existing state-of-the-art training-free approaches and does not induce sample oversaturation. Combined with existing guidance methods, it achieves a state-of-the-art Fréchet DINOv2 distance on ImageNet using EDM2-XXL and DiT-XL. The code is available at https://github.com/HHU-MMBS/swg_bmvc2025_official .
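The idea of guiding a model with a receptive-field-restricted copy of itself can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `window` and `stride` parameters, and the toy denoisers are assumptions made for illustration, and the sketch uses plain overlapping crops rather than the masking that M-SWG applies, which this summary does not specify. It assumes the common guidance rule that extrapolates from a weaker ("negative") prediction toward the full-model ("positive") prediction.

```python
import numpy as np

def sliding_window_prediction(denoise, x, window, stride):
    """Apply `denoise` to overlapping square crops of x and average the overlaps.

    Each crop only sees `window` x `window` pixels, so the result lacks
    long-range context compared with `denoise(x)` on the full image.
    """
    h, w = x.shape[-2:]
    assert window <= h and window <= w, "window must fit inside the image"
    tops = list(range(0, h - window + 1, stride))
    lefts = list(range(0, w - window + 1, stride))
    # Make sure the last window touches the bottom/right border.
    if tops[-1] != h - window:
        tops.append(h - window)
    if lefts[-1] != w - window:
        lefts.append(w - window)

    out = np.zeros(x.shape, dtype=float)
    count = np.zeros(x.shape, dtype=float)
    for t in tops:
        for l in lefts:
            crop = x[..., t:t + window, l:l + window]
            out[..., t:t + window, l:l + window] += denoise(crop)
            count[..., t:t + window, l:l + window] += 1.0
    return out / count

def guided_prediction(denoise, x, weight, window, stride):
    """Extrapolate from the restricted prediction toward the full prediction.

    weight = 1 returns the unguided full-model prediction; weight > 1
    amplifies whatever the full model gains from long-range context.
    """
    pos = denoise(x)                                             # full receptive field
    neg = sliding_window_prediction(denoise, x, window, stride)  # restricted
    return neg + weight * (pos - neg)

# Toy usage: a hypothetical "denoiser" that depends on global image statistics.
x = np.arange(64, dtype=float).reshape(8, 8)
toy_denoise = lambda z: z - z.mean()
guided = guided_prediction(toy_denoise, x, weight=2.0, window=4, stride=2)
```

If the denoiser is purely local (its output at a pixel does not depend on distant pixels), the positive and negative predictions coincide and guidance has no effect, which is consistent with the claim that the method acts on long-range dependencies.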
Takeaways, Limitations
• Takeaways:
◦ Proposes M-SWG, a new guidance technique that requires no training.
◦ Achieves a higher Inception Score than existing state-of-the-art training-free methods.
◦ Improves performance without causing sample oversaturation.
◦ Achieves a state-of-the-art Fréchet DINOv2 distance on ImageNet when combined with existing guidance methods.
◦ Suggests that the generalization-error characteristics of the auxiliary model are important for guidance performance.
• Limitations:
◦ Although the theoretical basis was presented using a two-dimensional example, further research is needed to determine generalizability to high-dimensional datasets.
◦ There is a possibility that the performance improvements of M-SWG may be limited to specific models and datasets.
◦ Extensive experimental validation is needed across various diffusion models and datasets.
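For readers unfamiliar with the two metrics named in the takeaways, the following minimal sketch shows how an Inception Score and a Fréchet distance are computed from precomputed arrays. It is an illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, and the feature extractors (an Inception classifier for IS, DINOv2 for the Fréchet DINOv2 distance) are not included, so the inputs are assumed to be class probabilities and feature vectors that have already been extracted.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """IS = exp( mean_x KL( p(y|x) || p(y) ) ) for probs of shape (N, C)."""
    p_y = probs.mean(axis=0, keepdims=True)  # marginal class distribution
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (N, D) feature sets.

    FD = ||mu_a - mu_b||^2 + Tr(S_a) + Tr(S_b) - 2 Tr((S_a S_b)^(1/2)).
    The last trace is the sum of square roots of the eigenvalues of S_a S_b.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    # Clip tiny negative values caused by floating-point error.
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

# Toy usage with synthetic data standing in for real network outputs.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))
shifted = features + 1.0       # same covariance, mean shifted by 1 per dimension
confident = np.eye(5)[np.arange(100) % 5]  # one-hot predictions, classes balanced
```

A higher Inception Score indicates confident and diverse class predictions, while a lower Fréchet distance indicates that generated features are statistically closer to real ones.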