This page organizes papers related to artificial intelligence published around the world. Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis. Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.
UniGen is a unified framework for controllable image-to-image generation that supports diverse conditional inputs and prompt instructions. To address the redundant model structures and inefficient use of computational resources in existing approaches, the authors propose the Conditional Modulation Expert (CoMoE) module and WeaveNet. CoMoE aggregates semantically similar patch features and assigns them to dedicated expert modules for visual representation and conditional modeling, while WeaveNet is a dynamic linking mechanism that enables effective interaction between the backbone and the conditional branch. Experiments on the Subjects-200K and MultiGen-20M datasets show that UniGen achieves state-of-the-art performance across diverse conditional image generation tasks.
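The core CoMoE idea, grouping semantically similar patch features and dispatching each group to a dedicated expert, can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the function name `route_to_experts`, the dot-product similarity routing, and the toy 2-D features are all assumptions for illustration.

```python
# Hedged sketch of CoMoE-style routing: each patch feature vector is
# assigned to the expert whose key it matches best, so semantically
# similar patches end up grouped under the same expert module.
# All names and the dot-product routing rule are illustrative assumptions.

def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def route_to_experts(patches, expert_keys):
    """Group patch indices by the expert key with highest similarity."""
    groups = {i: [] for i in range(len(expert_keys))}
    for idx, patch in enumerate(patches):
        scores = [dot(patch, key) for key in expert_keys]
        best = max(range(len(scores)), key=scores.__getitem__)
        groups[best].append(idx)
    return groups

# Toy example: four 2-D patch features and two expert keys, one per
# hypothetical semantic direction.
patches = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]]
expert_keys = [[1.0, 0.0], [0.0, 1.0]]
groups = route_to_experts(patches, expert_keys)
print(groups)  # patches 0,1 routed to expert 0; patches 2,3 to expert 1
```

In the actual model each expert would then apply its own conditional-modulation transform to its group of patches; the sketch only shows the grouping step, which is what lets experts specialize instead of duplicating parameters.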
Takeaways, Limitations
• Takeaways:
    ◦ A unified image-to-image generation framework that supports diverse condition inputs while improving generation efficiency and expressiveness.
    ◦ The CoMoE module reduces redundant parameters and computational inefficiency, enabling independent modeling of foreground features under different conditions.
    ◦ WeaveNet provides a dynamic linking mechanism for efficient information exchange between the backbone and the conditional branch.
    ◦ Experiments on the Subjects-200K and MultiGen-20M datasets demonstrate state-of-the-art performance across diverse conditional image generation tasks.
• Limitations:
    ◦ The paper does not explicitly discuss its limitations (though issues it leaves unaddressed may still exist).