This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
This paper proposes Omni-Effects, a unified framework that generates diverse visual effects (VFX) and controls where they appear in the frame. Existing LoRA-based VFX generation models train a separate LoRA for each effect, which makes spatially controlled generation of multiple effects difficult. Omni-Effects addresses this with two components: a LoRA-based Mixture of Experts (LoRA-MoE) and a Spatial-Aware Prompt (SAP). LoRA-MoE integrates multiple effects in one model while mitigating inter-task interference, and SAP incorporates spatial mask information into the text tokens to enable precise spatial control. In addition, an Independent-Information Flow (IIF) module isolates the control signals of individual effects to prevent unwanted blending. The authors also present Omni-VFX, a comprehensive VFX dataset built with a novel data collection pipeline, and a dedicated VFX evaluation framework. Experimental results show that Omni-Effects achieves precise spatial control and diverse effect generation.
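As a rough illustration of the LoRA-MoE idea summarized above, the Python sketch below adds a gated mixture of low-rank adapters to a frozen linear layer. The class name `LoRAMoELinear`, the softmax router, the shapes, and the initialization are all assumptions made for illustration; the summary does not specify how the paper's experts are routed or where they are attached.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    top = max(z)
    exps = [math.exp(t - top) for t in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used for the illustrative weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAMoELinear:
    """Frozen linear layer plus a gated mixture of LoRA experts.

    Hypothetical layout (not taken from the paper):
        y = W x + sum_i g_i(x) * B_i (A_i x),   g(x) = softmax(G x)
    """

    def __init__(self, d_in, d_out, num_experts, rank, seed=0):
        rng = random.Random(seed)
        self.W = rand_matrix(d_out, d_in, rng)        # frozen base weight
        self.G = rand_matrix(num_experts, d_in, rng)  # router producing one logit per expert
        # Each expert is a low-rank pair: A projects down to `rank`, B projects back up.
        self.A = [rand_matrix(rank, d_in, rng) for _ in range(num_experts)]
        # Common LoRA initialization: B starts at zero, so the adapters are a no-op at first.
        self.B = [[[0.0] * rank for _ in range(d_out)] for _ in range(num_experts)]

    def gate(self, x):
        """Per-expert mixing weights for input x (non-negative, summing to 1)."""
        return softmax(matvec(self.G, x))

    def forward(self, x):
        y = matvec(self.W, x)
        for g, A, B in zip(self.gate(x), self.A, self.B):
            delta = matvec(B, matvec(A, x))           # low-rank update from one expert
            y = [yi + g * di for yi, di in zip(y, delta)]
        return y


layer = LoRAMoELinear(d_in=4, d_out=3, num_experts=2, rank=2)
x = [1.0, 0.5, -0.5, 2.0]
print("gate weights:", layer.gate(x))
print("output:", layer.forward(x))
```

In this sketch every expert contributes, weighted by the gate. A sparse variant would keep only the top-scoring experts per input, and whether Omni-Effects uses dense or sparse routing is not stated in the summary.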
Takeaways, Limitations
• Takeaways:
◦ Presents a method for generating and spatially controlling diverse VFX with a single model.
◦ Overcomes the per-effect limitation of existing LoRA-based models through LoRA-MoE and SAP.
◦ Provides Omni-VFX, a comprehensive VFX dataset, and a dedicated VFX evaluation framework.
◦ Lets users specify both the type and the location of each effect.
• Limitations:
◦ The size and diversity of the Omni-VFX dataset need further clarification.
◦ The computational cost and training time of the proposed method are not analyzed.
◦ Applicability and scalability in real-world film production environments need further validation.
◦ Generalization across different types of VFX needs more detailed analysis.