Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Created by
  • Haebom

Authors

Yiwen Chen, Zhihao Li, Yikai Wang, Hu Zhang, Qin Li, Chi Zhang, Guosheng Lin

Outline

This paper proposes Ultra3D, a framework that makes 3D content generation with sparse voxel representations more efficient. Existing two-stage diffusion pipelines are computationally expensive because the cost of attention grows quadratically with the number of tokens. In the first stage, Ultra3D uses the VecSet representation to generate the object layout, which reduces the token count and accelerates voxel coordinate prediction. In the second stage, it introduces Part Attention, a geometry-aware attention mechanism that restricts attention computation to semantically consistent part regions, preserving structural continuity while avoiding unnecessary global attention. This yields up to a 6.7x speedup in latent generation. Ultra3D supports 3D generation at 1024 resolution and achieves state-of-the-art performance in visual fidelity and user preference. The authors also build a scalable part annotation pipeline that converts raw meshes into part-labeled sparse voxels.
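The idea of restricting attention to tokens within the same part can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `part_attention`, the toy token vectors, and the part labels are all hypothetical, and plain Python lists are used only to keep the example self-contained. It assumes each voxel token already carries an integer part label.

```python
import math

def part_attention(queries, keys, values, part_ids):
    """Scaled dot-product attention where each token attends only to
    tokens that share its part label (illustrative sketch only)."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for i, q in enumerate(queries):
        # Indices of tokens in the same part as token i (always includes i).
        allowed = [j for j, p in enumerate(part_ids) if p == part_ids[i]]
        scores = [scale * sum(a * b for a, b in zip(q, keys[j])) for j in allowed]
        # Numerically stable softmax over the allowed tokens only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, allowed):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs

# Toy input: four voxel tokens in two parts (hypothetical labels).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
part_ids = [0, 0, 1, 1]
result = part_attention(tokens, tokens, tokens, part_ids)
```

In this sketch, the output for a token in part 0 is a weighted average of part-0 tokens only, so changing a token in part 1 leaves it untouched. A practical implementation would more likely express the same restriction as a block-structured attention mask on a tensor library.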

Takeaways, Limitations

Takeaways:
Significantly speeds up 3D generation based on sparse voxel representations (up to 6.7x speedup in latent generation).
Supports high-resolution 3D generation at 1024 resolution.
Achieves state-of-the-art performance in terms of visual fidelity and user preference.
Introduces an efficient Part Attention mechanism and a scalable part annotation pipeline.
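To see why restricting attention to parts reduces cost, one can count query-key pairs. The sketch below is a back-of-the-envelope illustration, not a measurement from the paper: the helper `attention_pairs`, the token count, and the even split into 8 parts are all assumptions chosen for the example.

```python
from collections import Counter

def attention_pairs(part_ids):
    """Return (global_pairs, part_pairs): the number of query-key pairs
    for full attention versus attention restricted within each part."""
    n = len(part_ids)
    global_pairs = n * n
    part_pairs = sum(size * size for size in Counter(part_ids).values())
    return global_pairs, part_pairs

# Hypothetical example: 8,000 voxel tokens split evenly into 8 parts.
part_ids = [i % 8 for i in range(8000)]
global_pairs, part_pairs = attention_pairs(part_ids)
ratio = global_pairs / part_pairs  # 64,000,000 / 8,000,000 = 8.0
```

With equally sized parts, the pair count drops by a factor equal to the number of parts. This ratio is not the same as wall-clock speedup, which also depends on part size imbalance and on the non-attention layers, so it should not be read as a reproduction of the reported 6.7x figure.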
Limitations:
Further study may be needed to assess how well the proposed Part Attention mechanism generalizes.
Further analysis of the accuracy and efficiency of the part annotation pipeline may be required.
Additional evaluation of generalization performance for different types of 3D objects is needed.
Memory usage is not analyzed.