Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Complete Gaussian Splats from a Single Image with Denoising Diffusion Models

Created by
  • Haebom

Authors

Ziwei Liao, Mohamed Sayed, Steven L. Waslander, Sara Vicente, Daniyar Turmukhambetov, Michael Firman

Outline

This paper proposes a method for reconstructing complete 3D scenes, represented as Gaussian splats, from a single image. Gaussian splatting typically requires dense observations of a scene and struggles to reconstruct occluded or unobserved regions. The authors instead use a latent diffusion model to reconstruct the complete 3D scene, including occluded regions, from a single image. Completing occluded surfaces is challenging because many different surfaces are plausible. Existing methods rely on regression-based formulations that predict a single "mode," which leads to blurriness, implausible results, and a failure to capture multiple possible explanations. In contrast, this work proposes a generative formulation that learns a distribution over 3D Gaussian splat representations conditioned on a single input image. To address the lack of ground-truth 3D training data, the authors propose a Variational AutoReconstructor that learns a latent space from 2D images alone in a self-supervised manner, and then train a diffusion model over that latent space. As a result, the method produces faithful reconstructions and diverse samples, and can complete occluded surfaces for high-quality 360-degree rendering.
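The contrast between a regression formulation and a generative one can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a hypothetical occluded surface whose depth is equally likely to be 1.0 or 3.0 (values chosen only for illustration). A predictor trained with mean squared error converges to the average of the two, a depth that matches neither plausible surface, which is the one-dimensional analogue of a blurry reconstruction. Sampling from the distribution instead returns one plausible surface per draw.

```python
import random
from statistics import fmean


def mse_optimal_prediction(samples):
    """The single value that minimizes mean squared error is the sample mean."""
    return fmean(samples)


def mean_squared_error(prediction, samples):
    """Average squared distance between one prediction and all samples."""
    return fmean((prediction - s) ** 2 for s in samples)


def sample_plausible_depth(rng, modes):
    """Stand-in for a generative model: draw one of the plausible depths."""
    return rng.choice(modes)


# Hypothetical ambiguity: the hidden surface sits at depth 1.0 or 3.0.
modes = (1.0, 3.0)
rng = random.Random(0)

# "Training data": many scenes that look identical from the input view.
observed = [sample_plausible_depth(rng, modes) for _ in range(10_000)]

# Regression collapses the two explanations into one in-between value.
regressed = mse_optimal_prediction(observed)

# A generative formulation returns a different plausible answer per sample.
generated = [sample_plausible_depth(rng, modes) for _ in range(5)]

print("regression output:", round(regressed, 2))
print("generative samples:", generated)
```

The regression output minimizes the average error yet corresponds to no surface that could actually exist in this toy setup, while every generated sample is one of the two valid depths.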

Takeaways, Limitations

Takeaways:
Reconstructs a complete 3D scene, including occluded areas, from a single image
Mitigates the blurriness and implausible results of existing regression-based methods
Generates diverse, plausible 3D reconstructions for the same input image
Enables high-quality 360-degree rendering
Learns the latent space through self-supervision, using only 2D images
Limitations:
Lacks a detailed analysis of the performance of the proposed Variational AutoReconstructor
Generalization to complex real-world scenes still needs to be evaluated
Scalability to large datasets still needs to be evaluated
Lacks an analysis of computational cost and processing time