Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Learning 3D-Gaussian Simulators from RGB Videos

Created by
  • Haebom

Author

Mikel Zhobro, Andreas René Geist, Georg Martius

Outline

This paper proposes 3DGSim, a learning-based 3D simulator. 3DGSim learns physical interactions directly from multi-view RGB videos, enabling realistic simulation without privileged information such as depth or particle tracking. It uses MVSplat to learn a latent particle-based representation of the 3D scene, a Point Transformer to predict particle dynamics, a Temporal Merging module for consistent temporal aggregation, and Gaussian Splatting to render novel views. By jointly learning inverse rendering and dynamics prediction, 3DGSim embeds physical properties into point-wise latent features. This lets it capture a wide range of physical behaviors (from rigid to elastic, including cloth-like dynamics and boundary conditions) as well as realistic lighting effects, and generalize to unseen multi-body interactions and novel scene manipulations.
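The control flow of such a simulator can be sketched as an autoregressive rollout over particle clouds. The block below is a minimal illustration, not the authors' implementation: `Particle`, `rollout`, `toy_merge`, and `toy_step` are hypothetical names, the merge and dynamics stages are replaced by a hand-written constant-velocity toy rather than learned modules, and the image encoding (MVSplat) and rendering (Gaussian Splatting) stages are omitted because they require trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Particle:
    position: Tuple[float, ...]  # 3D location of the particle
    feature: Tuple[float, ...]   # per-point latent; a learned model would store physical properties here


Cloud = List[Particle]


def rollout(context: List[Cloud],
            merge: Callable[[List[Cloud]], Cloud],
            step: Callable[[Cloud], Cloud],
            num_steps: int) -> List[Cloud]:
    """Repeatedly merge the most recent clouds, predict the next one, and feed it back."""
    history = list(context)
    window = len(context)  # number of past clouds aggregated at each step
    predicted: List[Cloud] = []
    for _ in range(num_steps):
        merged = merge(history[-window:])  # stand-in for temporal aggregation
        nxt = step(merged)                 # stand-in for the dynamics model
        predicted.append(nxt)
        history.append(nxt)
    return predicted


def toy_merge(clouds: List[Cloud]) -> Cloud:
    """Toy aggregation (assumption): store the finite-difference velocity as the feature."""
    prev, cur = clouds[-2], clouds[-1]
    return [
        Particle(c.position, tuple(ci - pi for ci, pi in zip(c.position, p.position)))
        for p, c in zip(prev, cur)
    ]


def toy_step(cloud: Cloud) -> Cloud:
    """Toy dynamics (assumption): advance each particle by its stored velocity."""
    return [
        Particle(tuple(x + v for x, v in zip(p.position, p.feature)), p.feature)
        for p in cloud
    ]


# Two context frames of a single particle moving along x at unit speed.
context = [
    [Particle((0.0, 0.0, 0.0), ())],
    [Particle((1.0, 0.0, 0.0), ())],
]
future = rollout(context, toy_merge, toy_step, num_steps=3)
print([cloud[0].position for cloud in future])
```

In a learned system, the two callables would be neural modules and each predicted cloud would additionally be rendered to images for supervision against the RGB videos; the loop structure is the only part this sketch is meant to convey.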

Takeaways, Limitations

Takeaways:
Presents a method for learning physical interactions directly from multi-view RGB videos, without privileged information.
Captures a wide range of physical behaviors, from rigid to elastic and cloth-like dynamics, along with realistic lighting effects.
Generalizes to unseen multi-body interactions and novel scene edits.
Integrates 3D scene reconstruction, particle dynamics prediction, and video synthesis into a single end-to-end framework.
Limitations:
No specific analysis of the proposed model's computational cost or training data size is provided.
Generalization across diverse physical phenomena may be limited, and additional experiments are needed.
Applicability and robustness in complex real-world situations need further verification.