Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion

Created by
  • Haebom

Authors

Shang Liu, Chenjie Cao, Chaohui Yu, Wen Qian, Jing Wang, Fan Wang

Outline

This paper addresses the challenge of generating 3D models of geographically vast areas spanning thousands of square kilometers. The authors first present Aerial-Earth3D, a large-scale 3D aerial dataset of 50,000 scenes, each covering 600 m x 600 m, captured across the continental United States. Each scene provides multi-view images, depth maps, normals, semantic segmentation, and camera poses, and the dataset is quality-controlled to ensure terrain diversity. Building on this dataset, they propose EarthCrafter, a framework for large-scale 3D Earth generation based on sparse-decoupled latent diffusion. EarthCrafter separates structure generation from texture generation: dual sparse 3D-VAEs compress high-resolution geometric voxels and textural 2D Gaussian splats (2DGS) into compact latent spaces, which reduces the computational cost of modeling such large areas. Condition-aware flow matching models, trained on semantic inputs, image inputs, or a combination of both, then model the latent geometry and texture features independently and flexibly. Experiments show that EarthCrafter excels at large-scale generation and supports applications ranging from semantically guided urban layout generation to unconditional terrain synthesis.
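To make the flow matching step more concrete, here is a minimal sketch of a generic conditional flow matching loss with randomly mixed conditioning. It is not the authors' implementation: the straight-line (rectified-flow style) path, the uniform choice among conditioning modes (including a "none" mode, assumed here to mirror the unconditional use case), and the names `sample_condition`, `flow_matching_loss`, and `velocity_model` are all assumptions made for illustration.

```python
import random


def sample_condition(semantic, image, rng):
    """Pick which conditioning signals a training step sees.

    The uniform four-way choice is an illustrative assumption,
    not a detail taken from the paper.
    """
    mode = rng.choice(["semantic", "image", "both", "none"])
    return {
        "semantic": semantic if mode in ("semantic", "both") else None,
        "image": image if mode in ("image", "both") else None,
    }


def flow_matching_loss(velocity_model, x1, cond, rng):
    """Mean squared error between predicted and target velocity.

    x1 stands in for a latent vector (a list of floats) produced by
    a hypothetical VAE encoder; velocity_model is any callable
    taking (x_t, t, cond) and returning a list of floats.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                # Gaussian noise sample
    t = rng.random()                                      # time in [0, 1)
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]  # point on the straight path
    target = [b - a for a, b in zip(x0, x1)]              # constant velocity x1 - x0
    pred = velocity_model(xt, t, cond)
    return sum((p - v) ** 2 for p, v in zip(pred, target)) / len(x1)


if __name__ == "__main__":
    rng = random.Random(0)
    latent = [0.5, -1.0, 2.0]  # toy latent, purely illustrative
    cond = sample_condition("semantic-map", "reference-image", rng)
    # A placeholder model that always predicts zero velocity.
    zero_model = lambda xt, t, c: [0.0 for _ in xt]
    print(cond, flow_matching_loss(zero_model, latent, cond, rng))
```

In a setup like this, one such model would be trained on geometry latents and another on texture latents, so the two stages can be conditioned and sampled independently.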

Takeaways, Limitations

Takeaways:
Introduces a new dataset, Aerial-Earth3D, and a framework, EarthCrafter, for large-scale 3D Earth generation.
Reduces the computational cost of large-scale generation through sparse-decoupled latent diffusion.
Demonstrates a range of applications, including semantically guided generation and unconditional terrain synthesis.
Generates diverse terrain while maintaining geographic plausibility.
Limitations:
The Aerial-Earth3D dataset is limited to the continental United States, which may make it difficult to extend the approach to other regions of the world.
EarthCrafter's performance is evaluated only on a specific dataset, so its generalizability requires further research.
Latent space compression may cause information loss.
The computational complexity of the model is not analyzed in detail.