This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Scaffold Diffusion: Sparse Multi-Category Voxel Structure Generation with Discrete Diffusion
Created by
Haebom
Author
Justin Jung
Outline
This paper proposes Scaffold Diffusion, a generative model for sparse multi-category 3D voxel structures. Generating such structures is difficult for two reasons: the memory cost of a voxel grid scales cubically with its resolution, and sparsity causes severe class imbalance, since empty space vastly outnumbers every other category. Scaffold Diffusion treats voxels as tokens and generates 3D voxel structures with a discrete diffusion language model. The paper shows that discrete diffusion language models can be extended beyond inherently sequential domains such as text to generate spatially coherent 3D structures. In evaluations on Minecraft house structures from the 3D-Craft dataset, Scaffold Diffusion, unlike existing baselines and an autoregressive formulation, generates realistic and coherent structures even when trained on data with over 98% sparsity. The authors also provide an interactive viewer for visualizing the generated samples and the generation process ( https://scaffold.deepexploration.org/ ).
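To make the "voxels as tokens" idea more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The toy grid, the category ids (`AIR`, `MASK`, and the "wall"/"roof" labels), the choice to tokenize only occupied voxels, and the masking-style (absorbing-state) noising step are all assumptions made for illustration; the summary above only states that voxels are treated as tokens and generated with a discrete diffusion language model.

```python
import random

AIR = 0    # hypothetical id for empty space
MASK = -1  # hypothetical id for the absorbing "masked" state


def make_toy_grid(size=8):
    """Build a size^3 grid that is mostly air, with a small two-category slab."""
    grid = [[[AIR for _ in range(size)] for _ in range(size)] for _ in range(size)]
    for x in range(size):
        for y in range(3):
            # 1 = "wall", 2 = "roof": made-up labels, not 3D-Craft block ids
            grid[x][y][0] = 1 if y < 2 else 2
    return grid


def sparsity(grid):
    """Fraction of voxels that are air."""
    cells = [v for plane in grid for row in plane for v in row]
    return sum(1 for v in cells if v == AIR) / len(cells)


def to_tokens(grid):
    """Keep only occupied voxels, each as an (x, y, z, category) token."""
    return [
        (x, y, z, v)
        for x, plane in enumerate(grid)
        for y, row in enumerate(plane)
        for z, v in enumerate(row)
        if v != AIR
    ]


def mask_tokens(tokens, t, rng):
    """Forward noising: replace each category with MASK with probability t.

    Coordinates are left untouched; only the categorical label is corrupted.
    """
    return [(x, y, z, MASK if rng.random() < t else c) for (x, y, z, c) in tokens]


grid = make_toy_grid()
tokens = to_tokens(grid)
noisy = mask_tokens(tokens, t=0.5, rng=random.Random(0))

print(f"sparsity: {sparsity(grid):.3f}")
print(f"tokens kept: {len(tokens)} of {8 ** 3} voxels")
print(f"masked at t=0.5: {sum(1 for tok in noisy if tok[3] == MASK)}")
```

In this toy grid only 24 of 512 voxels are occupied, so a model trained on the dense grid would see air as the overwhelming majority class. Working on the occupied tokens alone sidesteps both that imbalance and the cubic growth of the dense grid. A denoising model would then be trained to recover the original categories from the masked tokens, which this sketch does not attempt.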