Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LL3M: Large Language 3D Modelers

Created by
  • Haebom

Authors

Sining Lu, Guan Chen, Nam Anh Dinh, Itai Lang, Ari Holtzman, Rana Hanocka

Outline

LL3M is a multi-agent system that uses pre-trained Large Language Models (LLMs) to generate 3D assets by writing interpretable Python code in Blender. Unlike typical generative approaches that learn from collections of 3D data, it reframes shape generation as a code-writing task, which improves modularity, editability, and integration with artist workflows. Given a text prompt, LL3M coordinates a team of specialized LLM agents that plan, retrieve, write, debug, and refine Blender scripts to generate and edit geometry and appearance.

The generated code serves as a high-level, interpretable, human-readable, and well-documented representation of scenes and objects, and it draws on sophisticated Blender constructs (e.g., B-meshes, geometry modifiers, shader nodes) to cover a wide variety of shapes, materials, and scenes. Because the asset is code, both agents and humans can keep editing and experimenting with it, either by changing the code or by adjusting procedural parameters. This medium naturally supports a co-creative loop within the system: agents automatically critique their own output using code and rendered visuals, and iterative user instructions provide an intuitive way to refine assets. A code context shared between agents keeps them aware of previous attempts, and BlenderRAG, a retrieval-augmented generation knowledge base built on the Blender API documentation, supplies agents with examples, types, and functions that support advanced modeling operations and code correctness.

The effectiveness of LL3M is demonstrated across a range of shape categories, style and material edits, and user-driven refinements. The experiments showcase code as a generative and interpretable medium for 3D asset creation. The project page is https://threedle.github.io/ll3m.
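The write, debug, and refine cycle described above can be illustrated with a small sketch. This is not LL3M's implementation: `run_script`, `refine_loop`, and `toy_writer` are hypothetical names, the "agent" is a hard-coded stand-in for an LLM call, and the script is plain Python rather than a Blender (`bpy`) script so that it runs anywhere. It covers only the error-feedback part of the loop (the visual self-critique is omitted), and it shows how a shared history of failed attempts can be passed back to the writer.

```python
import traceback
from typing import Callable, List, Optional, Tuple


def run_script(source: str) -> Tuple[bool, str]:
    """Execute a script string and report (ok, error_text)."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        return True, ""
    except Exception:
        # Keep only the final traceback line, which names the exception.
        return False, traceback.format_exc().strip().splitlines()[-1]


def refine_loop(
    prompt: str,
    write_agent: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Ask the writer for code, run it, and feed errors back until it runs."""
    history: List[str] = []  # shared context: earlier failures, visible to the writer
    for _ in range(max_rounds):
        source = write_agent(prompt, history)
        ok, error = run_script(source)
        if ok:
            return source, history
        history.append(f"attempt failed with: {error}")
    return None, history  # gave up after max_rounds


def toy_writer(prompt: str, history: List[str]) -> str:
    """Hypothetical stand-in for an LLM: the first draft has a bug, the second fixes it."""
    if not history:
        return "radius = 1.0\narea = pi * radius ** 2\n"  # bug: 'pi' is undefined
    return "import math\nradius = 1.0\narea = math.pi * radius ** 2\n"


final_source, log = refine_loop("a disc of radius 1", toy_writer)
print(log)           # one recorded failure from the first draft
print(final_source)  # the corrected second draft
```

In a real setup the writer would be an LLM call and the script would run inside Blender; the loop structure is the only point being illustrated here.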

Takeaways, Limitations

Takeaways:
A new paradigm for 3D asset creation: treating generation as code writing improves modularity, editability, and interpretability.
Support for a variety of shapes, styles, and materials: complex and diverse 3D models are created by drawing on Blender's feature set.
Support for collaborative creation with users: code-based assets can be modified and improved iteratively.
High-quality, interpretable code: the generated code is human-readable and editable, which makes it easier to reuse.
Limitations:
Dependency on the LLM and the Blender API: results are affected by the performance and limitations of both.
Potential performance degradation on complex models: creating complex 3D models requires more computation time and resources.
Difficulty in debugging and error handling: additional effort is required to fix bugs and handle errors in the generated code.
Blender expertise required: some knowledge of Blender is needed to understand and modify the generated code.
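To make the editability takeaway above concrete, here is a minimal sketch of what "editing through procedural parameters" can look like. It is an illustration under stated assumptions, not output from LL3M: `box` and `table` are hypothetical helpers, and the code builds plain vertex and face lists in pure Python instead of calling Blender. Inside Blender, lists of this shape could be handed to `Mesh.from_pydata(verts, [], faces)`, which is not exercised here.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, ...]


def box(center: Vertex, size: Vertex) -> Tuple[List[Vertex], List[Face]]:
    """Return the 8 vertices and 6 quad faces of an axis-aligned box."""
    cx, cy, cz = center
    hx, hy, hz = (s / 2.0 for s in size)
    # Vertex index = x_bit + 2 * y_bit + 4 * z_bit (bit is 1 on the positive side).
    verts = [
        (cx + sx * hx, cy + sy * hy, cz + sz * hz)
        for sz in (-1, 1) for sy in (-1, 1) for sx in (-1, 1)
    ]
    # Quads are wound counter-clockwise when seen from outside the box.
    faces: List[Face] = [
        (0, 2, 3, 1),  # bottom (z-)
        (4, 5, 7, 6),  # top (z+)
        (0, 1, 5, 4),  # front (y-)
        (2, 6, 7, 3),  # back (y+)
        (0, 4, 6, 2),  # left (x-)
        (1, 3, 7, 5),  # right (x+)
    ]
    return verts, faces


def table(
    width: float = 1.2,
    depth: float = 0.8,
    height: float = 0.75,
    top_thickness: float = 0.05,
    leg_size: float = 0.06,
) -> Tuple[List[Vertex], List[Face]]:
    """Build a four-legged table; every dimension is an editable parameter."""
    verts: List[Vertex] = []
    faces: List[Face] = []

    def add(part: Tuple[List[Vertex], List[Face]]) -> None:
        part_verts, part_faces = part
        offset = len(verts)  # re-index the part's faces into the merged list
        verts.extend(part_verts)
        faces.extend(tuple(i + offset for i in f) for f in part_faces)

    # Tabletop: its upper surface sits exactly at `height`.
    add(box((0.0, 0.0, height - top_thickness / 2), (width, depth, top_thickness)))

    # Four legs, inset so they stay flush with the tabletop corners.
    leg_height = height - top_thickness
    for sx in (-1, 1):
        for sy in (-1, 1):
            x = sx * (width / 2 - leg_size / 2)
            y = sy * (depth / 2 - leg_size / 2)
            add(box((x, y, leg_height / 2), (leg_size, leg_size, leg_height)))
    return verts, faces


# Editing the asset means changing one argument, not re-generating it.
verts, faces = table(height=0.9)
print(len(verts), len(faces))
```

The design point is that a readable parameter such as `height` stays meaningful after generation, so a user or a follow-up agent can adjust it without touching the rest of the script.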