MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Bingquan Dai, Li Ray Luo, Qihong Tang, Jie Wang, Xinyu Lian, Hao Xu, Minghan Qin, Xudong Xu, Bo Dai, Haoqian Wang, Zhaoyang Lyu, Jiangmiao Pang

2025-08-21

Summary

This paper introduces MeshCoder, a system that converts 3D shapes represented as point clouds into editable Python programs that run in Blender. It uses a large language model trained on a large dataset of paired 3D shapes and programs to produce these editable scripts, making 3D models easier to understand and modify.

What's the problem?

It's hard to create editable computer programs from complex 3D shapes. Existing methods are often limited because they can only handle simple shapes or use specialized, restricted programming languages, making it difficult to work with intricate designs.

What's the solution?

The researchers created MeshCoder, a framework that uses a multimodal large language model (LLM) to translate 3D point cloud data into Python scripts for Blender. They first designed a set of expressive Blender Python APIs, used them to build a large dataset pairing objects with scripts whose code is decomposed into semantic parts, and trained the model on this data to generate accurate, executable programs.
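To give a feel for what such a part-decomposed, editable shape program might look like, here is a minimal sketch. The paper's actual outputs call Blender's bpy API; this standalone version mimics only the structure (semantically named parts, shared parameters) using a plain box primitive so it runs without Blender. All names here (`make_box`, the chair parts) are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch of a part-decomposed shape program in the spirit of
# MeshCoder's outputs. Real generated scripts use Blender's bpy API; this
# standalone stand-in uses a plain box primitive so it runs anywhere.

def make_box(center, size):
    """Return (vertices, faces) of an axis-aligned box."""
    cx, cy, cz = center
    sx, sy, sz = (s / 2.0 for s in size)
    # 8 corners: all sign combinations of the half-extents.
    verts = [(cx + dx * sx, cy + dy * sy, cz + dz * sz)
             for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]
    # 6 quad faces indexing into the corner list above.
    faces = [(0, 1, 3, 2), (4, 6, 7, 5), (0, 4, 5, 1),
             (2, 3, 7, 6), (0, 2, 6, 4), (1, 5, 7, 3)]
    return verts, faces

def build_chair():
    """Assemble a chair from named semantic parts, mirroring how the
    paper's object-code dataset decomposes each object's code."""
    parts = {
        "seat": make_box(center=(0, 0, 0.45), size=(0.4, 0.4, 0.05)),
        "backrest": make_box(center=(0, -0.18, 0.75), size=(0.4, 0.04, 0.55)),
    }
    # Four legs from one loop: editing a single size parameter edits all legs,
    # which is the kind of convenient code-level edit the paper highlights.
    for i, (dx, dy) in enumerate([(-1, -1), (-1, 1), (1, -1), (1, 1)]):
        parts[f"leg_{i}"] = make_box(
            center=(dx * 0.17, dy * 0.17, 0.21), size=(0.04, 0.04, 0.42))
    return parts

chair = build_chair()
print(sorted(chair))  # each semantic part is separately editable
```

Because the shape lives in code rather than raw triangles, a topological edit (say, a fifth leg) is one added line, and a geometric edit (a taller backrest) is one changed parameter.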

Why it matters?

This work is important because it allows for easier reverse engineering and editing of 3D designs by providing editable code. It also helps AI understand 3D shapes better and makes complex geometry creation more accessible and flexible.

Abstract

Reconstructing 3D objects into editable programs is pivotal for applications like reverse engineering and shape editing. However, existing methods often rely on limited domain-specific languages (DSLs) and small-scale datasets, restricting their ability to model complex geometries and structures. To address these challenges, we introduce MeshCoder, a novel framework that reconstructs complex 3D objects from point clouds into editable Blender Python scripts. We develop a comprehensive set of expressive Blender Python APIs capable of synthesizing intricate geometries. Leveraging these APIs, we construct a large-scale paired object-code dataset, where the code for each object is decomposed into distinct semantic parts. Subsequently, we train a multimodal large language model (LLM) that translates 3D point clouds into executable Blender Python scripts. Our approach not only achieves superior performance in shape-to-code reconstruction tasks but also facilitates intuitive geometric and topological editing through convenient code modifications. Furthermore, our code-based representation enhances the reasoning capabilities of LLMs in 3D shape understanding tasks. Together, these contributions establish MeshCoder as a powerful and flexible solution for programmatic 3D shape reconstruction and understanding.