CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing

Yu Yuan, Shizhao Sun, Qi Liu, Jiang Bian

2025-02-12

Summary

This paper introduces CAD-Editor, a method for editing 3D computer-aided design (CAD) models using plain-text instructions. It works like an assistant that reads your written directions and applies the requested changes to an existing 3D model.

What's the problem?

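Editing CAD models is complicated and requires specialized skills. Existing AI tools fall short in one of two ways: tools that generate design variations cannot be steered by text instructions, and tools that generate CAD models from text build new designs without treating the original model as a constraint. It is also hard to collect enough training examples, because each example needs an original model, an edited model, and an instruction that accurately describes the change between them.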

What's the solution?

The researchers created CAD-Editor, which works in two steps. First, it locates the parts of the CAD model that need to change based on the text instruction. Then, it fills in those parts with the appropriate edits. Large language models handle both steps. To train the system, the researchers built an automated pipeline that creates training examples: design variation models generate pairs of original and edited CAD models, and large vision-language models describe the differences between each pair as an editing instruction.
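The two steps can be pictured with a toy example. This is a minimal sketch under assumptions, not the paper's implementation: the CAD model is represented as a flat list of string tokens, the `<mask>` placeholder and the `locate` and `infill` helpers are hypothetical names, and simple rules stand in for the language models that would make both decisions in the actual system.

```python
from typing import Callable, List

MASK = "<mask>"  # hypothetical placeholder token chosen for this sketch


def locate(tokens: List[str], should_edit: Callable[[str], bool]) -> List[str]:
    """Step 1 (locate): replace every token that needs editing with MASK.

    In the real system a language model would decide this from the text
    instruction; here a plain predicate stands in for that decision.
    """
    return [MASK if should_edit(tok) else tok for tok in tokens]


def infill(masked: List[str], fills: List[str]) -> List[str]:
    """Step 2 (infill): replace each MASK, in order, with the next fill value."""
    if masked.count(MASK) != len(fills):
        raise ValueError("number of fills must match number of masked tokens")
    remaining = iter(fills)
    return [next(remaining) if tok == MASK else tok for tok in masked]


# Toy instruction: "make the extrusion twice as tall".
original = ["circle", "r=5", "extrude", "h=10"]
masked = locate(original, lambda tok: tok.startswith("h="))
edited = infill(masked, ["h=20"])

print(masked)  # ['circle', 'r=5', 'extrude', '<mask>']
print(edited)  # ['circle', 'r=5', 'extrude', 'h=20']
```

Splitting the task this way means the second step only has to produce content for the masked positions, while every unmasked token of the original model is carried over unchanged.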

Why does it matter?

This matters because it could make 3D design easier and more accessible. Instead of learning complex CAD software, people could type the change they want and let the AI apply it. This could speed up design work in fields such as product design and architecture, and let more people take part in 3D modeling without extensive training.

Abstract

Computer Aided Design (CAD) is indispensable across various industries. Text-based CAD editing, which automates the modification of CAD models based on textual instructions, holds great potential but remains underexplored. Existing methods primarily focus on design variation generation or text-based CAD generation, either lacking support for text-based control or neglecting existing CAD models as constraints. We introduce CAD-Editor, the first framework for text-based CAD editing. To address the challenge of demanding triplet data with accurate correspondence for training, we propose an automated data synthesis pipeline. This pipeline utilizes design variation models to generate pairs of original and edited CAD models and employs Large Vision-Language Models (LVLMs) to summarize their differences into editing instructions. To tackle the composite nature of text-based CAD editing, we propose a locate-then-infill framework that decomposes the task into two focused sub-tasks: locating regions requiring modification and infilling these regions with appropriate edits. Large Language Models (LLMs) serve as the backbone for both sub-tasks, leveraging their capabilities in natural language understanding and CAD knowledge. Experiments show that CAD-Editor achieves superior performance both quantitatively and qualitatively.
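The data synthesis pipeline described in the abstract can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the `EditTriplet` type, the `synthesize_triplet` function, and the token-list representation of a CAD model are hypothetical, and the two toy callables merely stand in for a design variation model and a Large Vision-Language Model.

```python
from dataclasses import dataclass
from typing import Callable, List

CadSequence = List[str]  # assumed toy representation of a CAD model


@dataclass(frozen=True)
class EditTriplet:
    """One training example: instruction, original model, edited model."""
    instruction: str
    original: CadSequence
    edited: CadSequence


def synthesize_triplet(
    original: CadSequence,
    vary: Callable[[CadSequence], CadSequence],
    describe: Callable[[CadSequence, CadSequence], str],
) -> EditTriplet:
    """Build a triplet: generate a variation, then describe the difference."""
    edited = vary(original)                   # stand-in for a design variation model
    instruction = describe(original, edited)  # stand-in for an LVLM summary
    return EditTriplet(instruction, original, edited)


def toy_vary(seq: CadSequence) -> CadSequence:
    """Toy variation: double the extrusion height if it is 10."""
    return ["h=20" if tok == "h=10" else tok for tok in seq]


def toy_describe(before: CadSequence, after: CadSequence) -> str:
    """Toy description: list every position where the two sequences differ."""
    changes = [f"{a} -> {b}" for a, b in zip(before, after) if a != b]
    return "Change " + ", ".join(changes) if changes else "No change"


triplet = synthesize_triplet(
    ["circle", "r=5", "extrude", "h=10"], toy_vary, toy_describe
)
print(triplet.instruction)  # Change h=10 -> h=20
```

Because the instruction is derived from the pair after the fact, the three parts of each example correspond to one another by construction, which is the property the abstract calls out as the hard requirement for training data.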
