Find Any Part in 3D

Ziqi Ma, Yisong Yue, Georgia Gkioxari

2024-11-26

Summary

This paper introduces Find3D, a model that segments any part of any 3D object from an open-ended text query, rather than being restricted to a fixed set of object categories or part labels.

What's the problem?

Existing methods for segmenting parts of 3D objects are limited to specific object categories and predefined part vocabularies. This makes them hard to apply to the wide variety of objects and free-form part queries that open-world settings require.

What's the solution?

Find3D addresses this by training a general-purpose point embedding model directly on large collections of 3D assets from the internet, without human annotation. A data engine built on 2D foundation models labels parts automatically, and a contrastive objective aligns point embeddings with text embeddings, so the trained model can segment parts from any text query, zero-shot, on objects it has never seen. It is also fast: the paper reports up to a 3x improvement in mIoU over the next best method while running 6x to over 300x faster than existing baselines.
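At query time, this kind of open-vocabulary model typically scores every point of the object against an embedding of the text query. The sketch below illustrates that mechanism in PyTorch; the function name, tensor shapes, and fixed similarity threshold are assumptions for illustration, not Find3D's actual interface.

```python
import torch
import torch.nn.functional as F

def segment_part(point_embeddings: torch.Tensor,
                 text_embedding: torch.Tensor,
                 threshold: float = 0.3) -> torch.Tensor:
    """Label each point by cosine similarity to a text query embedding.

    point_embeddings: (N, D) per-point features from the 3D backbone.
    text_embedding:   (D,) embedding of the query, e.g. "chair leg".
    Returns a boolean mask of shape (N,) selecting the queried part.
    """
    points = F.normalize(point_embeddings, dim=-1)  # unit-norm rows
    query = F.normalize(text_embedding, dim=0)      # unit-norm query
    similarity = points @ query                     # (N,) cosine scores
    return similarity > threshold                   # hypothetical cutoff
```

Because the per-point features are computed once per object, the same object can then be queried with many different part names at negligible extra cost.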

Why it matters?

This research is important because it enhances our ability to analyze and manipulate 3D objects in various applications, such as robotics, gaming, and virtual reality. By allowing users to identify and work with any part of a 3D object through simple text queries, Find3D opens up new possibilities for creativity and functionality in digital environments.

Abstract

We study open-world part segmentation in 3D: segmenting any part in any object based on any text query. Prior methods are limited in object categories and part vocabularies. Recent advances in AI have demonstrated effective open-world recognition capabilities in 2D. Inspired by this progress, we propose an open-world, direct-prediction model for 3D part segmentation that can be applied zero-shot to any object. Our approach, called Find3D, trains a general-category point embedding model on large-scale 3D assets from the internet without any human annotation. It combines a data engine, powered by foundation models for annotating data, with a contrastive training method. We achieve strong performance and generalization across multiple datasets, with up to a 3x improvement in mIoU over the next best method. Our model is 6x to over 300x faster than existing baselines. To encourage research in general-category open-world 3D part segmentation, we also release a benchmark for general objects and parts. Project website: https://ziqi-ma.github.io/find3dsite/
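The contrastive training the abstract mentions pairs part-level point embeddings with text embeddings of their labels. Below is a minimal sketch of one common formulation, a symmetric InfoNCE loss; the paper's exact objective, batch construction, and temperature may differ, so treat this as an assumption-laden illustration rather than the method itself.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(part_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over matched (part, text) embedding pairs.

    part_emb: (B, D) pooled embeddings for B annotated parts.
    text_emb: (B, D) embeddings of the matching text labels.
    """
    part = F.normalize(part_emb, dim=-1)
    text = F.normalize(text_emb, dim=-1)
    logits = part @ text.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(part.size(0), device=logits.device)
    # Matched pairs lie on the diagonal; pull those together and
    # push apart all mismatched part/text combinations in the batch.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```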