MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling
Damian Boborzi, Phillip Mueller, Jonas Emrich, Dominik Schmid, Sebastian Mueller, Lars Mikelsons
2025-03-19
Summary
This paper introduces MeshFleet, a filtered and annotated dataset of 3D vehicles built to help AI models generate realistic 3D cars and trucks.
What's the problem?
AI models need high-quality data to learn how to create realistic 3D objects, but curated datasets for a specific domain such as vehicles are scarce, and filtering and annotating the data by hand is a significant bottleneck.
What's the solution?
The researchers created MeshFleet by filtering and annotating Objaverse-XL, a large public collection of 3D objects, so that only vehicles remain. A quality classifier, trained on a manually labeled subset using DINOv2 and SigLIP embeddings, automatically selects the high-quality 3D models.
Why it matters?
This work is important because it provides a resource for training AI models to generate realistic 3D vehicles, and such models could be used in engineering, design, and other fields.
Abstract
Generative models have recently made remarkable progress in the field of 3D objects. However, their practical application in fields like engineering remains limited since they fail to deliver the accuracy, quality, and controllability needed for domain-specific tasks. Fine-tuning large generative models is a promising perspective for making these models available in these fields. Creating high-quality, domain-specific 3D datasets is crucial for fine-tuning large generative models, yet the data filtering and annotation process remains a significant bottleneck. We present MeshFleet, a filtered and annotated 3D vehicle dataset extracted from Objaverse-XL, the most extensive publicly available collection of 3D objects. Our approach proposes a pipeline for automated data filtering based on a quality classifier. This classifier is trained on a manually labeled subset of Objaverse, incorporating DINOv2 and SigLIP embeddings, refined through caption-based analysis and uncertainty estimation. We demonstrate the efficacy of our filtering method through a comparative analysis against caption and image aesthetic score-based techniques and fine-tuning experiments with SV3D, highlighting the importance of targeted data selection for domain-specific 3D generative modeling.
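The filtering idea in the abstract (a quality classifier over image embeddings, combined with uncertainty estimation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the embeddings are synthetic random vectors standing in for DINOv2 and SigLIP features, the classifier is a plain logistic regression, uncertainty is approximated by the disagreement of a small bootstrap ensemble, and the thresholds `min_score` and `max_std` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logreg(X, y, lr=0.5, steps=300):
    """Fit a logistic-regression classifier with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(X @ w + b) - y      # dLoss/dlogit for cross-entropy
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def fit_ensemble(X, y, n_models=5, seed=0):
    """Train several classifiers on bootstrap resamples of the labeled set."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))  # sample with replacement
        models.append(train_logreg(X[idx], y[idx]))
    return models


def filter_by_quality(X, models, min_score=0.5, max_std=0.15):
    """Keep objects the ensemble rates as high quality and agrees on.

    min_score and max_std are hypothetical thresholds, not values from the paper.
    """
    probs = np.stack([sigmoid(X @ w + b) for w, b in models])
    mean, std = probs.mean(axis=0), probs.std(axis=0)
    keep = (mean >= min_score) & (std <= max_std)
    return keep, mean, std


def make_synthetic(n, dim_a=16, dim_b=8, seed=0):
    """Synthetic stand-in for two concatenated embedding types (assumption).

    High-quality objects (label 1) are shifted along every feature axis.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n).astype(float)
    emb_a = rng.normal(size=(n, dim_a)) + y[:, None]
    emb_b = rng.normal(size=(n, dim_b)) + y[:, None]
    return np.concatenate([emb_a, emb_b], axis=1), y


if __name__ == "__main__":
    X_labeled, y_labeled = make_synthetic(400, seed=1)   # "manually labeled" subset
    X_pool, _ = make_synthetic(200, seed=2)              # unlabeled candidate pool
    models = fit_ensemble(X_labeled, y_labeled)
    keep, mean, std = filter_by_quality(X_pool, models)
    print(f"kept {keep.sum()} of {len(keep)} candidate objects")
```

Objects with a borderline score or high ensemble disagreement are dropped rather than kept, which trades dataset size for cleanliness; in a real pipeline those uncertain cases would be natural candidates for additional manual labeling.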