MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples

Xurui Li, Feng Xue, Yu Zhou

2025-11-14

Summary

This paper introduces a new method, called MuSc-V2, for finding defects in products – like scratches or imperfections – without needing any examples of what those defects look like beforehand. It works with both 2D images and 3D models, or even a combination of both.

What's the problem?

Existing methods for finding defects without labeled examples often struggle because they don't fully utilize the fact that normal parts of a product tend to be very similar to each other, while defects are usually unique and stand out. They also have trouble accurately representing 3D shapes, leading to false alarms, and don't effectively combine information from 2D images and 3D models.

What's the solution?

MuSc-V2 tackles this by first improving how it represents 3D shapes using a technique called Iterative Point Grouping, which reduces false alarms on discontinuous surfaces. Then, it uses a method called Similarity Neighborhood Aggregation with Multi-Degrees to fuse neighborhood cues from 2D and 3D data into more discriminative multi-scale patch features. The core idea is a 'Mutual Scoring Mechanism' in which the unlabeled samples within each modality 'score' each other based on similarity, followed by a 'Cross-modal Anomaly Enhancement' step that fuses the 2D and 3D scores so that a defect missed by one modality can be recovered from the other. Finally, a re-scoring step suppresses false classifications based on how similar a sample is to more representative samples.
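The mutual scoring idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mutual_scores`, the `keep_fraction` parameter, the use of cosine distance, and the choice to average the smallest distances are all assumptions made for illustration. It only shows the underlying property that a normal patch finds close matches in other unlabeled samples, while an anomalous patch does not.

```python
import numpy as np


def mutual_scores(features, keep_fraction=0.3):
    """Score every patch of every sample against all other samples.

    features: array of shape (N, P, D) holding D-dimensional features
              for P patches in each of N unlabeled samples.
    keep_fraction: hypothetical parameter; fraction of the other samples
              whose smallest distances are averaged into the final score.
    Returns an (N, P) array where higher values mean "more anomalous".
    """
    n, p, _ = features.shape
    if n < 2:
        raise ValueError("mutual scoring needs at least two samples")

    # L2-normalize so that a dot product equals cosine similarity.
    unit = features / np.linalg.norm(features, axis=-1, keepdims=True)

    # Number of per-sample scores to keep for each patch (at least one).
    k = max(1, int(round(keep_fraction * (n - 1))))

    scores = np.zeros((n, p))
    for i in range(n):
        per_sample = []
        for j in range(n):
            if j == i:
                continue
            # Cosine distance from each patch of sample i to each patch of j.
            dist = 1.0 - unit[i] @ unit[j].T  # shape (P, P)
            # Sample j "scores" each patch of i by its nearest-patch distance.
            per_sample.append(dist.min(axis=1))
        # Sort the scores given by the other samples, smallest first.
        per_sample = np.sort(np.stack(per_sample, axis=0), axis=0)
        # A normal patch needs only a few good matches to get a low score.
        scores[i] = per_sample[:k].mean(axis=0)
    return scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prototypes = np.eye(8)[:4]                      # 4 "normal" patch types
    feats = prototypes + 0.01 * rng.normal(size=(5, 4, 8))
    feats[0, 2] = np.eye(8)[7]                      # one isolated patch
    patch_scores = mutual_scores(feats)
    image_scores = patch_scores.max(axis=1)         # sample-level score
    print(np.round(patch_scores, 3))
    print(np.round(image_scores, 3))
```

In this toy setup, taking the maximum patch score per sample gives a simple classification score, and the per-patch scores act as a coarse segmentation map. The actual method additionally involves IPG, SNAMD, cross-modal fusion, and re-scoring, none of which are modeled here.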

Why it matters?

This research is important because it improves the accuracy of finding defects in products without needing to manually label examples. The method reports a +23.7% AP gain on the MVTec 3D-AD dataset and a +19.3% gain on the Eyecandies dataset over previous zero-shot methods, and it even outperforms most few-shot methods. This makes it easier and cheaper to automate quality control in manufacturing, and because the method works on both full datasets and smaller subsets, it can be adapted across different product lines.

Abstract

Zero-shot anomaly classification (AC) and segmentation (AS) methods aim to identify and outline defects without using any labeled samples. In this paper, we reveal a key property that is overlooked by existing methods: normal image patches across industrial products typically find many other similar patches, not only in 2D appearance but also in 3D shapes, while anomalies remain diverse and isolated. To explicitly leverage this discriminative property, we propose a Mutual Scoring framework (MuSc-V2) for zero-shot AC/AS, which flexibly supports a single modality (2D or 3D) or multimodal input. Specifically, our method begins by improving 3D representation through Iterative Point Grouping (IPG), which reduces false positives from discontinuous surfaces. Then we use Similarity Neighborhood Aggregation with Multi-Degrees (SNAMD) to fuse 2D/3D neighborhood cues into more discriminative multi-scale patch features for mutual scoring. The core comprises a Mutual Scoring Mechanism (MSM) that lets samples within each modality assign scores to each other, and Cross-modal Anomaly Enhancement (CAE) that fuses 2D and 3D scores to recover modality-specific missing anomalies. Finally, Re-scoring with Constrained Neighborhood (RsCon) suppresses false classification based on similarity to more representative samples. Our framework flexibly works on both the full dataset and smaller subsets with consistently robust performance, ensuring seamless adaptability across diverse product lines. With this framework, MuSc-V2 achieves significant performance improvements: a +23.7% AP gain on the MVTec 3D-AD dataset and a +19.3% boost on the Eyecandies dataset, surpassing previous zero-shot benchmarks and even outperforming most few-shot methods. The code will be available at https://github.com/HUST-SLOW/MuSc-V2.