VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
Sicheng Yang, Zhaohu Xing, Lei Zhu
2026-01-16
Summary
This paper introduces VQ-Seg, a semi-supervised method for medical image segmentation, evaluated on a new dataset of lung cancer CT scans and on other public benchmarks. It aims to make training segmentation models more reliable and accurate when only limited labeled data is available.
What's the problem?
Many semi-supervised segmentation methods rely on consistency learning: the model's internal features are perturbed during training, and its predictions are expected to stay stable under that perturbation. The perturbation is usually 'dropout', which randomly zeroes out features. However, dropout depends on a setting called the 'dropout rate', which is sensitive and hard to tune by hand. A poorly chosen rate gives weaker regularization and a less effective model.
What's the solution?
VQ-Seg tackles this problem by using 'vector quantization' to turn image features into discrete codes drawn from a codebook. Instead of randomly dropping features as dropout does, its Quantized Perturbation Module (QPM) shuffles the spatial locations of these codes in a controlled way. To limit the information lost during quantization, the method uses a two-branch design in which image reconstruction and segmentation share the same quantized features. It also adds a Post-VQ Feature Adapter (PFA), which uses guidance from a pre-trained foundation model to supplement the high-level semantic information lost during quantization.
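The quantize-then-shuffle idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flattened list-of-vectors layout, and the `ratio` parameter (the fraction of positions whose codes are shuffled) are assumptions made for illustration only.

```python
import random


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
        for f in features
    ]


def shuffle_indices(indices, ratio, rng):
    """Permute the codebook indices at a random fraction of spatial positions.

    `ratio` is a hypothetical knob controlling perturbation strength:
    0.0 leaves the index map untouched, 1.0 shuffles every position.
    """
    n = len(indices)
    k = round(ratio * n)
    positions = rng.sample(range(n), k)      # which positions take part
    values = [indices[p] for p in positions]
    rng.shuffle(values)                      # reorder codes among those positions
    out = list(indices)
    for p, v in zip(positions, values):
        out[p] = v
    return out


# Toy example: a 2x2 feature map flattened to 4 two-dimensional vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
features = [[0.1, 0.0], [0.9, 1.2], [2.1, 1.8], [1.1, 0.9]]

indices = quantize(features, codebook)
perturbed = shuffle_indices(indices, ratio=0.5, rng=random.Random(0))
perturbed_features = [codebook[i] for i in perturbed]  # look codes back up
```

In this sketch the shuffle only moves existing codes around, so the perturbed map contains exactly the same codebook entries as the original, and the single `ratio` value bounds how many positions can change.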
Why does it matter?
This research matters because it offers a more stable and effective way to train medical image segmentation models when labeled data is limited. VQ-Seg replaces dropout, and with it the need for careful dropout-rate tuning, and it outperforms prior methods on the lung cancer dataset and on public benchmarks. More accurate segmentation of lung tumors in CT scans could in turn support diagnosis and treatment planning, and the approach could be applied to other medical imaging tasks as well.
Abstract
Consistency learning with feature perturbation is a widely used strategy in semi-supervised medical image segmentation. However, many existing perturbation methods rely on dropout and thus require careful manual tuning of the dropout rate, a sensitive hyperparameter that is often difficult to optimize and may lead to suboptimal regularization. To overcome this limitation, we propose VQ-Seg, the first approach to employ vector quantization (VQ) to discretize the feature space and introduce a novel and controllable Quantized Perturbation Module (QPM) that replaces dropout. Our QPM perturbs discrete representations by shuffling the spatial locations of codebook indices, enabling effective and controllable regularization. To mitigate potential information loss caused by quantization, we design a dual-branch architecture where the post-quantization feature space is shared by both image reconstruction and segmentation tasks. Moreover, we introduce a Post-VQ Feature Adapter (PFA) to incorporate guidance from a foundation model (FM), supplementing the high-level semantic information lost during quantization. Furthermore, we collect a large-scale Lung Cancer (LC) dataset comprising 828 CT scans annotated for central-type lung carcinoma. Extensive experiments on the LC dataset and other public benchmarks demonstrate the effectiveness of our method, which outperforms state-of-the-art approaches. Code is available at https://github.com/script-Yang/VQ-Seg.