SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation
Jin Ye, Ying Chen, Yanjun Li, Haoyu Wang, Zhongying Deng, Ziyan Huang, Yanzhou Su, Chenglong Ma, Yuanfeng Ji, Junjun He
2024-11-26

Summary
This paper presents SegBook, a large-scale benchmark and cookbook for volumetric medical image segmentation that combines 87 public datasets with a simple baseline model, giving researchers a common footing for training and evaluating their methods.
What's the problem?
Volumetric medical image segmentation, which involves identifying and outlining anatomical structures in 3D medical images such as CT scans, lacks large, diverse benchmarks. Although powerful models pre-trained on full-body CT data already exist, it is unclear under what conditions they transfer to other modalities and targets, which limits researchers' ability to train models effectively and to evaluate them consistently across tasks and imaging types.
What's the solution?
To address this issue, the authors collected 87 public datasets that vary in imaging modality (such as CT and MRI), anatomical target, and sample size. Rather than designing a new architecture, they adopted STU-Net, a representative model pre-trained on full-body CT at multiple scales, as a simple baseline for fine-tuning and benchmarking. This setup lets researchers measure how well knowledge transfers from one kind of data to another, for example from CT-trained models to MRI tasks, or from anatomical structures to lesions. The experiments also show how dataset size affects fine-tuning: gains are larger on small and large datasets than on medium-sized ones.
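To make that workflow concrete, below is a minimal sketch of the fine-tuning recipe the paper describes: load a backbone pre-trained on full-body CT, swap its segmentation head for the downstream task's label set, and fine-tune on the new (e.g., MRI) data. The TinySegNet3D class, checkpoint path, and label counts are illustrative placeholders, not the paper's actual STU-Net or nnU-Net code.

```python
import torch
import torch.nn as nn

class TinySegNet3D(nn.Module):
    """Stand-in for a 3D segmentation backbone such as STU-Net."""
    def __init__(self, in_channels: int = 1, num_classes: int = 105):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.InstanceNorm3d(16),
            nn.LeakyReLU(inplace=True),
        )
        # Per-task head: replaced when the downstream label set changes.
        self.head = nn.Conv3d(16, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

# 1. Build the model with the pre-training label count and load CT weights.
model = TinySegNet3D(num_classes=105)  # 105 is an illustrative class count
# state = torch.load("stunet_fullbody_ct.pth")  # hypothetical checkpoint path
# model.load_state_dict(state)

# 2. Swap the head for the downstream task (e.g., 3 MRI target classes).
num_downstream_classes = 3
model.head = nn.Conv3d(16, num_downstream_classes, kernel_size=1)

# 3. Fine-tune end to end on the downstream data.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.99)
criterion = nn.CrossEntropyLoss()  # often paired with a Dice loss in this field

for images, labels in []:  # replace with a real DataLoader of 3D patches
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Swapping only the output head while keeping the pre-trained encoder is the standard way to reuse a model whose pre-training label set differs from the downstream task's.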
Why it matters?
This research is important because it provides a valuable resource for the medical imaging community, enabling better training and evaluation of segmentation models. By offering a large-scale benchmark and insights into transfer learning, SegBook aims to enhance the accuracy and efficiency of medical image analysis, ultimately improving patient care through better diagnostic tools.
Abstract
Computed Tomography (CT) is one of the most popular modalities for medical imaging. To date, CT images have contributed the largest publicly available datasets for volumetric medical segmentation tasks, covering full-body anatomical structures. Large amounts of full-body CT images provide the opportunity to pre-train powerful models, e.g., STU-Net pre-trained in a supervised fashion, to segment numerous anatomical structures. However, it remains unclear under which conditions these pre-trained models can be transferred to various downstream medical segmentation tasks, particularly those involving other modalities and diverse targets. To address this problem, a large-scale benchmark for comprehensive evaluation is crucial for finding these conditions. Thus, we collected 87 public datasets varying in modality, target, and sample size to evaluate the transfer ability of full-body CT pre-trained models. We then employed a representative model, STU-Net at multiple model scales, to conduct transfer learning across modalities and targets. Our experimental results show that (1) there may be a bottleneck effect concerning dataset size in fine-tuning, with greater improvement on both small- and large-scale datasets than on medium-sized ones; (2) models pre-trained on full-body CT demonstrate effective modality transfer, adapting well to other modalities such as MRI; and (3) pre-training on full-body CT not only supports strong performance in structure segmentation but also shows efficacy in lesion segmentation, indicating adaptability across target tasks. We hope that this large-scale open evaluation of transfer learning can direct future research in volumetric medical image segmentation.
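For reference, benchmarks of this kind are typically scored with the Dice similarity coefficient (DSC), which measures voxel overlap between a predicted mask and the ground truth. The snippet below is an illustrative implementation of the metric, not the paper's evaluation code.

```python
import torch

def dice_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> float:
    """Dice similarity coefficient for binary 3D masks of shape (D, H, W)."""
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().item()
    return (2.0 * intersection + eps) / (pred.sum().item() + target.sum().item() + eps)

# Example: two partially overlapping slabs give Dice = 0.75.
a = torch.zeros(32, 32, 32, dtype=torch.bool); a[4:20] = True
b = torch.zeros(32, 32, 32, dtype=torch.bool); b[8:24] = True
print(f"Dice = {dice_score(a, b):.3f}")
```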