MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities
Haoning Wu, Ziheng Zhao, Ya Zhang, Weidi Xie, Yanfeng Wang
2024-12-06

Summary
This paper introduces MRGen, a method that improves MRI segmentation by synthesizing annotated training data for MRI modalities that lack mask annotations, making it possible to train segmentation models without extensive manual labeling.
What's the problem?
Medical image segmentation is crucial for diagnosis and treatment planning, but annotated images (images paired with pixel-level masks labeling anatomical structures) are scarce, especially for certain MRI modalities. This scarcity makes it difficult to develop accurate segmentation models that can identify different anatomical regions in MRI images.
What's the solution?
The authors curated a large-scale radiology image-text dataset called MedGen-1M, which includes modality, attribute, region, and organ information, along with a subset of organ mask annotations. On top of it, they developed MRGen, a diffusion-based data engine that generates synthetic MR images conditioned on text prompts and organ masks. Because each synthetic image inherits the mask that conditioned it, the engine can produce annotated training samples for MRI modalities with no mask annotations of their own, extending segmentation models to those modalities.
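The data-engine idea described above (reuse masks from an annotated source modality, generate images of the unannotated target modality conditioned on those masks, and pair each synthetic image with its conditioning mask) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `generate_image` stub, function names, and prompt format are all assumptions, and a real system would call a trained text- and mask-conditioned diffusion model in place of the stub.

```python
import numpy as np

def generate_image(mask, prompt, rng):
    """Hypothetical stand-in for the generator. In MRGen this would be a
    diffusion model conditioned on the text prompt and the mask; here we
    return noise shaped like the mask so the pipeline stays runnable."""
    return rng.standard_normal(mask.shape).astype(np.float32)

def synthesize_training_set(masks, modality, organ, n_per_mask, seed=0):
    """Data-engine loop: prompt the generator for the target (unannotated)
    modality and pair every synthetic image with its conditioning mask."""
    rng = np.random.default_rng(seed)
    samples = []
    for mask in masks:
        # Illustrative prompt format; the dataset's actual text labels differ.
        prompt = f"{modality} MRI, abdomen, {organ} visible"
        for _ in range(n_per_mask):
            image = generate_image(mask, prompt, rng)
            samples.append((image, mask))  # ready for segmentation training
    return samples

# Three source-modality masks, two synthetic images each -> 6 annotated pairs.
masks = [np.zeros((64, 64), dtype=np.uint8) for _ in range(3)]
data = synthesize_training_set(masks, "T2-weighted", "liver", n_per_mask=2)
print(len(data))  # 6
```

The key point the sketch captures is that no registered image pairs are needed: annotations transfer to the new modality because generation is conditioned on the masks themselves.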
Why it matters?
This research is important because it addresses the lack of annotated medical images, which can hinder the development of effective diagnostic tools. By enabling the generation of synthetic training data, MRGen can help improve the accuracy and reliability of MRI segmentation models, ultimately leading to better patient care and outcomes in medical imaging.
Abstract
Medical image segmentation has recently demonstrated impressive progress with deep neural networks, yet the heterogeneous modalities and scarcity of mask annotations limit the development of segmentation models on unannotated modalities. This paper investigates a new paradigm for leveraging generative models in medical applications: controllably synthesizing data for unannotated modalities, without requiring registered data pairs. Specifically, we make the following contributions in this paper: (i) we collect and curate a large-scale radiology image-text dataset, MedGen-1M, comprising modality labels, attributes, region, and organ information, along with a subset of organ mask annotations, to support research in controllable medical image generation; (ii) we propose a diffusion-based data engine, termed MRGen, which enables generation conditioned on text prompts and masks, synthesizing MR images for diverse modalities lacking mask annotations, to train segmentation models on unannotated modalities; (iii) we conduct extensive experiments across various modalities, illustrating that our data engine can effectively synthesize training samples and extend MRI segmentation towards unannotated modalities.