Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MRGen: Segmentation Data Engine for Underrepresented MRI Modalities

Created by
  • Haebom

Authors

Haoning Wu, Ziheng Zhao, Ya Zhang, Yanfeng Wang, Weidi Xie

Outline

This paper explores how generative models can compensate for the scarcity of annotated data when training segmentation models for rare but clinically important imaging modalities, focusing on MRI modalities that lack mask annotations. It makes three main contributions. First, it introduces MRGen-DB, a large-scale radiology image-text dataset with rich metadata (modality labels, attributes, regions, and organ information), a subset of which carries pixel-wise mask annotations. Second, it presents MRGen, a diffusion-based data engine conditioned on text prompts and segmentation masks. MRGen generates realistic images for a range of MRI modalities that lack mask annotations, which makes segmentation training feasible in these low-resource settings. Third, extensive experiments across multiple modalities show that the high-quality synthetic data from MRGen significantly improves segmentation performance on unannotated modalities. The work thereby extends segmentation to scenarios where manual annotations are difficult to obtain, addressing an important gap in medical image analysis. The authors state that the code, models, and data will be made publicly available.
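The data-engine idea described above (a text prompt plus a segmentation mask go in, an aligned synthetic image comes out, and the conditioning mask doubles as the training label) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `build_prompt`, `placeholder_generator`, `synthesize_training_set`, the prompt template, and the example modality and organ names are all hypothetical, and the generator is a simple noise-painting stand-in rather than a diffusion model.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SyntheticSample:
    image: np.ndarray  # (H, W) float32 synthetic image for the target modality
    mask: np.ndarray   # (H, W) uint8 conditioning mask, reused as the label
    prompt: str        # text condition describing the target modality


def build_prompt(modality: str, region: str, organs: list[str]) -> str:
    # Hypothetical prompt template; the actual metadata schema is not given here.
    return f"{modality} MRI, {region}, showing {', '.join(organs)}"


def placeholder_generator(prompt: str, mask: np.ndarray,
                          rng: np.random.Generator) -> np.ndarray:
    # Stand-in for a text- and mask-conditioned generator (NOT MRGen's API).
    # The prompt is accepted but ignored by this toy version. Each labeled
    # region gets its own mean intensity plus noise, so the output stays
    # spatially aligned with the mask.
    image = rng.normal(0.1, 0.02, size=mask.shape)
    for label in np.unique(mask):
        if label == 0:  # 0 is treated as background
            continue
        region = mask == label
        image[region] = rng.normal(0.3 + 0.1 * float(label), 0.02,
                                   size=int(region.sum()))
    return np.clip(image, 0.0, 1.0).astype(np.float32)


def synthesize_training_set(masks, modality, region, organs, seed=0):
    # Pair every available mask with a prompt for the unannotated target
    # modality and collect (image, mask) pairs for segmentation training.
    rng = np.random.default_rng(seed)
    prompt = build_prompt(modality, region, organs)
    return [SyntheticSample(placeholder_generator(prompt, m, rng), m, prompt)
            for m in masks]


# Usage: one toy mask with two labeled regions (labels and names are illustrative).
mask = np.zeros((64, 64), dtype=np.uint8)
mask[16:32, 16:32] = 1
mask[40:50, 40:50] = 2
samples = synthesize_training_set([mask], "T2-weighted", "abdomen",
                                  ["liver", "kidney"])
```

The point of the sketch is the pairing: because the image is generated from the mask, every synthetic sample arrives with a pixel-aligned label, so a segmentation model can be trained for a modality that has no manual annotations of its own.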

Takeaways, Limitations

Takeaways:
Helps address the shortage of annotated data by generating synthetic data for training medical image segmentation models.
Supports further research by providing MRGen-DB, a large-scale medical image-text dataset.
Synthesizes realistic images for a variety of MRI modalities with MRGen, a diffusion-based model.
Demonstrates experimentally that segmentation performance improves on modalities lacking annotations.
Limitations:
Stricter criteria are needed for the qualitative assessment of the generated synthetic data.
Generalization performance needs further validation on real clinical data.
Generalization may be poor on datasets biased toward specific modalities.
The computational cost and time required to generate synthetic data need to be taken into account.