Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA

Created by
  • Haebom

Authors

Ojonugwa Oluwafemi Ejiga Peter, Md Mahmudur Rahman, Fahmi Khalifa

Outline

This paper presents an approach for generating dynamic, scalable, and accurate medical images from text descriptions, developed for the MedVQA-GI challenge. Existing methods are limited to static image analysis and cannot generate medical images dynamically from text. To address this, the authors integrate fine-tuned Stable Diffusion and DreamBooth models with Low-Rank Adaptation (LoRA) to generate high-quality medical images. The system covers two subtasks: Image Synthesis (IS) and Optimal Prompt Generation (OPG). In the evaluation, Stable Diffusion produced higher-quality and more diverse images than CLIP and DreamBooth + LoRA. It achieved the lowest FID scores (0.099 for single-center, 0.064 for multi-center, and 0.067 for combined) and the highest Inception Score (2.327, averaged across datasets). The authors expect this work to contribute to the advancement of AI-based medical diagnosis.
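As a rough illustration of the two metrics cited above, the sketch below computes an Inception Score from per-image class probabilities and a simplified Fréchet distance between two feature sets. This is not the authors' evaluation code, which this summary does not describe. The function names are hypothetical, and the distance assumes diagonal covariances so that it needs only the Python standard library. The standard FID uses Inception-v3 features and full covariance matrices.

```python
import math
from statistics import mean, variance


def inception_score(class_probs):
    """Inception Score from per-image class probabilities p(y|x).

    class_probs: list of rows, each a probability distribution over classes.
    Returns exp(mean over images of KL(p(y|x) || p(y))), where p(y) is the
    average of the rows. Higher means more confident and more diverse.
    """
    n = len(class_probs)
    num_classes = len(class_probs[0])
    marginal = [sum(row[j] for row in class_probs) / n for j in range(num_classes)]
    total_kl = 0.0
    for row in class_probs:
        # Terms with p == 0 contribute nothing to the KL divergence.
        total_kl += sum(
            p * math.log(p / marginal[j]) for j, p in enumerate(row) if p > 0
        )
    return math.exp(total_kl / n)


def frechet_distance_diagonal(real_feats, fake_feats):
    """Frechet distance between two feature sets, ASSUMING diagonal covariance.

    Simplification of FID = |mu_r - mu_f|^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2)):
    with diagonal S, the trace term reduces to a per-dimension sum.
    Each input is a list of equal-length feature vectors (at least two each).
    Lower means the generated features are closer to the real ones.
    """
    dist = 0.0
    for d in range(len(real_feats[0])):
        real_col = [f[d] for f in real_feats]
        fake_col = [f[d] for f in fake_feats]
        mu_r, mu_f = mean(real_col), mean(fake_col)
        var_r, var_f = variance(real_col), variance(fake_col)  # sample variance
        dist += (mu_r - mu_f) ** 2 + var_r + var_f - 2 * math.sqrt(var_r * var_f)
    return dist


if __name__ == "__main__":
    # Toy inputs for illustration only; real evaluations use network outputs.
    print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
    print(frechet_distance_diagonal([[0, 0], [2, 2]], [[3, 0], [5, 2]]))
```

Under these definitions, a lower Fréchet distance and a higher Inception Score both indicate better generated images, which is the direction of the results reported for Stable Diffusion.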

Takeaways, Limitations

Takeaways:
Presents a method for dynamically generating high-quality medical images from text descriptions.
Shows that Stable Diffusion outperforms CLIP and DreamBooth + LoRA on FID and Inception Score for medical image generation.
May contribute to improving AI-based medical diagnostic technology.
Limitations:
Further research is needed, including model improvement, dataset augmentation, and ethical considerations for clinical application.