Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

XGeM: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation

Created by
  • Haebom

Author

Daniele Molino, Francesco Di Feola, Eliodoro Faiella, Deborah Fazzini, Domiziana Santucci, Linlin Shen, Valerio Guarrasi, Paolo Soda

Outline

XGeM is a 6.7-billion-parameter multimodal generative model proposed to address three challenges of AI adoption in medical imaging: data scarcity, privacy concerns, and the need for robust multimodal integration. It builds a shared latent space via contrastive learning and introduces a novel multi-prompt training strategy that conditions generation on an arbitrary subset of input modalities, enabling synthesis in any direction across modalities. The authors compare XGeM with competing models on the MIMIC-CXR dataset and assess the realism and clinical relevance of the generated data through a Visual Turing Test with expert radiologists. They demonstrate that XGeM can help address healthcare data challenges such as anonymization, class imbalance, and data scarcity.
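The multi-prompt idea described above can be illustrated with a minimal sketch: per-modality encoders map inputs into a shared latent space, and a conditioning vector is fused from whatever subset of modalities happens to be available. All names, dimensions, and the random-projection "encoders" below are hypothetical stand-ins, not XGeM's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 8

# Fixed random projections standing in for contrastively aligned
# per-modality encoders (hypothetical; the real model uses deep networks).
projections = {
    "frontal_xray": rng.standard_normal((4, EMBED_DIM)),
    "lateral_xray": rng.standard_normal((4, EMBED_DIM)),
    "report": rng.standard_normal((4, EMBED_DIM)),
}

def encode(modality, features):
    """Map raw modality features into the shared latent space."""
    z = features @ projections[modality]
    return z / np.linalg.norm(z)  # unit-normalize, common in contrastive setups

def multi_prompt_condition(available):
    """Fuse an arbitrary subset of modality embeddings into a single
    conditioning vector by averaging in the shared latent space."""
    zs = [encode(m, x) for m, x in available.items()]
    return np.mean(zs, axis=0)

# During training, modalities are randomly dropped so the generator learns
# to condition on any subset (the multi-prompt strategy, as sketched here).
sample = {m: rng.standard_normal(4) for m in projections}
subset = {m: x for m, x in sample.items() if rng.random() < 0.7} or sample
cond = multi_prompt_condition(subset)
print(cond.shape)  # (8,)
```

The random masking in the last lines is the key training-time trick: because the fusion averages whatever embeddings are present, the same conditioning interface serves any input-to-output modality combination at inference.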

Takeaways, Limitations

Takeaways:
  • A robust 6.7-billion-parameter multimodal generative model supporting flexible conversion between arbitrary medical data modalities.
  • Contrastive learning and the multi-prompt training strategy integrate multiple modalities while maintaining clinical consistency.
  • Contributes to addressing medical data anonymization, class imbalance, and data scarcity.
  • The realism and clinical validity of the generated data are validated through expert evaluation.
Limitations:
  • Specific limitations are not explicitly discussed in the paper; further research is needed to improve the model's performance and generalization.
  • Possible dependence on a single dataset (MIMIC-CXR); generalization to other datasets needs to be verified.
  • The large 6.7-billion-parameter model raises computing-resource and accessibility concerns.