Daily Arxiv

This page collects papers on artificial intelligence published around the world.
The summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning

Created by
  • Haebom

Authors

Zeyu Liu, Zhitian Hou, Guanghao Zhu, Zhijie Sang, Congkai Xie, Hongxia Yang

Outline

This paper addresses two key challenges in applying multimodal large language models (MLLMs) to healthcare: the sparsity of multimodal medical datasets and the limited reliability of Reinforcement Learning with Verifiable Rewards (RLVR) in the medical domain. To address them, the authors combine high-quality text reasoning data and general multimodal data with multimodal medical datasets during the Supervised Fine-Tuning (SFT) phase, which strengthens the model's foundational medical capabilities and restores its reasoning ability. Because the multimodal medical data is sparse, they also synthesize chain-of-thought (CoT) samples with injected reflective patterns, in addition to general CoT samples, to give the model early reflective reasoning capabilities. The resulting models, InfiMed-SFT-3B and InfiMed-RL-3B, achieve the highest performance on seven multimodal medical benchmarks. InfiMed-RL-3B reaches an average accuracy of 59.2%, outperforming the larger InternVL3-8B (57.3%) by 1.9 percentage points.
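The SFT data mixing described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the pool names, the example identifiers, and the mixing weights in `MIX_WEIGHTS` are all hypothetical, since the summary does not state the actual ratios.

```python
import random
from collections import Counter

# Hypothetical mixing weights; the summary does not give the real ratios.
MIX_WEIGHTS = {
    "medical_multimodal": 0.5,
    "general_multimodal": 0.25,
    "text_reasoning": 0.25,
}


def sample_sft_mixture(pools, weights, n, seed=0):
    """Draw n training examples, picking the source pool by weight each time."""
    rng = random.Random(seed)
    names = list(weights)
    probs = [weights[name] for name in names]
    batch = []
    for _ in range(n):
        source = rng.choices(names, weights=probs, k=1)[0]
        example = rng.choice(pools[source])
        batch.append((source, example))
    return batch


# Placeholder pools standing in for the three kinds of SFT data.
pools = {
    "medical_multimodal": ["med_0", "med_1"],
    "general_multimodal": ["gen_0", "gen_1"],
    "text_reasoning": ["txt_0", "txt_1"],
}

batch = sample_sft_mixture(pools, MIX_WEIGHTS, n=1000)
source_counts = Counter(source for source, _ in batch)
print(source_counts)
```

Sampling the source per example keeps every batch a blend of the three data types, which is one simple way to expose the model to general reasoning data while it learns from medical data.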

Takeaways, Limitations

Takeaways:
Combining text reasoning data, general multimodal data, and multimodal medical data at the SFT stage improved MLLM performance in the medical domain.
Injecting reflective patterns into CoT samples gives the model early reflective reasoning capabilities, which provides a foundation for RLVR training.
InfiMed-RL-3B outperformed the larger InternVL3-8B in average accuracy (59.2% vs. 57.3%).
The experiments offer insights that may help improve MLLM performance in the medical domain.
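To make the idea of a verifiable reward concrete, here is a minimal sketch for multiple-choice questions. It is an illustration only, not the reward used in the paper: the `<answer>...</answer>` tag format and the function name `verifiable_reward` are assumptions. The last tagged answer is scored, so a response that reflects and then revises its answer is judged on the revision.

```python
import re

# Assumed output format: the final choice is wrapped in <answer>...</answer>.
ANSWER_PATTERN = re.compile(r"<answer>\s*([A-Ea-e])\s*</answer>")


def verifiable_reward(response: str, gold: str) -> float:
    """Return 1.0 if the last tagged answer matches the gold label, else 0.0."""
    matches = ANSWER_PATTERN.findall(response)
    if not matches:
        # No parseable answer means no reward.
        return 0.0
    predicted = matches[-1].upper()
    return 1.0 if predicted == gold.strip().upper() else 0.0


reflective = (
    "The opacity suggests option A. <answer>A</answer> "
    "Wait, the lesion is in the left lobe, so I should revise. <answer>c</answer>"
)
print(verifiable_reward(reflective, "C"))
```

A rule-based check like this needs no learned reward model, which is what makes the reward verifiable, but it only works when the task has an unambiguous gold answer.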
Limitations:
The paper may lack detailed information on the specific dataset composition and synthesis methodology.
Further validation of the generalizability of the proposed methodology is needed.
The safety and ethical aspects of the model may not have been sufficiently considered.