Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification

Created by
  • Haebom

Author

Cristiano Patrício, Isabel Rio-Torto, Jaime S. Cardoso, Luís F. Teixeira, João C. Neves

Outline

To address the lack of annotated data and poor interpretability, two key challenges limiting the adoption of deep learning in medical image analysis, this paper proposes the Concept Bottleneck Vision-Language Model (CBVLM), a training-free approach built on Large Vision-Language Models (LVLMs). CBVLM first prompts the LVLM to determine whether each predefined concept is present in an image, then classifies the image based on these concept predictions, which makes the decision interpretable. It also integrates a retrieval module that selects the most suitable annotated examples for in-context learning, reducing annotation costs. Extensive experiments on four medical datasets and twelve LVLMs demonstrate that CBVLM outperforms existing methods.
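
The two-stage pipeline described above can be made concrete with a short sketch. The following Python code is a minimal illustration under assumptions, not the authors' implementation: the concept list, the embed function, and the ask_lvlm callable are hypothetical placeholders, and the stubs pass only text paths where the real system would send the actual image to the LVLM.

```python
from dataclasses import dataclass

# Hypothetical concept set for a dermoscopy task; the real concept
# definitions come from each dataset's annotation schema.
CONCEPTS = ["asymmetry", "irregular border", "atypical pigment network"]

@dataclass
class Example:
    image_path: str
    concept_labels: dict  # concept name -> present (bool)
    diagnosis: str

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb + 1e-9)

def retrieve_examples(query_path, pool, embed, k=2):
    """Retrieval module: rank the annotated pool by image-embedding
    similarity to the query and keep the top-k as demonstrations."""
    q = embed(query_path)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex.image_path)),
                    reverse=True)
    return ranked[:k]

def grounding_stage(query_path, shots, ask_lvlm):
    """Stage 1 (concept grounding): one yes/no query per concept,
    prefixed with the retrieved examples as in-context demonstrations."""
    predicted = {}
    for concept in CONCEPTS:
        demos = "\n".join(
            f"Image: {ex.image_path} -> {concept}: "
            f"{'yes' if ex.concept_labels[concept] else 'no'}"
            for ex in shots
        )
        prompt = (f"{demos}\nImage: {query_path} -> "
                  f"Is '{concept}' present? Answer yes or no.")
        predicted[concept] = ask_lvlm(prompt).strip().lower().startswith("yes")
    return predicted

def classification_stage(predicted_concepts, ask_lvlm):
    """Stage 2 (classification): the LVLM sees only the predicted concepts,
    so the final decision is traceable to human-readable findings."""
    findings = ", ".join(f"{c}: {'present' if p else 'absent'}"
                         for c, p in predicted_concepts.items())
    return ask_lvlm(f"Findings: {findings}. What is the most likely diagnosis?")

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end:
    # a fake embedding and a fake LVLM response.
    embed = lambda path: [float(len(path) % 7), 1.0, 0.5]
    ask_lvlm = lambda prompt: "yes, melanoma"
    pool = [Example("ex1.png", {c: True for c in CONCEPTS}, "melanoma"),
            Example("ex2.png", {c: False for c in CONCEPTS}, "nevus")]
    shots = retrieve_examples("query.png", pool, embed, k=1)
    concepts = grounding_stage("query.png", shots, ask_lvlm)
    print(concepts, "->", classification_stage(concepts, ask_lvlm))
```

Because the classifier only ever sees the concept predictions, any final label can be explained by pointing to which concepts the LVLM judged present or absent, which is the source of the interpretability claim.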

Takeaways, Limitations

Takeaways:
Significantly reduces annotation costs by leveraging the few-shot, in-context learning ability of LVLMs.
Improves model interpretability through concept-based explanations.
Achieves consistent performance across diverse medical datasets without any additional training.
Outperforms existing Concept Bottleneck Models (CBMs) and task-specific supervised methods.
Limitations:
Performance depends heavily on the underlying LVLM; the limitations of the chosen model carry over to CBVLM.
The quality of the concept definitions and of the retrieval module strongly influences the results.
LVLM inference can be computationally expensive.