Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

EVM-Fusion: An Explainable Vision Mamba Architecture with Neural Algorithmic Fusion

Created by
  • Haebom

Authors

Zichuan Yang, Yongzhi Wang

Outline

This paper presents EVM-Fusion, an Explainable Vision Mamba architecture, to improve the accuracy, interpretability, and generalizability of medical image classification. EVM-Fusion uses a multi-path design with DenseNet- and U-Net-based paths, each enhanced by a Vision Mamba (Vim) module. The resulting features are integrated dynamically in a two-step fusion process: cross-modal attention, followed by an iterative Neural Algorithmic Fusion (NAF) block. Explainability is built into the model through path-specific spatial attention, Vim Δ-value maps, squeeze-and-excitation (SE) attention on the original features, and cross-modal attention weights. On a diverse nine-class, multi-institutional medical image dataset, the model reaches 99.75% test accuracy, which the authors present as evidence of its potential for reliable AI in medical diagnosis.
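The two-step fusion described above can be pictured with a small, parameter-free sketch. This is not the authors' implementation: the function names (`cross_path_attention`, `iterative_fusion`), the toy feature vectors, and the similarity-based re-weighting rule are all assumptions made for illustration. In particular, the NAF block in the paper is a learned module, whereas the loop below only mimics the general idea of refining a fused state over several iterations.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_path_attention(features):
    """Step 1 (illustrative): each path's feature vector attends over all
    path vectors using scaled dot-product attention. Returns the attended
    vectors and the attention weights, which can be inspected directly."""
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended, weights = [], []
    for query in features:
        w = softmax([dot(query, key) / scale for key in features])
        mixed = [sum(w[j] * features[j][d] for j in range(len(features)))
                 for d in range(dim)]
        attended.append(mixed)
        weights.append(w)
    return attended, weights

def iterative_fusion(features, steps=3):
    """Step 2 (hypothetical stand-in for the NAF block): start from the mean
    of the path vectors, then repeatedly re-weight the paths by their
    similarity to the current fused state. No parameters are learned here."""
    dim = len(features[0])
    n = len(features)
    state = [sum(f[d] for f in features) / n for d in range(dim)]
    for _ in range(steps):
        w = softmax([dot(state, f) / math.sqrt(dim) for f in features])
        state = [sum(w[j] * features[j][d] for j in range(n))
                 for d in range(dim)]
    return state

# Toy feature vectors standing in for the outputs of three paths (made up).
paths = [
    [1.0, 0.0, 0.5, 0.2],
    [0.9, 0.1, 0.4, 0.3],
    [0.0, 1.0, 0.2, 0.8],
]
attended, weights = cross_path_attention(paths)
fused = iterative_fusion(attended)
```

In this sketch each row of `weights` sums to 1 and shows how strongly one path draws on the others, which loosely mirrors how cross-modal attention weights can serve as an explanation signal.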

Takeaways, Limitations

Takeaways:
A multi-path design with NAF-based fusion achieves 99.75% test accuracy on medical image classification.
Path-specific attention mechanisms and Δ-value maps give multifaceted insight into the model's decision-making process, which improves interpretability.
The approach shows potential to contribute to reliable AI-based medical diagnostic systems.
Limitations:
The nine-class, multi-institutional medical image dataset used in the experiments may not fully reflect the diversity of real-world clinical settings, so the model's generalizability needs further validation.
The complexity of the NAF block may increase computational cost, which could limit its use in real-time medical diagnostic systems.
Comparative analysis against other medical image classification models is lacking, so more extensive performance validation is needed.