Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.

CANDLE: A Cross-Modal Agentic Knowledge Distillation Framework for Interpretable Sarcopenia Diagnosis

Created by
  • Haebom

Author

Yuqi Jin, Zhenhao Shuai, Zihan Hu, Weiteng Zhang, Weihao Xie, Jianwei Shuai, Xian Shen, Zhen Feng

Outline

This paper presents a method for bringing the strong generalization and transferability of large language models (LLMs) to data-poor, heterogeneous medical domains, using sarcopenia diagnosis as the target task. It aims to resolve the trade-off between interpretability and predictive performance by combining the robust accuracy and feature-level explainability of traditional machine learning (TML) models with the semantic expressiveness of LLMs. Concretely, the proposed CANDLE framework feeds SHAP values from an XGBoost model into an LLM, which produces calibrated evidence and refined decision rules through a reinforcement-learning-based reasoning process; these outputs are then consolidated into a knowledge repository that supports case-based inference via retrieval-augmented generation (RAG).
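The first hand-off in this pipeline, turning per-patient SHAP attributions into a structured LLM prompt, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names, SHAP values, and prompt wording are hypothetical placeholders.

```python
# Sketch: format per-patient SHAP attributions from an XGBoost-style
# classifier into a prompt for an LLM. All names and numbers below
# are illustrative, not figures from the paper.

def shap_to_prompt(feature_names, shap_values, prediction):
    """Format feature-level attributions as an LLM prompt string."""
    # Rank features by absolute contribution, largest impact first.
    ranked = sorted(zip(feature_names, shap_values),
                    key=lambda fv: abs(fv[1]), reverse=True)
    lines = [f"Model prediction: {prediction}",
             "Feature contributions (SHAP, descending impact):"]
    for name, value in ranked:
        direction = "raises" if value > 0 else "lowers"
        lines.append(f"- {name}: {value:+.3f} ({direction} predicted risk)")
    lines.append("Explain the prediction and propose a decision rule.")
    return "\n".join(lines)

# Hypothetical example case.
prompt = shap_to_prompt(
    ["grip_strength", "gait_speed", "BMI"],
    [-0.42, -0.18, 0.05],
    "sarcopenia: positive",
)
print(prompt)
```

In practice the SHAP values would come from an explainer run on the trained XGBoost model; sorting by absolute magnitude keeps the LLM's attention on the features that actually drove the prediction.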

Takeaways, Limitations

Takeaways:
Combining the LLM with a SHAP-based explainable machine learning model overcomes the interpretability limits of the LLM alone, increasing its usability in the medical field.
Using sarcopenia diagnosis as a case study, the paper demonstrates that LLMs are applicable in data-poor, heterogeneous medical data environments.
Reinforcement learning is used to improve the LLM's reasoning process and generate reliable decision rules.
A novel approach integrates knowledge from existing TML models into an LLM-based system, enabling knowledge capitalization.
Case-based reasoning via RAG shows the feasibility of building a system applicable to real-world clinical settings.
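The RAG-based case retrieval can be sketched as a nearest-neighbor lookup over the knowledge repository. This is a toy stand-in under stated assumptions: a real system would embed the distilled textual evidence, whereas here plain cosine similarity over hypothetical normalized feature vectors plays that role, and all stored rules are invented for illustration.

```python
# Sketch: retrieve the most similar stored cases (and their distilled
# decision rules) for a new patient. Vectors and rules are toy
# placeholders, not data from the paper.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_vec, knowledge_base, k=2):
    """Return the k stored cases most similar to the query."""
    scored = sorted(knowledge_base,
                    key=lambda case: cosine(query_vec, case["vector"]),
                    reverse=True)
    return scored[:k]

# Toy knowledge repository built from previously distilled cases.
kb = [
    {"vector": [0.9, 0.2, 0.1], "rule": "low grip strength -> high risk"},
    {"vector": [0.1, 0.8, 0.3], "rule": "slow gait speed -> moderate risk"},
    {"vector": [0.2, 0.1, 0.9], "rule": "normal measures -> low risk"},
]
matches = retrieve([0.85, 0.25, 0.15], kb, k=1)
print(matches[0]["rule"])
```

The retrieved rules would then be injected into the LLM's context so that a new diagnosis is grounded in similar past cases rather than generated from scratch.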
Limitations:
Because the paper focuses on sarcopenia diagnosis, further research is needed to establish generalizability to other medical fields.
The reinforcement learning process can be complex and computationally expensive.
Information may be lost when converting SHAP values into LLM inputs.
Bias in the LLM's training data may affect the results.
Large-scale medical datasets are required, and verifying data quality is crucial.