Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Records

Created by
  • Haebom

Authors

Mosbah Aouad, Anirudh Choudhary, Awais Farooq, Steven Nevers, Lusine Demirkhanyan, Bhrandon Harris, Suguna Pappu, Christopher Gondi, Ravishankar Iyer

Outline

This paper presents a novel multimodal approach for the early detection of pancreatic ductal adenocarcinoma (PDAC). The method aims to detect PDAC up to one year before clinical diagnosis by integrating longitudinal diagnosis code histories from electronic health records (EHRs) with regularly collected laboratory measurements. It combines three components: neural controlled differential equations to model the non-stationary laboratory time series, pretrained language models and recurrent neural networks to learn representations of diagnosis code trajectories, and a cross-attention mechanism to capture interactions between the two modalities. The approach is developed and evaluated on a real-world dataset of approximately 4,700 patients, where it improves AUC by 6.5% to 15.5% over state-of-the-art methods. The authors also identify a panel of diagnosis codes and laboratory tests associated with a high risk of PDAC, including both established and novel biomarkers. The source code is available on GitHub: https://github.com/MosbahAouad/EarlyPDAC-MML
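To make the cross-attention step more concrete, here is a minimal sketch of scaled dot-product cross-attention between two modalities, written in plain Python. It is not the authors' implementation (see their repository for that): the function names, the toy vectors, and the choice to let laboratory states act as queries over diagnosis code states are all assumptions made for illustration, and the learned projection matrices a real model would use are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: one vector per step of modality A (assumed here: lab states)
    keys, values: one vector per step of modality B (assumed here: code states)
    Returns (outputs, weights), with one output vector and one weight row
    per query.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy embeddings: 2 lab time steps, 3 diagnosis code steps.
lab_states = [[1.0, 0.0], [0.0, 1.0]]
code_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused, attn = cross_attention(lab_states, code_states, code_states)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of `attn` sums to 1 and shows how strongly one laboratory step attends to each diagnosis code step; `fused` holds the corresponding weighted mixtures of code representations.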

Takeaways, Limitations

Takeaways:
Demonstrates the potential for early detection of pancreatic cancer using electronic health record data.
Achieves a higher AUC than existing methods (gains of 6.5% to 15.5%).
Suggests the possibility of discovering new risk biomarkers for pancreatic cancer.
Demonstrates the utility of a multimodal approach.
Limitations:
The characteristics of the dataset used in the study (size, composition, etc.) are not described in detail.
The generalization performance of the developed model needs further verification.
Further research is needed for application in real clinical settings.
Further research is needed to validate new biomarkers.