Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models

Created by
  • Haebom

Authors

Junmo Kim, Namkyeong Lee, Jiwon Kim, Kwangsoo Kim

Outline

This paper proposes MedRep, a novel medical concept representation based on the OMOP Common Data Model (CDM). Although electronic health record (EHR)-based models have improved performance, they struggle with unregistered medical codes (codes outside the model's vocabulary), which limits how well they generalize and makes it hard to integrate models trained on different vocabularies. MedRep enriches each concept with a minimal definition generated through large language model (LLM) prompts and supplements these text-based representations using the graph ontology of the OMOP vocabulary. In the reported experiments, MedRep outperforms existing EHR-based models and models using existing medical code tokenizers across a variety of prediction tasks, and external validation supports its generalizability.
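The general idea in the outline above (build a concept's representation from its text definition, then refine it with its ontology neighbours) can be illustrated with a toy sketch. This is not the authors' implementation: the hashing-based `text_embedding` is a stand-in for a real text encoder, the weighted average in `concept_representation` is a stand-in for whatever graph-based learning the paper uses, and the concept identifiers, definitions, and the `alpha` parameter are all hypothetical. The point it shows is only that any concept with a definition gets a vector, even if a fixed code vocabulary has never seen it.

```python
import hashlib
import math
from typing import Dict, List

DIM = 8  # toy embedding size (assumption, chosen for readability)


def text_embedding(text: str, dim: int = DIM) -> List[float]:
    """Toy stand-in for a text encoder: hash each word into one of `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    # L2-normalise; fall back to 1.0 so empty text does not divide by zero.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def concept_representation(
    concept_id: str,
    definitions: Dict[str, str],
    parents: Dict[str, List[str]],
    alpha: float = 0.5,
) -> List[float]:
    """Blend a concept's own text vector with the mean vector of its ontology parents."""
    own = text_embedding(definitions[concept_id])
    neighbours = [
        text_embedding(definitions[p])
        for p in parents.get(concept_id, [])
        if p in definitions
    ]
    if not neighbours:
        return own  # no ontology context available: use the text vector alone
    mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
    return [alpha * o + (1 - alpha) * m for o, m in zip(own, mean)]


# Hypothetical concepts: placeholder identifiers, not real OMOP concept IDs.
definitions = {
    "concept_a": "persistently elevated arterial blood pressure",
    "concept_b": "disorder of the circulatory system",
}
parents = {"concept_a": ["concept_b"]}

vector = concept_representation("concept_a", definitions, parents)
print(len(vector))
```

Because the vector is derived from text and ontology links rather than looked up in a fixed token table, two models that use different code vocabularies could in principle share the same concept space, which is the integration problem the summary describes.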

Takeaways, Limitations

Takeaways:
MedRep, a new medical concept representation based on the OMOP CDM, effectively addresses the problem of handling unregistered medical codes in EHR-based models.
It shows improved performance over existing models in various prediction tasks.
External validation confirms the generalizability of MedRep.
Integrating LLMs with the OMOP CDM opens new possibilities for medical data representation.
Limitations:
Because the approach depends on the OMOP CDM, it may have limited applicability to EHR data that does not use the OMOP CDM.
Further research is needed on optimizing the LLM prompts and on making better use of the OMOP graph ontology.
Further research is needed to determine generalizability across different medical domains and languages.