Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

EMR-AGENT: Automating Cohort and Feature Extraction from EMR Databases

Created by
  • Haebom

Authors

Kwanhyung Lee, Sungsoo Hong, Joonhyung Park, Jeonghyeop Lim, Juhwan Choi, Donghwee Yoon, Eunho Yang

Outline

EMR-AGENT is an agent-based framework for extracting structured clinical data from electronic medical record (EMR) databases. It automates cohort selection, feature extraction, and code mapping, replacing manually written rules with dynamic, language-model-driven interaction with the database. Rather than relying on a hard-coded schema, EMR-AGENT queries the database to infer its schema and documentation, and it uses SQL not only for data retrieval but also as a tool for observing the database and making decisions. Performance and generalization are evaluated through benchmarks on three EMR databases: MIMIC-III, eICU, and SICdb. The code is publicly released.
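The idea of using SQL for observation before retrieval can be illustrated with a minimal sketch. This is not the EMR-AGENT implementation: the in-memory SQLite database, the table and column names, and the helpers `observe_schema` and `find_candidate_columns` are all hypothetical, and a simple keyword match stands in for the language-model reasoning step.

```python
import sqlite3


def observe_schema(conn):
    """Return {table: [column names]} by querying the database itself."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in cols]
    return schema


def find_candidate_columns(schema, keyword):
    """Stand-in for the LLM decision step: match columns by keyword (assumption)."""
    return [
        (table, col)
        for table, cols in schema.items()
        for col in cols
        if keyword in col.lower()
    ]


def build_toy_db():
    """Create a tiny hypothetical EMR-like database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, age INTEGER);
        CREATE TABLE vitals (patient_id INTEGER, heart_rate REAL, charted_at TEXT);
        INSERT INTO patients VALUES (1, 70), (2, 45);
        INSERT INTO vitals VALUES (1, 88.0, '2020-01-01'), (2, 61.0, '2020-01-02');
        """
    )
    return conn


conn = build_toy_db()

# Step 1: observe. Nothing about the schema is hard-coded below this point.
schema = observe_schema(conn)

# Step 2: decide which table and column hold the requested feature.
candidates = find_candidate_columns(schema, "heart")
table, column = candidates[0]

# Step 3: retrieve, using names discovered in the observation step.
rows = conn.execute(
    f'SELECT patient_id, "{column}" FROM "{table}" ORDER BY patient_id'
).fetchall()
print(schema)
print(rows)
```

The point of the sketch is the ordering: the retrieval query is built only from what the earlier observation queries returned, which is what would let the same procedure run against databases with different schemas.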

Takeaways, Limitations

Overcomes the limitations of manually built data pipelines by automating EMR data extraction.
Applies to multiple EMR databases, improving generalizability across institutions.
Reduces schema dependency by using SQL for database observation and decision-making in addition to data retrieval.
Demonstrates performance with benchmark results on three EMR databases, and increases accessibility by releasing code and demos.
Evaluation results are currently presented for only three databases; validation on a more diverse set of databases is needed.
Results may vary with the performance of the underlying language model, and the model's biases must be taken into account.