Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MKE-Coder: Multi-Axial Knowledge with Evidence Verification in ICD Coding for Chinese EMRs

Created by
  • Haebom

Authors

Xinxin You, Xien Liu, Xue Yang, Ziyi Wang, Ji Wu

Outline

This paper presents MKE-Coder, a novel framework for automatic International Classification of Diseases (ICD) coding of Chinese electronic medical records (EMRs). Because Chinese EMRs use concise descriptions and have a distinctive internal structure, extracting information relevant to disease codes is difficult. Existing methods neither exploit disease-based multi-axis knowledge nor tie their predictions to clinical evidence. To address this, MKE-Coder first identifies candidate codes for each diagnosis and categorizes the knowledge associated with each candidate under four coding axes. It then retrieves the corresponding clinical evidence from the full content of the EMR and filters for reliable evidence with a scoring model. Finally, an inference module based on a masked language modeling strategy verifies each candidate code by checking whether all of its axis knowledge is supported by evidence, and makes recommendations accordingly. Experiments on a large Chinese EMR dataset collected from various hospitals show that MKE-Coder performs well on automatic ICD coding of Chinese EMRs. Practical evaluations in simulated real-world coding scenarios show that it substantially improves coders' accuracy and speed.
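The three stages described above (candidate axis knowledge, evidence retrieval and filtering, verification) can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the axis names, the `Candidate` class, the word-overlap scorer, the 0.5 threshold, and the English sample text are all hypothetical, and the learned scoring model and masked-language-modeling inference module are replaced here by simple word overlap and an `all()` check.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """A candidate code with one knowledge phrase per coding axis (hypothetical structure)."""
    code: str
    axis_knowledge: Dict[str, str]  # axis name -> knowledge phrase


def tokens(text: str) -> set:
    """Lowercased word set; a stand-in for real text processing."""
    return set(re.findall(r"\w+", text.lower()))


def score(knowledge: str, sentence: str) -> float:
    """Toy evidence score: fraction of knowledge words found in the sentence.
    The paper uses a learned scoring model; this overlap measure is an assumption."""
    k = tokens(knowledge)
    return len(k & tokens(sentence)) / len(k) if k else 0.0


def reliable_evidence(knowledge: str, emr: List[str], threshold: float) -> List[str]:
    """Retrieve EMR sentences and keep only those scoring at or above the threshold."""
    return [s for s in emr if score(knowledge, s) >= threshold]


def verify(candidate: Candidate, emr: List[str],
           threshold: float = 0.5) -> Tuple[bool, Dict[str, bool]]:
    """Recommend a code only if every axis has at least one reliable evidence sentence.
    This all() check stands in for the masked-language-modeling inference module."""
    support = {
        axis: bool(reliable_evidence(knowledge, emr, threshold))
        for axis, knowledge in candidate.axis_knowledge.items()
    }
    return all(support.values()), support


# Hypothetical EMR sentences and candidates, for illustration only.
emr = [
    "Chest CT shows infection in the right lower lobe.",
    "Sputum culture is positive for Streptococcus pneumoniae.",
    "Patient reports cough and fever for three days.",
]
shared = {
    "anatomical site": "lower lobe",
    "pathology": "infection",
    "clinical manifestation": "cough fever",
}
candidate_a = Candidate("CODE-A", {"etiology": "streptococcus pneumoniae", **shared})
candidate_b = Candidate("CODE-B", {"etiology": "staphylococcus aureus", **shared})

for cand in (candidate_a, candidate_b):
    recommended, support = verify(cand, emr)
    print(cand.code, recommended, support)
```

In this toy run, `CODE-A` is recommended because each of its four axes finds supporting text, while `CODE-B` is rejected because nothing in the record supports its etiology axis. Requiring support on every axis is what ties a recommendation to explicit evidence rather than to an overall similarity score.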

Takeaways, Limitations

Takeaways:
Offers an effective solution for automatic ICD coding of Chinese EMRs.
Proposes a new framework that leverages disease-based multi-axis knowledge and clinical evidence.
Validates candidate codes, and so improves reliability, through an inference module based on masked language modeling.
Helps improve coders' accuracy and speed.
Limitations:
Generalizability may be limited by the specificity of the Chinese EMR dataset used.
Performance is not evaluated across different medical fields or disease types.
The scoring model and the masked language modeling strategy are not described in detail.
Additional validation is needed before practical application in medical settings.