Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Generating Multi-Table Time Series EHR from Latent Space with Minimal Preprocessing

Created by
  • Haebom

Authors

Eunbyeol Cho, Jiyoun Kim, Minjae Lee, Sungjin Park, Edward Choi

Outline

RawMed is presented as the first framework for synthesizing multi-table time-series EHR data that closely resembles raw EHRs, in contrast to earlier approaches that cover only a few expert-selected key signs or structured codes. It uses text-based representation and compression techniques to capture complex structure and temporal dynamics with minimal preprocessing. The authors also propose a new evaluation framework for multi-table time-series synthetic EHRs that assesses distributional similarity, inter-table relationships, temporal dynamics, and privacy. Validated on two open-source EHR datasets, RawMed outperforms baseline models in fidelity and usability. The code is available at https://github.com/eunbyeol-cho/RawMed
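To make "text-based representation with minimal preprocessing" more concrete, here is a minimal sketch of one way rows from several tables could be turned into a single time-ordered text sequence. It is an illustration only, not RawMed's actual pipeline: the table names, column names, and the helpers `serialize_event` and `serialize_patient` are hypothetical, and the compression step mentioned in the outline is not covered.

```python
from typing import Dict, List, Optional, Tuple

# One row of a table: column name -> raw value (None marks a missing value).
Row = Dict[str, Optional[object]]


def serialize_event(table: str, row: Row, time_col: str = "time") -> str:
    """Render one row as plain text: the table name, then 'column value' pairs.

    Values are kept as-is (no binning, no code mapping), which is the sense
    of "minimal preprocessing" assumed in this sketch.
    """
    parts = [table]
    for col, val in row.items():
        if col == time_col or val is None:
            continue  # the timestamp is carried separately; missing values are skipped
        parts.append(f"{col} {val}")
    return " ".join(parts)


def serialize_patient(
    tables: Dict[str, List[Row]], time_col: str = "time"
) -> List[Tuple[float, str]]:
    """Merge rows from all tables into one list of (time, text) events sorted by time."""
    events: List[Tuple[float, str]] = []
    for name, rows in tables.items():
        for row in rows:
            events.append((float(row[time_col]), serialize_event(name, row, time_col)))
    # list.sort is stable, so events with equal timestamps keep their input order.
    events.sort(key=lambda event: event[0])
    return events


# Hypothetical toy data for a single patient; times are hours since admission.
tables = {
    "labevents": [
        {"time": 2.0, "item": "glucose", "value": 105, "unit": "mg/dL"},
    ],
    "prescriptions": [
        {"time": 0.5, "drug": "heparin", "dose": 5000, "unit": None},
    ],
}

timeline = serialize_patient(tables)
for time, text in timeline:
    print(f"{time:>5.1f}h  {text}")
```

Because every table is reduced to the same text form, rows with different schemas can share one timeline, which is the kind of cross-table, time-ordered structure a generative model would then need to learn.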

Takeaways, Limitations

Takeaways:
  • Presents a novel framework for generating multi-table time-series synthetic EHR data that resembles raw EHR data.
  • Uses text-based representation and compression techniques to synthesize data efficiently while capturing complex structure and temporal dynamics.
  • Proposes a novel evaluation framework for multi-table time-series synthetic EHR data, covering distributional similarity, inter-table relationships, temporal dynamics, and privacy.
  • Improves fidelity and usability over baseline models.
  • Releases the code as open source.
Limitations:
  • The paper does not explicitly discuss its limitations; additional experiments and verification are needed to clarify them.
  • The results come from validation on two open-source EHR datasets, so further research is needed to determine whether they generalize to other datasets.