Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

DischargeSim: A Simulation Benchmark for Educational Doctor-Patient Communication at Discharge

Created by
  • Haebom

Authors

Zonghai Yao, Michael Sun, Won Seok Jang, Sunjae Kwon, Soie Kwon, Hong Yu

Outline

We present DischargeSim, a new benchmark that evaluates the ability of large language models (LLMs) to serve as personalized discharge educators after patient visits. It simulates multi-turn post-visit conversations between LLM-based DoctorAgents and PatientAgents with diverse psychosocial profiles (e.g., health literacy, education, and emotional intelligence). Interactions are structured across six clinically relevant discharge topics and evaluated along three axes: conversational quality, through automated and LLM-as-judge assessments; personalized document generation, including free-text summaries and structured AHRQ checklists; and patient understanding, through downstream multiple-choice testing. Experimental results across 18 LLMs reveal significant variation in discharge education performance, both between models and across patient profiles. Notably, a larger model size does not always lead to better educational outcomes, highlighting the trade-off between strategy use and content prioritization. DischargeSim represents a first step toward benchmarking LLMs in post-visit clinical education and promoting equitable and personalized patient support.
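The simulation loop described above can be pictured with a minimal sketch. This is not the authors' implementation: the names `PatientProfile`, `simulate_discharge`, and `comprehension_score`, the stub agents, the placeholder topic labels, and the number of turns per topic are all assumptions made for illustration. In a real run the stub functions would be replaced by LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types: a turn is (speaker, text); an agent maps
# (topic, dialogue history) to its next utterance.
Turn = Tuple[str, str]
Agent = Callable[[str, List[Turn]], str]


@dataclass
class PatientProfile:
    # Dimensions named in the summary; the value scales are assumptions.
    health_literacy: str
    education: str
    emotional_intelligence: str


def simulate_discharge(doctor: Agent, patient: Agent,
                       topics: List[str], turns_per_topic: int = 2) -> List[Turn]:
    """Alternate doctor and patient turns for each discharge topic."""
    history: List[Turn] = []
    for topic in topics:
        for _ in range(turns_per_topic):
            history.append(("doctor", doctor(topic, history)))
            history.append(("patient", patient(topic, history)))
    return history


def comprehension_score(answers: List[str], answer_key: List[str]) -> float:
    """Fraction of multiple-choice questions the patient answered correctly."""
    correct = sum(a == k for a, k in zip(answers, answer_key))
    return correct / len(answer_key)


# Stub agents standing in for LLM-backed DoctorAgent and PatientAgent.
def stub_doctor(topic: str, history: List[Turn]) -> str:
    return f"Let me explain {topic}."


def make_stub_patient(profile: PatientProfile) -> Agent:
    def patient(topic: str, history: List[Turn]) -> str:
        return f"[{profile.health_literacy} literacy] I have a question about {topic}."
    return patient


# Placeholder labels: the summary does not list the six topics.
TOPICS = [f"topic_{i}" for i in range(1, 7)]
profile = PatientProfile("low", "high school", "moderate")
transcript = simulate_discharge(stub_doctor, make_stub_patient(profile), TOPICS)
```

The transcript would then feed the three evaluation axes: scoring conversational quality, generating the summary and checklist documents, and testing patient understanding with something like `comprehension_score`.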

Takeaways, Limitations

Takeaways:
Provides an initial benchmark for evaluating how well LLMs perform as discharge educators.
Shows that larger model size does not always lead to better educational outcomes, highlighting the trade-off between strategy use and content prioritization.
It highlights the importance of personalized discharge education that takes into account the patient's psychosocial factors.
Demonstrates the potential of LLMs to provide equitable and personalized patient support.
Limitations:
DischargeSim is still an early-stage benchmark, and there is room for further development and improvement.
Because the conversations are simulated, the results may not generalize to actual clinical settings.
The evaluation metrics may not fully capture the discharge education capabilities of LLMs.