Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications

Created by
  • Haebom

Authors

Jean-Philippe Corbeil, Asma Ben Abacha, George Michalopoulos, Phillip Swazinna, Miguel Del-Agua, Jerome Tremblay, Akila Jeeson Daniel, Cari Bader, Yu-Cheng Cho, Pooja Krishnan, Nathan Bodenstab, Thomas Lin, Wenxuan Teng, Francois Beaulieu, Paul Vozila

Outline

This paper notes that although large language models (LLMs) such as GPT-4o and o1 perform strongly on clinical natural language processing (NLP) tasks across several healthcare benchmarks, two important tasks remain understudied because of data scarcity and sensitivity: generating structured tabular reports from nurses' utterances and extracting medical prescriptions from doctor-patient consultations. The study investigates both tasks by evaluating open-weight and closed-weight LLMs on private and public clinical datasets and analyzing their respective strengths and limitations. It also proposes an agent pipeline that generates realistic, non-sensitive nurse utterances, enabling structured extraction of clinical observations. To support further research, the authors release SYNUR and SIMORD, the first publicly available datasets for nurse observation extraction and medical prescription extraction, respectively.

Takeaways, Limitations

Takeaways:
Shows that LLM-based systems can extract structured information from nurses' spoken utterances and doctor-patient consultations.
Points to the potential to reduce healthcare providers' workload and improve patient-centered care.
Introduces the first publicly available datasets for nurse observation extraction (SYNUR) and medical prescription extraction (SIMORD).
Suggests future research directions by analyzing the strengths and limitations of open and closed LLMs.
Limitations:
The scope of the study is limited by data scarcity and sensitivity issues.
Further validation of the generalization performance and robustness of the proposed agent pipeline is needed.
The SYNUR and SIMORD datasets need to grow in size and diversity.
The system's applicability and validity still need to be verified in real clinical environments.