Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Automating Adjudication of Cardiovascular Events Using Large Language Models

Created by
  • Haebom

Authors

Sonish Sivarajkumar, Kimia Ameri, Chuqin Li, Yanshan Wang, Min Jiang

Outline

This paper presents a framework for automating the adjudication of cardiovascular events in clinical trials using large language models (LLMs). Traditional manual adjudication is time-consuming, resource-intensive, and subject to inter-adjudicator variability. To address this, the authors develop a two-stage approach: first, an LLM-based pipeline extracts event information from unstructured clinical data; second, an LLM-based adjudication process, guided by the Tree of Thoughts approach and Clinical Endpoints Committee (CEC) guidelines, adjudicates the extracted events. On clinical trial data specific to cardiovascular events, the framework achieves an F1 score of 0.82 for event extraction and an accuracy of 0.68 for adjudication. The paper also introduces the CLEART score, a novel automated metric designed to assess the quality of AI-generated clinical inference in cardiovascular event adjudication. The results demonstrate the potential to substantially reduce adjudication time and cost while maintaining high-quality, consistent, and auditable outcomes in clinical trials. By reducing variability and improving standardization, the approach could allow risks associated with cardiovascular therapies to be identified and mitigated more quickly.
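For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1 score and an accuracy are typically computed. It is not the authors' evaluation code: the function names, the set-based matching of extracted events, and all of the toy data are assumptions made for illustration, and the paper may define a match differently (for example at the token or field level).

```python
from typing import Hashable, Iterable, Sequence


def f1_score(predicted: Iterable[Hashable], gold: Iterable[Hashable]) -> float:
    """F1 over sets of extracted items: harmonic mean of precision and recall."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of cases where the predicted label equals the reference label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy data, not taken from the paper.
# Stage 1: (patient id, event type) pairs extracted from clinical notes.
extracted = {
    ("pt-01", "myocardial infarction"),
    ("pt-01", "stroke"),
    ("pt-02", "heart failure"),
}
reference = {
    ("pt-01", "myocardial infarction"),
    ("pt-02", "heart failure"),
    ("pt-03", "stroke"),
}
extraction_f1 = f1_score(extracted, reference)

# Stage 2: adjudication decisions compared with committee decisions.
model_decisions = ["confirmed", "rejected", "confirmed", "confirmed"]
committee_decisions = ["confirmed", "rejected", "rejected", "confirmed"]
adjudication_accuracy = accuracy(model_decisions, committee_decisions)

print(f"extraction F1: {extraction_f1:.2f}")
print(f"adjudication accuracy: {adjudication_accuracy:.2f}")
```

In this toy case two of the three extracted events match the reference, and three of the four adjudication decisions agree with the committee. The paper's figures of 0.82 and 0.68 come from its own clinical trial data and cannot be reproduced from this sketch.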

Takeaways, Limitations

Takeaways:
LLM-based automation could save time and cost in cardiovascular event adjudication.
It could reduce inter-adjudicator variability and improve the consistency of results.
The CLEART score opens up the possibility of automatically assessing the quality of AI-generated clinical inference.
The approach could enable faster identification and mitigation of risks associated with cardiovascular therapies.
Limitations:
Event extraction (F1 0.82) and adjudication (accuracy 0.68) both leave room for error, so further accuracy improvements are needed.
Further research is needed to establish the generalizability and validity of the CLEART score.
The explainability and transparency of LLM-based systems still need to be ensured.
Potential bias in the clinical data, and how well that data generalizes, should be considered.