Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Benchmarking GPT-5 in Radiation Oncology: Measurable Gains, but Persistent Need for Expert Oversight

Created by
  • Haebom

Authors

Ugur Dinc, Jibak Sarkar, Philipp Schubert, Sabine Semrau, Thomas Weissmann, Andre Karius, Johann Brand, Bernd-Niklas Axer, Ahmed Gomaa, Pluvio Stephan, Ishita Sheth, Sogand Beirami, Annette Schwarz, Udo Gaipl, Benjamin Frey, Christoph Bert, Stefanie Corradini, Rainer Fietkau, Florian Putz

Outline

This paper evaluates the potential of GPT-5 in radiation oncology on two benchmarks: the ACR Radiation Oncology In-Training Examination (TXIT, 2021) and 60 real-world clinical vignettes. On the TXIT, GPT-5 achieved an accuracy of 92.8%, outperforming GPT-4 (78.8%) and GPT-3.5 (62.1%). In the vignette evaluation, GPT-5 scored highly for accuracy (mean 3.24/4) and comprehensiveness (mean 3.59/4), but errors were observed in complex cases. The authors conclude that GPT-5 holds promise in radiation oncology, but rigorous expert oversight is required before clinical application.

Takeaways, Limitations

Takeaways:
GPT-5 outperformed earlier GPT models on the TXIT benchmark (92.8% versus 78.8% for GPT-4 and 62.1% for GPT-3.5).
Its treatment recommendations for real-world clinical vignettes were rated highly for accuracy (mean 3.24/4) and comprehensiveness (mean 3.59/4).
The results suggest potential as an education and decision-support tool in radiation oncology.
Limitations:
Errors were observed in complex clinical situations.
Rigorous expert review is essential before clinical application.
Inter-rater reliability was low (Fleiss' κ 0.083), so the vignette scores may partly reflect subjective judgment.
The possibility of hallucinations in GPT-5's output has not been completely ruled out.
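
For readers unfamiliar with the Fleiss' κ figure cited above, the following is a minimal sketch of how the statistic is computed from a table of rating counts. The function name `fleiss_kappa`, the four-rater setup, and the example counts are illustrative assumptions; they are not the paper's data or code.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table.

    counts[i][j] is the number of raters who put subject i in category j.
    Every subject must be rated by the same number of raters (at least 2).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per subject")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must have the same number of ratings")

    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters

    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / total_ratings
           for j in range(n_categories)]

    # Observed agreement per subject: fraction of rater pairs that agree.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 5 vignettes, 4 raters, scores 1 to 4 as categories.
    example = [
        [0, 0, 1, 3],
        [0, 1, 2, 1],
        [0, 0, 2, 2],
        [0, 0, 0, 4],
        [0, 1, 1, 2],
    ]
    print(round(fleiss_kappa(example), 3))
```

Because κ corrects observed agreement for chance agreement, it can come out low even when most raters give high scores: if ratings concentrate in one or two categories, chance agreement is already high and little room remains above it.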