Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Fine-tuning foundational models to code diagnoses from veterinary health records

Created by
  • Haebom

Authors

Mayla R. Boguslav, Adam Kiehl, David Kott, G. Joseph Strecker, Tracy Webb, Nadia Saklou, Terri Ward, Michael Kirby

Outline

This paper highlights clinical coding with standardized medical terminology as a way to address the interoperability challenges of veterinary medical records, a large-scale data resource for veterinary clinical research. The earlier DeepTag and VetTag studies attempted to automate veterinary diagnosis coding with LSTM and Transformer models. In contrast, this study covers all 7,739 SNOMED-CT diagnosis codes recognized by the Colorado State University Veterinary Teaching Hospital (CSU VTH) and fine-tunes 13 freely available pre-trained language models (LMs) on 246,473 manually coded veterinary patient visit records from CSU VTH's electronic health record (EHR). The results outperform those of the previous studies, with the most accurate results obtained by fine-tuning a relatively large clinical LM on extensive labeled data. However, the study also shows that comparable results can be achieved with limited resources and with non-clinical LMs. By investigating accessible methods for automatic coding, these findings contribute to improving the quality of veterinary EHRs and to building an integrated, comprehensive health database that spans species and institutions and supports both animal and human health research.
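To make the task setup concrete, the following is a minimal sketch of how coded visit records might be turned into training targets and scored. It assumes a multi-label framing (one visit can carry several diagnosis codes) and a micro-averaged F1 metric; neither is confirmed by the summary above. The function names and the placeholder codes such as `CODE_A` are hypothetical and are not real SNOMED-CT identifiers or the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple


def build_code_index(records: Sequence[Tuple[str, List[str]]]) -> Dict[str, int]:
    """Map every diagnosis code seen in the records to a column position."""
    codes = sorted({code for _, visit_codes in records for code in visit_codes})
    return {code: position for position, code in enumerate(codes)}


def to_multi_hot(visit_codes: Sequence[str], index: Dict[str, int]) -> List[int]:
    """Encode one visit's codes as a 0/1 vector over the full code set."""
    vector = [0] * len(index)
    for code in visit_codes:
        vector[index[code]] = 1
    return vector


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool true/false positives over all visits and codes."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical records: (free-text visit note, manually assigned codes).
records = [
    ("vomiting and lethargy", ["CODE_A", "CODE_B"]),
    ("limping on left forelimb", ["CODE_C"]),
]
index = build_code_index(records)
targets = [to_multi_hot(codes, index) for _, codes in records]

# Hypothetical model output that misses CODE_B on the first visit.
predictions = [[1, 0, 0], [0, 0, 1]]
score = micro_f1(targets, predictions)
```

In a setting like the one described, the index would span the full set of 7,739 codes, and the predictions would come from a fine-tuned LM rather than being written by hand.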

Takeaways, Limitations

Takeaways:
Leveraging pre-trained language models can improve the accuracy of automated veterinary diagnostic coding.
While the best performance comes from large datasets and large clinical LMs, comparable performance can be achieved with limited resources and non-clinical LMs.
The work contributes to improving the quality of veterinary EHRs and to building an integrated cross-species, cross-institutional health database.
It also helps secure a database usable for both animal and human health research.
Limitations:
This study was limited to data from CSU VTH; further research is needed to determine its applicability and generalizability to other veterinary institutions and species.
A more in-depth analysis of how performance varies with the type and size of the LM used is needed.
Further research is needed on deploying and maintaining the approach in real clinical settings.