Daily Arxiv

This page curates AI-related papers published around the world.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Learning to Diagnose Privately: DP-Powered LLMs for Radiology Report Classification

Created by
  • Haebom

Authors

Payel Bhattacharjee, Fengwei Tian, Geoffrey D. Rubin, Joseph Y. Lo, Nirav Merchant, Heidi Hanson, John Gounley, Ravi Tandon

Outline

This study proposes a framework for fine-tuning large language models (LLMs) with differential privacy (DP) to perform multi-abnormality classification on radiology report text. By injecting calibrated noise during fine-tuning, the framework mitigates the privacy risks associated with sensitive patient data and guards against data leakage while preserving classification performance. Using the MIMIC-CXR and CT-RATE datasets (50,232 reports collected between 2011 and 2019), the authors fine-tuned three model architectures (BERT-medium, BERT-small, and ALBERT-base) with differentially private low-rank adaptation (DP-LoRA). Model performance was evaluated under various privacy budgets (ε = 0.01, 0.1, 1.0, and 10.0) using weighted F1 scores, quantitatively analyzing the privacy-utility tradeoff.
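To make the setup concrete, below is a minimal sketch of how DP-LoRA fine-tuning can be wired up with Hugging Face transformers, peft, and opacus (DP-SGD). This is not the authors' released code: the checkpoint name, label count, hyperparameters, and `train_dataset` are illustrative assumptions.

```python
# Minimal DP-LoRA sketch. Checkpoint, label count, and hyperparameters are
# illustrative; `train_dataset` is a hypothetical dataset of tokenized reports
# whose labels are multi-hot float vectors.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model
from opacus import PrivacyEngine

model = AutoModelForSequenceClassification.from_pretrained(
    "prajjwal1/bert-medium",                    # illustrative BERT-medium checkpoint
    num_labels=14,                              # e.g. 14 CheXpert-style finding labels
    problem_type="multi_label_classification",  # BCE-with-logits loss
)
# Freeze the base model and train only low-rank adapters, which keeps the set
# of clipped-and-noised parameters small.
model = get_peft_model(
    model,
    LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16,
               target_modules=["query", "value"]),
)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_loader = DataLoader(train_dataset, batch_size=32)

# Attach DP-SGD: per-sample gradient clipping plus calibrated Gaussian noise,
# with the noise multiplier solved for a target (epsilon, delta).
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=1.0,    # one of the budgets studied: 0.01, 0.1, 1.0, 10.0
    target_delta=1e-5,
    epochs=3,
    max_grad_norm=1.0,     # per-sample clipping bound
)

model.train()
for epoch in range(3):
    for batch in train_loader:
        optimizer.zero_grad()
        loss = model(**batch).loss
        loss.backward()    # per-sample gradients captured by opacus hooks
        optimizer.step()   # clip per-sample grads, add calibrated noise, update
```

Because only the LoRA adapters are trainable, the per-sample clipping and noise apply to a small parameter subset, which is the usual rationale for why DP-LoRA retains utility even at tight privacy budgets.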

Takeaways, Limitations

Takeaways:
Differentially private fine-tuning with LoRA addresses key challenges in fine-tuning LLMs on sensitive medical data, enabling effective, privacy-preserving multi-abnormality classification of radiology reports.
Under reasonable privacy guarantees, the DP fine-tuned models achieved weighted F1 scores comparable to the non-private LoRA baseline on MIMIC-CXR (0.88 vs. 0.90) and CT-RATE (0.59 vs. 0.78); weighted F1 is illustrated in the sketch after this list.
The privacy-utility trade-off was verified experimentally across model architectures and privacy levels.
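For reference, the weighted F1 metric used above is the per-label F1 averaged with support (label-frequency) weights. A minimal scikit-learn sketch with made-up predictions:

```python
# Weighted F1 for multi-label outputs: per-label F1, averaged with
# label-support weights. Arrays below are made up for illustration.
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])  # multi-hot ground truth
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 1]])  # thresholded predictions

print(f1_score(y_true, y_pred, average="weighted"))   # 0.8 for these toy arrays
```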
Limitations:
The study is limited to specific datasets (MIMIC-CXR, CT-RATE) and model architectures (BERT-medium, BERT-small, ALBERT-base), so further research on generalizability is needed.
The privacy-utility tradeoff may vary across datasets and models, and further work is needed to determine the optimal privacy level.
Applicability to more diverse medical datasets and clinical scenarios remains to be verified.