Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation

Created by
  • Haebom

Authors

Hongyi Pan, Ziliang Hong, Gorkem Durak, Ziyue Xu, Ulas Bagci

Overview

This paper proposes a generative AI-based data augmentation framework that integrates synthetic image sharing into federated learning (FL) for ultrasound-based breast cancer diagnosis. It targets two problems that limit the effectiveness of FL: limited data availability and data that are not independent and identically distributed (non-IID) across clients. The authors train two class-specific Deep Convolutional Generative Adversarial Networks (DCGANs) to generate synthetic images, and simulate federated learning environments with the FedAvg and FedProx algorithms on three public datasets: BUSI, BUS-BRA, and UDIAT. Adding an appropriate number of synthetic images improves the average AUC from 0.9206 to 0.9237 for FedAvg and from 0.9429 to 0.9538 for FedProx, whereas excessive use of synthetic data degrades performance. These results indicate that generative AI-based data augmentation can improve FL results on breast ultrasound image classification tasks.
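For readers unfamiliar with the two FL algorithms named above, the following is a minimal sketch of their core ideas, not the authors' implementation. FedAvg aggregates client models by a sample-count-weighted average, and FedProx adds a proximal term (mu/2) * ||w - w_global||^2 to each client's local objective. The function names, the flat-list representation of model parameters, and the toy client sizes are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]  # hypothetical flat representation of model parameters


def fedavg_aggregate(client_weights: Sequence[Vector],
                     client_sizes: Sequence[int]) -> Vector:
    """Average client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_gradient(local_grad: Vector, local_w: Vector,
                     global_w: Vector, mu: float) -> Vector:
    """Add the gradient of the proximal term (mu/2) * ||w - w_global||^2."""
    return [g + mu * (w - wg)
            for g, w, wg in zip(local_grad, local_w, global_w)]


# Toy example: three clients with unequal (made-up) dataset sizes.
global_w = fedavg_aggregate(
    client_weights=[[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]],
    client_sizes=[100, 100, 200],
)
print(global_w)  # [3.5, 2.5]

# During local training, a FedProx client would use this corrected gradient,
# which pulls its weights back toward the current global model.
prox_grad = fedprox_gradient([0.5, -0.5], [2.0, 1.0], [1.0, 1.0], mu=0.5)
```

The proximal term is what distinguishes FedProx from FedAvg: with mu set to 0, the corrected gradient reduces to the plain local gradient.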

Takeaways, Limitations

Takeaways:
Experimentally demonstrates that generative AI-based data augmentation can improve the performance of federated learning (FL).
Improves the average AUC in breast ultrasound image classification for breast cancer diagnosis, from 0.9206 to 0.9237 with FedAvg and from 0.9429 to 0.9538 with FedProx.
Shows that the proportion of synthetic data must be kept appropriate, since excessive synthetic data degrades performance.
Limitations:
The experiments use only public datasets, so results may differ from those in real clinical environments.
Further validation of the generalization performance of the proposed method is needed.
A more systematic method is needed to determine the optimal synthetic data ratio.
Further experiments with FL algorithms other than FedAvg and FedProx are needed.
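
The synthetic data ratio discussed in the points above can be illustrated with a small sketch. It mixes a chosen proportion of synthetic samples into a client's real data and computes AUC, the metric reported in the summary. This is an assumption-laden illustration, not the paper's procedure: the helper names, the definition of the ratio as synthetic count divided by real count, the 0/1 label encoding, and the toy data are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[str, int]  # (image identifier, class label 0 or 1); hypothetical


def augment_with_synthetic(real: Sequence[Sample],
                           synthetic: Sequence[Sample],
                           ratio: float,
                           seed: int = 0) -> List[Sample]:
    """Return the real samples plus about ratio * len(real) synthetic ones."""
    n_extra = min(len(synthetic), round(ratio * len(real)))
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return list(real) + rng.sample(list(synthetic), n_extra)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: four real samples and three synthetic ones, ratio 0.5.
real = [("r0", 0), ("r1", 1), ("r2", 0), ("r3", 1)]
synthetic = [("s0", 0), ("s1", 1), ("s2", 0)]
mixed = augment_with_synthetic(real, synthetic, ratio=0.5)
print(len(mixed))  # 6

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

One way to study the ratio would be to repeat training for several ratio values and compare validation AUC, which is consistent with the observation that too much synthetic data hurts performance.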