Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Fairness in Dysarthric Speech Synthesis: Understanding Intrinsic Bias in Dysarthric Speech Cloning using F5-TTS

Created by
  • Haebom

Authors

M Anuprabha, Krishna Gurugubelli, Anil Kumar Vuppala

Outline

This paper addresses assistive technologies for dysarthric speech, whose development is hampered by limited data. We note that although neural speech synthesis has advanced recently, particularly through zero-shot voice cloning, these techniques can introduce biases when applied to dysarthric speech. Using the TORGO dataset, we investigate how well the state-of-the-art F5-TTS model clones dysarthric speech in terms of intelligibility, speaker similarity, and prosody preservation. We also assess disparities across dysarthria severity levels using two fairness metrics, Disparate Impact and Parity Difference. Our results suggest that F5-TTS exhibits a stronger bias toward intelligibility than toward speaker and prosody preservation when synthesizing dysarthric speech. These findings can help integrate fairness-aware synthesis for dysarthric speech, thereby fostering the development of more inclusive speech technologies.
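The two fairness metrics named above can be illustrated with a minimal sketch. It assumes the commonly used definitions: each group has a rate of "favorable" outcomes, Disparate Impact is the ratio of a group's rate to a reference group's rate, and Parity Difference is the difference between the two rates. This summary does not state how the paper defines a favorable outcome or which group serves as the reference, so the threshold, the group names, and the scores below are all hypothetical and purely illustrative.

```python
from typing import Sequence


def favorable_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of utterances whose score meets or exceeds the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(s >= threshold for s in scores) / len(scores)


def disparate_impact(group_rate: float, reference_rate: float) -> float:
    """Ratio of favorable rates; a value of 1.0 indicates parity."""
    if reference_rate == 0:
        raise ValueError("reference_rate must be non-zero")
    return group_rate / reference_rate


def parity_difference(group_rate: float, reference_rate: float) -> float:
    """Difference of favorable rates; a value of 0.0 indicates parity."""
    return group_rate - reference_rate


# Hypothetical per-utterance scores (e.g. a normalized similarity measure),
# grouped by dysarthria severity. These numbers are invented, not from the paper.
scores_by_severity = {
    "mild": [0.9, 0.8, 0.7, 0.6],
    "severe": [0.7, 0.4, 0.3, 0.2],
}
THRESHOLD = 0.5  # assumed cut-off for a "favorable" outcome

rates = {
    group: favorable_rate(scores, THRESHOLD)
    for group, scores in scores_by_severity.items()
}
di = disparate_impact(rates["severe"], rates["mild"])
pd = parity_difference(rates["severe"], rates["mild"])
print(f"Disparate Impact: {di:.2f}, Parity Difference: {pd:.2f}")
```

Under these definitions, a Disparate Impact well below 1 or a Parity Difference well below 0 would indicate that the synthesized speech for the more severe group reaches the favorable outcome less often than for the reference group.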

Takeaways, Limitations

Takeaways:
We demonstrate that state-of-the-art neural speech synthesis models, such as F5-TTS, can be applied to speech synthesis for dysarthria.
We emphasize the importance of balancing intelligibility, speaker similarity, and prosody preservation in speech synthesis for dysarthria.
We present a method to evaluate the bias of speech synthesis systems for speech disorders using fairness metrics.
The results point to the need for speech synthesis technology for speech disorders that takes fairness into account.
Limitations:
Relying on a single dataset (TORGO) may limit generalizability.
Fairness metrics beyond the two used in the analysis should also be considered.
Further research is needed to fully address the bias of the F5-TTS model.