Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds

Created by
  • Haebom

Authors

Qizhou Wang, Hanxun Huang, Guansong Pang, Sarah Erfani, Christopher Leckie

Outline

This paper presents AUDETER, a large-scale and diverse deepfake audio dataset for open-world deepfake audio detection. Existing detection methods degrade in real-world environments because their training data differs from the audio they encounter in practice. AUDETER addresses this gap with over 3 million audio clips (more than 4,500 hours) generated by 11 text-to-speech models and 10 vocoders. Experiments show that state-of-the-art methods trained on existing datasets struggle to generalize to new deepfake audio samples and exhibit high false positive rates. In contrast, methods trained on AUDETER achieve good detection performance and significantly reduce error rates.
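The false positive rate and error rate mentioned above can be illustrated with a minimal sketch. It is not the paper's evaluation code: the function names, the 0/1 label convention (0 = genuine, 1 = deepfake), the choice of "deepfake" as the positive class, and the fixed decision threshold are all assumptions made for illustration.

```python
def false_positive_rate(labels, scores, threshold):
    """Fraction of genuine clips (label 0) that the detector flags as fake.

    Assumption: a clip is flagged as fake when its score >= threshold.
    """
    real_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not real_scores:
        raise ValueError("need at least one genuine clip")
    flagged = sum(1 for s in real_scores if s >= threshold)
    return flagged / len(real_scores)


def error_rate(labels, scores, threshold):
    """Fraction of all clips whose predicted class differs from the label."""
    if not labels:
        raise ValueError("need at least one clip")
    wrong = sum(
        1 for y, s in zip(labels, scores) if (s >= threshold) != (y == 1)
    )
    return wrong / len(labels)


if __name__ == "__main__":
    # Hypothetical detector scores: higher means "more likely fake".
    labels = [0, 0, 0, 0, 1, 1]
    scores = [0.1, 0.4, 0.6, 0.9, 0.8, 0.3]
    print(false_positive_rate(labels, scores, threshold=0.5))
    print(error_rate(labels, scores, threshold=0.5))
```

A detector that generalizes poorly to unseen audio would show up here as many genuine clips scoring above the threshold, which raises the false positive rate even when the overall error rate looks moderate.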

Takeaways, Limitations

Takeaways:
  • Provides AUDETER, a large-scale and diverse deepfake audio dataset, to advance deepfake audio detection.
  • Experiments on AUDETER expose the limitations of existing deepfake detection methods and underline the need for detection models that generalize.
  • Training on AUDETER significantly improves deepfake detection performance, reaching an error rate of 4.17%.
Limitations:
  • Despite its diversity, AUDETER may not cover every type of deepfake audio found in the real world.
  • As new deepfake generation technologies emerge, AUDETER may become less representative over time.
  • Despite the dataset's size, some types of deepfake audio may be under- or over-represented.