Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion

Created by
  • Haebom

Authors

Markus Frohmann, Gabriel Meseguer-Brocal, Markus Schedl, Elena V. Epure

Outline

This paper presents a method for detecting AI-generated music, motivated by the copyright and industry-wide concerns raised by increasingly capable AI music generation tools. Existing detectors have known weaknesses: audio-based detectors generalize poorly and are vulnerable to noise, while lyrics-based detectors depend on accurate lyrics that are often unavailable. To address both, the paper proposes a multimodal, modular late-fusion pipeline that combines automatically transcribed sung lyrics with speech features that capture lyrics-related information in the audio. Because the method draws on the lyrical content of the audio directly, it is more robust and less sensitive to low-level artifacts, which makes it more practical to apply. In the reported experiments, the proposed method, DE-detect, outperforms existing lyrics-based detectors and is more robust to audio noise. The code is available on GitHub.
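
As a rough illustration of the fusion step in the pipeline described above, the sketch below encodes nothing itself: it assumes each view (transcribed lyrics and speech features) has already been turned into a fixed-size vector by its own model, and combines the two only at the end, which is what is usually called late fusion. This is not the authors' implementation. The function name `late_fusion_score`, the vector sizes, the weights, and the choice of a logistic classifier are all assumptions made for illustration.

```python
import math
from typing import Sequence


def late_fusion_score(
    text_embedding: Sequence[float],
    speech_embedding: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Return a score in (0, 1) from two independently computed views.

    Hypothetical sketch: the embeddings stand in for the outputs of a text
    encoder (on transcribed lyrics) and a speech encoder (on the audio).
    The weights and bias would normally be learned; here they are supplied.
    """
    # Late fusion: the views are combined only after each has been encoded.
    fused = list(text_embedding) + list(speech_embedding)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")

    logit = sum(w * x for w, x in zip(weights, fused)) + bias

    # Numerically stable logistic function.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)
```

Because the two views stay separate until the final step, either encoder could be swapped out without retraining the other, which is the main appeal of a modular design. A higher score would be read as "more likely AI-generated" only under whatever labeling convention the classifier was trained with.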

Takeaways, Limitations

Takeaways:
Presents a multimodal approach aimed at practical problems in AI-generated music detection.
Develops a detection model that is robust to audio noise and is reported to generalize well.
Reports experimental results that improve on existing lyrics-based detectors.
Releases the code publicly, which improves reproducibility and usability.
Limitations:
The reported performance comes from experiments on a specific dataset; generalization to other music genres and AI generative models may need further validation.
Detection performance may be affected by the accuracy of automatic lyric transcription.
As new AI music generation models emerge, continuous model updates and retraining may be required.