Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds

Posted by
  • Haebom

Authors

Jules Cauzinille, Marius Miron, Olivier Pietquin, Masato Hagiwara, Ricard Marxer, Arnaud Rey, Benoit Favre

Overview

This paper studies how well self-supervised speech models (HuBERT, WavLM, and XEUS) transfer to bioacoustic detection and classification tasks. The authors show that these models generate rich latent representations of animal sounds from diverse taxa, and they analyze model properties through linear probing of time-averaged representations. They then extend the approach to account for temporal information by using other downstream architectures, and study how frequency range and noise affect performance. The results are competitive with fine-tuned bioacoustic pre-trained models and show the impact of noise-robust pre-training setups. These findings highlight the potential of speech-based self-supervised learning as an effective framework for advancing bioacoustic research.
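
To make "linear probing of time-averaged representations" concrete, here is a minimal sketch of the general technique, not the authors' implementation. It assumes frame-level features of shape (frames, dimensions) have already been extracted from a frozen encoder; synthetic random arrays stand in for those features here. The names `mean_pool`, `train_linear_probe`, and `predict`, and all hyperparameters, are hypothetical choices made for illustration.

```python
import numpy as np

def mean_pool(frames):
    """Average frame-level features of shape (T, D) over time into one (D,) vector."""
    return frames.mean(axis=0)

def train_linear_probe(X, y, n_classes, lr=0.1, epochs=300):
    """Fit a softmax linear classifier on frozen features with batch gradient descent."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        logits = X @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - Y) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(X, W, b):
    """Return the highest-scoring class index for each row of X."""
    return (X @ W + b).argmax(axis=1)

# Synthetic stand-in for frozen encoder outputs (an assumption, not real audio):
# each "clip" is a variable-length sequence of frames scattered around a class center.
rng = np.random.default_rng(0)
n_classes, dim, clips_per_class = 3, 16, 40
centers = 3.0 * rng.normal(size=(n_classes, dim))

pooled, labels = [], []
for c in range(n_classes):
    for _ in range(clips_per_class):
        n_frames = int(rng.integers(20, 60))
        frames = centers[c] + rng.normal(size=(n_frames, dim))
        pooled.append(mean_pool(frames))  # one fixed-size vector per clip
        labels.append(c)

X = np.stack(pooled)
y = np.array(labels)

# Hold out a quarter of the clips to evaluate the probe.
order = rng.permutation(len(y))
train_idx, test_idx = order[:90], order[90:]

# Standardize using statistics from the training split only.
mu = X[train_idx].mean(axis=0)
sigma = X[train_idx].std(axis=0) + 1e-8
X = (X - mu) / sigma

W, b = train_linear_probe(X[train_idx], y[train_idx], n_classes)
accuracy = float((predict(X[test_idx], W, b) == y[test_idx]).mean())
print(f"held-out probe accuracy: {accuracy:.2f}")
```

Because the encoder stays frozen and only a single linear layer is trained, probe accuracy mainly reflects how linearly separable the classes already are in the pretrained representation. Mean pooling discards the order of frames, which is why a setup like this cannot capture temporal structure and a different downstream architecture is needed to do so.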

Takeaways, Limitations

Takeaways:
Self-supervised speech models can be applied effectively to bioacoustic data analysis.
These models generate rich latent representations of sounds from a variety of animal taxa.
The results point to the importance of noise-robust pre-training setups.
The work opens new possibilities for advancing bioacoustic research.
Limitations:
The results are limited to specific models and datasets, so further research is needed to determine how well they generalize.
How best to account for temporal information needs further analysis.
The effects of frequency range and noise need more in-depth study.