Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement

Created by
  • Haebom

Author

Nikolai Lund Kühne, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan

Outline

This paper presents MambAttention, a single-channel speech enhancement model that combines Mamba with a shared time-frequency multi-head attention module. Trained on the VB-DemandEx dataset, MambAttention outperforms existing LSTM-, xLSTM-, Mamba-, and Conformer-based systems on two out-of-domain datasets: DNS-2020 and EARS-WHAM_v2.
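To make the architectural idea concrete, below is a minimal PyTorch sketch of a block that applies a single, weight-shared multi-head attention module along both the time and frequency axes of a spectrogram feature map before a sequence model. The module names, tensor shapes, and the use of nn.GRU as a stand-in for the Mamba layers are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch: shared time/frequency multi-head attention around a
# sequence model. The GRU stands in for Mamba layers (assumption).
import torch
import torch.nn as nn


class SharedTimeFreqAttentionBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # One attention module whose weights are shared between the
        # time-axis pass and the frequency-axis pass.
        self.shared_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Stand-in sequence model; the paper uses Mamba here (assumption).
        self.seq_model = nn.GRU(channels, channels, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def _attend(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, channels) -> self-attention with residual connection
        out, _ = self.shared_attn(x, x, x)
        return self.norm(x + out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, freq) spectrogram features
        b, c, t, f = x.shape

        # Time-axis attention: treat each frequency bin as a batch element.
        xt = x.permute(0, 3, 2, 1).reshape(b * f, t, c)
        xt = self._attend(xt)

        # Frequency-axis attention using the *same* attention weights.
        xf = xt.reshape(b, f, t, c).permute(0, 2, 1, 3).reshape(b * t, f, c)
        xf = self._attend(xf)

        # Sequence modeling along time (Mamba in the paper, GRU here).
        xs = xf.reshape(b, t, f, c).mean(dim=2)  # pool frequency for brevity
        xs, _ = self.seq_model(xs)
        return xs  # (batch, time, channels)


if __name__ == "__main__":
    block = SharedTimeFreqAttentionBlock(channels=32)
    feats = torch.randn(2, 32, 100, 64)  # (batch, channels, time, freq)
    print(block(feats).shape)  # torch.Size([2, 100, 32])
```

The key design choice illustrated here is that the time-axis and frequency-axis passes reuse one attention module rather than learning separate ones, which is the weight-sharing property the paper highlights as important for generalization.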

Takeaways, Limitations

Takeaways:
The MambAttention model significantly improves out-of-domain generalization in single-channel speech enhancement.
The shared time-frequency multi-head attention module is identified as key to this generalization.
Integrating the attention module into existing models such as LSTM and xLSTM also improves their performance.
Limitations:
The paper does not explicitly state its limitations; an analysis of model complexity and computational cost appears to be missing.
Generalization to additional out-of-domain datasets may require further study.