Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.

Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation

Created by
  • Haebom

Author

Jialong Mai, Xiaofen Xing, Yawei Li, Weidong Chen, Zhipeng Li, Jingyuan Xing, Xiangmin Xu

Outline

This paper proposes a Dynamic Parameter Memory (DPM) mechanism to address the limited context window of speech large language models (SLLMs), which is quickly exhausted by the high frame rate of audio tokens. DPM incrementally encodes sentence-level emotional information into a temporary LoRA module, effectively "memorizing" contextual information and enabling audio of unlimited length to be processed within a fixed context window. Experiments on the IEMOCAP dataset show that DPM significantly improves the emotion recognition performance of the SLLM on long audio sequences, achieving state-of-the-art results.
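
The sketch below illustrates the general idea of a temporary LoRA used as a parameter memory; it is not the paper's implementation. A frozen linear layer is augmented with low-rank factors that are updated from each sentence's pooled representation, so earlier sentences influence later ones without staying in the context window. The class name `TemporaryLoRALinear`, the `write_memory` update rule, and the `summarize_sentence` pooling are all illustrative assumptions.

```python
# Minimal sketch of the temporary-LoRA memory idea (assumptions, not the paper's code).
import torch
import torch.nn as nn

class TemporaryLoRALinear(nn.Module):
    def __init__(self, d_model: int = 64, rank: int = 4, alpha: float = 0.1):
        super().__init__()
        self.base = nn.Linear(d_model, d_model)   # frozen base projection
        self.base.requires_grad_(False)
        self.rank, self.alpha = rank, alpha
        # temporary low-rank factors, reset for each new conversation
        self.A = torch.zeros(rank, d_model)
        self.B = torch.zeros(d_model, rank)

    def reset_memory(self):
        self.A.zero_()
        self.B.zero_()

    def write_memory(self, sentence_summary: torch.Tensor):
        # Illustrative write rule: fold the sentence-level summary vector into the
        # low-rank factors with a small step size (exponential-decay-style update).
        u = sentence_summary / (sentence_summary.norm() + 1e-6)          # (d_model,)
        self.A = (1 - self.alpha) * self.A + self.alpha * u.repeat(self.rank, 1)
        self.B = (1 - self.alpha) * self.B + self.alpha * u.unsqueeze(1).repeat(1, self.rank)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # base output plus the temporary low-rank correction x @ A^T @ B^T
        return self.base(x) + (x @ self.A.T) @ self.B.T

def summarize_sentence(hidden_states: torch.Tensor) -> torch.Tensor:
    """Placeholder sentence-level summary: mean-pool token states."""
    return hidden_states.mean(dim=0)

if __name__ == "__main__":
    torch.manual_seed(0)
    layer = TemporaryLoRALinear()
    layer.reset_memory()
    conversation = [torch.randn(t, 64) for t in (12, 9, 15)]  # fake per-sentence token states
    for sent in conversation:
        out = layer(sent)                       # current sentence sees memory of earlier ones
        layer.write_memory(summarize_sentence(out))
    print(out.shape)  # torch.Size([15, 64])
```

The key property this sketch tries to convey is that the "memory" lives in a small number of adapter parameters rather than in the token context, so its cost does not grow with conversation length.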

Takeaways and Limitations

Takeaways:
Enables processing of long speech sequences by addressing the limited context window of SLLMs.
Effectively exploits sentence-level emotional information to improve emotion recognition in conversation.
Brings SLLM-based speech emotion recognition to state-of-the-art performance via the DPM mechanism.
Limitations:
Results are reported only on the IEMOCAP dataset; further research is needed to confirm generalization to other datasets and diverse speech characteristics.
The method focuses on sentence-level emotion encoding; exploiting emotion information at finer granularity (e.g., word or syllable level) remains to be explored.
The additional computational cost of maintaining and updating the temporary LoRA module may require further analysis and optimization.