Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations

Created by
  • Haebom

Authors

Chengqian Ma, Wei Tao, Yiwen Guo

Outline

This paper examines how effective Spoken Dialogue Models (SDMs) are in practice and identifies where they fall short of well-established, text-based large language models (LLMs). Because spoken dialogue is inherently complex, the authors highlight the challenges posed by linguistic and phonetic characteristics such as polysemy, homonyms, and context dependence. To study these challenges, they present a benchmark dataset of 1,079 instances in English and Chinese and evaluate SDM performance with an LLM-based evaluation method.
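As a rough illustration of what an LLM-based evaluation loop over a bilingual benchmark can look like, the sketch below scores responses per language. Everything in it is an assumption made for illustration and is not taken from the paper: the `Instance` fields, the `judge` callable (a stand-in for a call to a judging LLM), and the `toy_judge` keyword check are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Instance:
    # Hypothetical record layout; the benchmark's real schema may differ.
    language: str   # e.g. "en" or "zh"
    question: str   # transcript of the spoken input
    reference: str  # expected answer
    response: str   # reply produced by the SDM under test


def evaluate(instances: Iterable[Instance],
             judge: Callable[[Instance], bool]) -> Dict[str, float]:
    """Return the fraction of responses the judge accepts, per language."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for inst in instances:
        total[inst.language] += 1
        if judge(inst):
            passed[inst.language] += 1
    return {lang: passed[lang] / total[lang] for lang in total}


def toy_judge(inst: Instance) -> bool:
    # Placeholder for an LLM call: accept when the reference text
    # appears somewhere in the response (case-insensitive).
    return inst.reference.lower() in inst.response.lower()
```

In a real setup, `toy_judge` would be replaced by a function that prompts a judging LLM with the question, reference, and response and parses its verdict; keeping the judge as a plain callable makes that swap straightforward.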

Takeaways, Limitations

Takeaways:
Provides a benchmark dataset for evaluating the practical performance of SDMs.
Shows that an LLM-based evaluation method can produce assessments close to human judgment.
Clearly sets out the complexities of spoken conversation (ambiguity, context dependence, etc.) that are major challenges for SDMs.
Limitations:
The dataset covers only English and Chinese, so the results may not generalize to other languages.
Further validation is needed to confirm that the LLM-based evaluation method agrees with actual human judgment.
The paper does not include a performance comparison or analysis of specific SDMs.
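One common way to check how well an automatic judge agrees with human raters is Cohen's kappa, which corrects raw agreement for chance. The sketch below is a minimal illustration for binary pass/fail labels; it is not the validation procedure used in the paper, and the function name and input layout are assumptions.

```python
from typing import Sequence


def cohen_kappa(judge: Sequence[bool], human: Sequence[bool]) -> float:
    """Cohen's kappa between two raters giving binary labels."""
    if len(judge) != len(human) or len(judge) == 0:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(judge)
    # Observed agreement: share of items where both raters agree.
    observed = sum(j == h for j, h in zip(judge, human)) / n
    # Chance agreement from each rater's marginal rate of True labels.
    p_judge = sum(judge) / n
    p_human = sum(human) / n
    expected = p_judge * p_human + (1 - p_judge) * (1 - p_human)
    if expected == 1.0:
        # Both raters gave the same constant label; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)
```

A value near 1 indicates strong agreement beyond chance, while a value near 0 means the judge agrees with humans no more often than chance would predict.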