Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ERR@HRI 2.0 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Conversations

Created by
  • Haebom

Authors

Shiye Cao, Maia Stiber, Amama Mahmood, Maria Teresa Parreira, Wendy Ju, Micol Spitale, Hatice Gunes, Chien-Ming Huang

Outline

This paper introduces the ERR@HRI 2.0 Challenge on detecting and resolving errors in conversational robots powered by large language models (LLMs), such as misunderstanding user intent, interrupting the user mid-conversation, and failing to respond. The challenge provides a 16-hour human-robot interaction dataset containing facial, vocal, and head movement features, annotated with the presence of robot errors and the user's intention to correct them. Participants develop machine learning models that detect robot errors from this multimodal data, and submissions are evaluated on metrics such as detection accuracy and false positive rate. The challenge is a step toward better error detection in human-robot interaction through social signal analysis.
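As a rough illustration of the two metrics named above, the sketch below computes accuracy and false positive rate for a binary error detector using their standard textbook definitions. The summary does not say how the challenge defines or aggregates its metrics, so the function name `detection_metrics`, the 0/1 label encoding (1 = robot error present), and the per-sample scoring are all assumptions made for this example.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy and false positive rate for binary labels.

    Assumed encoding (not taken from the challenge): 1 = robot error, 0 = no error.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    negatives = fp + tn

    # Guard against empty inputs or inputs with no negative samples.
    accuracy = (tp + tn) / total if total else 0.0
    false_positive_rate = fp / negatives if negatives else 0.0

    return {"accuracy": accuracy, "false_positive_rate": false_positive_rate}


# Toy labels invented for illustration; they are not from the dataset.
example_true = [1, 0, 0, 1, 0, 0, 1, 0]
example_pred = [1, 0, 1, 1, 0, 0, 0, 0]
example_metrics = detection_metrics(example_true, example_pred)
```

Reporting the false positive rate alongside accuracy matters when errors are rare: a detector that never flags an error can still score high accuracy, while one that flags too often would have the robot apologizing for mistakes it did not make.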

Takeaways, Limitations

Takeaways:
Provides a standardized benchmark for research on error detection and resolution in LLM-based conversational robots
Promotes the development of robot error detection models that use multimodal data
Contributes to improving the reliability and efficiency of human-robot interaction
Contributes to advancing human-robot interaction research based on social signal analysis
Limitations:
The dataset is limited in size and diversity (16 hours of data may not be enough)
The annotations may be subjective and may contain errors
Generalization performance in real-world environments still needs to be verified
The challenge may not cover all types of robot errors