Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

First Steps Towards Overhearing LLM Agents: A Case Study With Dungeons & Dragons Gameplay

Created by
  • Haebom

Authors

Andrew Zhu, Evan Osgood, Chris Callison-Burch

Outline

This paper presents a new paradigm, "overhearing agents," that differs from existing conversational LLM agents. Overhearing agents do not participate directly in a conversation; instead, they listen in on human conversations and perform background tasks or offer suggestions to assist the user. The study examines this paradigm in depth, using large multimodal audio-language models as overhearing agents that support the Dungeon Master during Dungeons & Dragons gameplay. We evaluate the usability of these agents through human evaluation and find that some large audio-language models can perform overhearing agent tasks using implicit audio cues. Finally, we release a Python library and project code to support further research on the overhearing agent paradigm (https://github.com/zhudotexe/overhearing_agents).
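The overhearing pattern described above can be illustrated with a minimal Python sketch. This is not the API of the released library: the names `Suggestion`, `OverhearingAgent`, and `keyword_model` are hypothetical, and a simple keyword matcher over a text transcript stands in for an audio-language model. The point is only the control flow: the agent receives every utterance, never replies in the conversation, and queues suggestions for the user on the side.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Suggestion:
    """A background suggestion shown to the user, never spoken into the conversation."""
    trigger: str  # the utterance that prompted the suggestion
    text: str     # what the agent proposes to show the user


def keyword_model(utterance: str) -> List[str]:
    """Hypothetical stand-in for an audio-language model.

    A real overhearing agent would pass audio to a multimodal model; here we
    only match keywords in a text transcript, purely for illustration.
    """
    lookups = {
        "goblin": "Show goblin stat block",
        "fireball": "Show Fireball spell description",
    }
    lowered = utterance.lower()
    return [action for word, action in lookups.items() if word in lowered]


@dataclass
class OverhearingAgent:
    """Listens to a conversation and collects suggestions without taking a turn."""
    model: Callable[[str], List[str]]
    suggestions: List[Suggestion] = field(default_factory=list)

    def overhear(self, utterances: Iterable[str]) -> List[Suggestion]:
        # Every utterance is observed; the agent produces no conversational reply.
        for utterance in utterances:
            for action in self.model(utterance):
                self.suggestions.append(Suggestion(trigger=utterance, text=action))
        return self.suggestions


# Invented example session between a Dungeon Master and a player.
session = [
    "You round the corner and see three goblins.",
    "I want to cast Fireball!",
    "Let's take a short break.",
]
agent = OverhearingAgent(model=keyword_model)
for s in agent.overhear(session):
    print(f"[suggestion] {s.text}")
```

In this sketch the third utterance triggers nothing, which reflects the idea that an overhearing agent stays silent unless it has something useful to offer.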

Takeaways, Limitations

Takeaways:
We propose the "overhearing agent," a new paradigm for LLM agents that is distinct from conversational agents, and explore its possibilities.
We demonstrate the potential of large multimodal audio-language models to act as overhearing agents in the specific context of Dungeons & Dragons gameplay.
We find that some large audio-language models can perform tasks using implicit audio cues.
We support follow-up research by releasing a Python library and the project code.
Limitations:
Because this study was limited to the specific context of Dungeons & Dragons gameplay, further research is needed to determine whether the results generalize to other contexts.
The types and performance of the large-scale models used are not described in detail.
The privacy and ethical issues raised by overhearing agents are not discussed.