Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Towards Machine Theory of Mind with Large Language Model-Augmented Inverse Planning

Created by
  • Haebom

Author

Rebekah A. Gelpi, Eric Xue, William A. Cunningham

Outline

In this paper, we propose a hybrid approach to machine Theory of Mind (ToM) that combines Bayesian inverse planning with a large language model (LLM): the Bayesian inverse planning model computes posterior probabilities over an agent's possible mental states given its actions, while the LLM is used to generate hypotheses and likelihood functions. Bayesian inverse planning accurately predicts human reasoning on a variety of ToM tasks, but it struggles to scale to scenarios with large numbers of possible hypotheses and actions. LLM-based approaches, by contrast, show promise on ToM benchmarks but can exhibit brittleness and failures on inference tasks. Our hybrid approach exploits the strengths of each component, achieving near-optimal results on tasks inspired by prior inverse planning models and outperforming models that use the LLM alone or with chain-of-thought prompting. It also shows the potential to predict mental states in open-ended tasks, suggesting promising directions for future ToM models and for building socially intelligent generative agents.
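The core mechanism described above can be sketched as a standard Bayesian posterior update over candidate mental states, where, in the full model, the hypothesis space and the likelihood of each observed action would be proposed by the LLM. In this minimal illustration the LLM calls are stubbed with fixed values; the goals, actions, and probability numbers are hypothetical, not taken from the paper.

```python
def posterior_over_goals(prior, likelihood, actions):
    """Compute P(goal | actions) proportional to P(goal) * prod_t P(action_t | goal)."""
    post = dict(prior)
    for a in actions:
        # Multiply in the likelihood of this action under each goal, then renormalize.
        post = {g: p * likelihood(a, g) for g, p in post.items()}
        z = sum(post.values())
        post = {g: p / z for g, p in post.items()}
    return post

# Hypothetical hypothesis space (in the paper's model, generated by the LLM).
goals = ["get coffee", "get tea"]
prior = {g: 1.0 / len(goals) for g in goals}

# Stub likelihood: in the full model this would be an LLM-estimated
# P(action | goal); here a hand-written table stands in for it.
def likelihood(action, goal):
    table = {
        ("walk to cafe", "get coffee"): 0.9,
        ("walk to cafe", "get tea"): 0.3,
    }
    return table.get((action, goal), 0.5)

post = posterior_over_goals(prior, likelihood, ["walk to cafe"])
# After observing "walk to cafe": P(get coffee) = 0.45 / 0.60 = 0.75
```

The Bayesian update itself is exact and cheap; the scalability problem the paper addresses lies in enumerating hypotheses and estimating likelihoods, which is where the LLM is slotted in.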

Takeaways, Limitations

Takeaways:
Improved performance on ToM tasks through a hybrid of an LLM and a Bayesian inverse planning model.
Demonstrates ToM prediction for more complex and diverse scenarios than prior inverse planning models can handle.
Strong performance on ToM tasks even with smaller LLMs.
Demonstrating mental-state prediction in open-ended tasks opens new possibilities for developing socially intelligent generative agents.
Limitations:
LLM brittleness and potential failures on inference tasks remain (an issue that has not been fully resolved).
Further studies are needed to investigate the generalization performance of the proposed hybrid model and its applicability to various ToM tasks.
Additional research may be needed to explore the model's interpretability and transparency.