Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the authors and their institutions. When sharing, please cite the source.

Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning

Created by
  • Haebom

Author

Jialu Du, Guiyang Hou, Yihui Fu, Chen Wu, Wenqi Zhang, Yongliang Shen, Weiming Lu

Outline

Large language models (LLMs) excel at mathematical and code reasoning but struggle with social reasoning tasks, exhibiting cognitive confusion, logical inconsistencies, and conflation of objective world states with subjective belief states. Analysis of DeepSeek-R1's reasoning trajectories reveals that LLMs frequently hit reasoning impasses when handling scenarios with multiple participants and timelines, emitting expressions of confusion such as "tricky" and "confusing" and falling into erroneous inferences or infinite loops. The root problem is an inability to separate objective reality from an agent's subjective beliefs. To address this, the paper proposes an adaptive world-model-enhanced reasoning mechanism that builds a dynamic textual world model tracking entity states and temporal sequences. The mechanism monitors the reasoning trajectory for confusion indicators and, when they appear, supplies a clear description of the world state, helping the model resolve its cognitive dilemma. This mimics how humans use implicit world models to distinguish internal beliefs from external events. Across three social reasoning benchmarks, the method yields significant accuracy gains (e.g., +10% on Hi-ToM) and lower computational cost (up to 33.8% token reduction).
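The mechanism described above can be pictured as a small monitoring loop: keep a plain-text world model of who did what and where, scan each chunk of the model's reasoning for confusion markers, and inject the objective world state when one appears. The sketch below is illustrative only; the class and marker names (`WorldModel`, `CONFUSION_MARKERS`) are assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of confusion-triggered world-state injection.
# All names here are illustrative, not from the paper.

CONFUSION_MARKERS = ("tricky", "confusing", "wait, let me reconsider")

class WorldModel:
    """Tracks objective entity states and the temporal order of events as text."""

    def __init__(self):
        self.events = []   # chronological (actor, action, obj, location) tuples
        self.state = {}    # obj -> last known location (objective ground truth)

    def record(self, actor, action, obj, location):
        self.events.append((actor, action, obj, location))
        self.state[obj] = location

    def describe(self):
        # Plain-text world-state description to inject into the prompt
        lines = [
            f"{i + 1}. {actor} {action} {obj} -> {loc}"
            for i, (actor, action, obj, loc) in enumerate(self.events)
        ]
        lines.append(
            "Current objective state: "
            + ", ".join(f"{obj} is in {loc}" for obj, loc in self.state.items())
        )
        return "\n".join(lines)

def is_confused(reasoning_chunk):
    """Detect confusion indicators in a chunk of the reasoning trace."""
    text = reasoning_chunk.lower()
    return any(marker in text for marker in CONFUSION_MARKERS)

# Usage on a Sally-Anne-style scenario:
wm = WorldModel()
wm.record("Sally", "puts", "marble", "basket")
wm.record("Anne", "moves", "marble", "box")

chunk = "This is getting confusing - who actually saw the move?"
if is_confused(chunk):
    hint = wm.describe()   # injected so the model re-anchors on objective reality
```

The key design point is the separation of concerns: the world model stores only objective events and states, while each agent's beliefs remain for the LLM to reason about, so the injected description never leaks information an agent should not know.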

Takeaways, Limitations

Takeaways:
A novel approach to enhance the social reasoning ability of LLMs (Adaptive World Model Enhancement Inference Mechanism).
Improved accuracy and reduced computational cost increase the deployability of LLMs in social contexts.
Emphasizes the importance of the ability to separate objective reality from subjective beliefs.
Limitations:
Further research is needed to determine the generalizability of the mechanisms presented in the paper.
Validation of the effectiveness on other LLMs is needed.
Additional evaluation methodologies beyond social reasoning benchmarks are needed.