Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games

Created by
  • Haebom

Authors

David Guzman Piedrahita, Yongjin Yang, Mrinmaya Sachan, Giorgia Ramponi, Bernhard Schölkopf, Zhijing Jin

Outline

This paper addresses the problem of costly sanctioning in multi-agent large language model (LLM) systems, where an agent must spend its own resources to encourage cooperation or to punish non-cooperation. Using the public goods game from behavioral economics, the authors observe how various LLMs navigate social dilemmas over repeated interactions. Their analysis identifies four behavioral patterns: models that sustain a consistent level of cooperation, models that alternate between cooperation and non-cooperation, models whose cooperation declines over time, and models that follow a fixed strategy regardless of the outcome. Surprisingly, reasoning-focused LLMs such as the o1 series struggle to cooperate, while some traditional LLMs consistently achieve high levels of cooperation. This suggests that current approaches to improving LLMs, which focus on strengthening reasoning ability, do not necessarily lead to cooperation. The findings offer useful insights for deploying LLM agents in environments that require sustained cooperation.
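To make the social dilemma concrete, the following is a minimal sketch of one round of a standard linear public goods game with an optional costly punishment step. It is illustrative only and is not the paper's experimental setup: the function names, the endowment of 20, the multiplier of 2, and the punishment cost and fine are all assumptions chosen for easy arithmetic.

```python
def public_goods_payoffs(contributions, endowment=20.0, multiplier=2.0):
    """Payoffs for one round of a linear public goods game.

    Each player keeps whatever they do not contribute, and the pooled
    contributions are multiplied and split equally among all players.
    The endowment and multiplier are illustrative assumptions.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


def apply_punishment(payoffs, punisher, target, cost=1.0, fine=3.0):
    """Costly sanction: the punisher pays `cost` to reduce the target's payoff by `fine`.

    The cost and fine values are hypothetical, not taken from the paper.
    """
    updated = list(payoffs)
    updated[punisher] -= cost
    updated[target] -= fine
    return updated


# Four players, everyone contributes the full endowment.
all_cooperate = public_goods_payoffs([20, 20, 20, 20])

# Player 0 free-rides while the others contribute fully.
one_free_rider = public_goods_payoffs([0, 20, 20, 20])

# Player 1 pays to punish the free-rider.
after_sanction = apply_punishment(one_free_rider, punisher=1, target=0)

print(all_cooperate)   # [40.0, 40.0, 40.0, 40.0]
print(one_free_rider)  # [50.0, 30.0, 30.0, 30.0]
print(after_sanction)  # [47.0, 29.0, 30.0, 30.0]
```

Under these assumed numbers, the free-rider earns more than any cooperator (50 versus 30), even though the group as a whole earns less than under full cooperation. Punishing the free-rider also lowers the punisher's own payoff, which is why sanctioning is described as costly.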

Takeaways, Limitations

Takeaways:
Shows that improved reasoning ability in LLMs does not necessarily lead to cooperation.
Identifies and categorizes distinct patterns of cooperative behavior in LLMs.
Provides practical insights for deploying LLM agents in environments that require sustained cooperation.
Improves understanding of social interaction mechanisms in LLMs.
Limitations:
Lacks specific details on the types and range of LLMs used.
The generalizability of the experimental design needs further review.
Lacks analysis of LLM responses across a wider variety of social dilemmas.