Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate

Created by
  • Haebom

Author

Andrea Wynn, Harsh Satija, Gillian Hadfield

Outline

While multi-agent discussion has been proposed as a promising strategy for improving AI reasoning capabilities, this paper finds that discussion can be more detrimental than helpful. While previous research has focused on discussion within homogeneous agent groups, this study explores the impact of model power diversity on the dynamics and outcomes of multi-agent interactions. A series of experiments demonstrate that discussion can decrease accuracy over time, even in environments where stronger models outnumber weaker models. Analysis reveals that models favor consensus over challenge on incorrect inferences and frequently switch from correct to incorrect answers based on peers' reasoning. Additional experiments explore various factors contributing to this detrimental shift, including flattery, social conformity, and model and task type. These results highlight a critical failure mode in reason exchange during multi-agent discussion, suggesting that simply applying discussion can lead to performance degradation if the model lacks the ability to encourage discussion or resist incorrect inferences.

Takeaways, Limitations

We found that multi-agent discussions may not always lead to improved AI reasoning capabilities.
The diversity of model capabilities affects the outcome of the discussion.
Accuracy may decrease if the model chooses to agree rather than challenge incorrect inferences.
Flattery, social conformity, models, and task type were found to be factors influencing the outcome of the discussion.
If the discussion-engaging agent lacks the ability to resist incorrect inferences, performance degradation may occur.
👍