Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Constructive Conflict-Driven Multi-Agent Reinforcement Learning for Strategic Diversity

Created by
  • Haebom

Authors

Yuxiang Mai, Qiyue Yin, Wancheng Ni, Pei Xu, Kaiqi Huang

CoDiCon: Competitive Diversity through Constructive Conflict

Outline

This paper proposes CoDiCon, a novel approach that introduces competitive incentives into cooperative scenarios. It targets a limitation of existing multi-agent reinforcement learning (MARL) methods, which do not consider interactions between agents. Inspired by sociological research suggesting that appropriate competition and constructive conflict facilitate group decision-making, the authors design an intrinsic reward mechanism that uses ranking features to incentivize competition. A centralized intrinsic reward module generates and distributes differing reward values to the agents, maintaining a balance between competition and cooperation. The module is parameterized and optimized to maximize the environmental reward, which reformulates the constrained bilevel optimization problem so that it aligns with the original task objective. CoDiCon is evaluated against state-of-the-art methods in the SMAC (StarCraft Multi-Agent Challenge) and GRF (Google Research Football) environments. The results show that the competitive intrinsic reward promotes diverse and adaptive strategies among cooperative agents and achieves superior performance.
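The idea of a rank-based intrinsic reward can be illustrated with a minimal Python sketch. This is not the authors' implementation: in the paper the reward module is parameterized and learned, whereas the sketch uses a fixed, hand-written mapping. The function names, the per-agent `scores` input, and the `beta` weight are all hypothetical and chosen only for illustration.

```python
from typing import List


def rank_based_intrinsic_rewards(scores: List[float], scale: float = 1.0) -> List[float]:
    """Map per-agent scores to zero-mean intrinsic rewards based on rank.

    Hypothetical helper: `scores` stands in for whatever ranking feature
    is used; the paper's learned reward module is NOT reproduced here.
    """
    n = len(scores)
    if n == 0:
        return []
    if n == 1:
        return [0.0]  # a single agent has nobody to compete with
    # order[k] is the index of the agent with the k-th lowest score.
    # sorted() is stable, so ties are broken by agent index.
    order = sorted(range(n), key=lambda i: scores[i])
    rewards = [0.0] * n
    for rank, agent in enumerate(order):
        # Linearly spaced in [-scale, +scale]; the values sum to zero,
        # so the competition only redistributes reward among agents.
        rewards[agent] = scale * (2.0 * rank / (n - 1) - 1.0)
    return rewards


def shaped_rewards(team_reward: float, scores: List[float], beta: float = 0.1) -> List[float]:
    """Add a small competitive term to the shared cooperative reward.

    `beta` (assumed name) controls how strongly agents compete.
    """
    intrinsic = rank_based_intrinsic_rewards(scores)
    return [team_reward + beta * r for r in intrinsic]


if __name__ == "__main__":
    # Three agents share a team reward of 1.0 but are ranked by a score.
    print(shaped_rewards(1.0, [0.2, 0.9, 0.5], beta=0.1))
```

Because the intrinsic terms sum to zero in this sketch, the average reward across agents equals the team reward, so the competitive signal differentiates agents without changing the overall cooperative objective.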

Takeaways, Limitations

Takeaways:
Presents a novel methodology for promoting strategic diversity among agents by leveraging competitive incentives in a cooperative environment.
Designs an intrinsic reward mechanism, inspired by sociological research, that balances competition and cooperation.
Demonstrates superior performance compared to existing methods in the SMAC and GRF environments.
Limitations:
Dependence on a centralized rewards module.
Further research is needed on the generalizability of CoDiCon and its applicability to other MARL environments.
Continued research is needed to find the optimal balance between competition and cooperation.
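One simple way to picture the balance between competition and cooperation is a weighted sum of the shared environment reward and a per-agent intrinsic reward. The notation below is an illustrative assumption, not the paper's formulation: the symbols $\beta$, $r^{\mathrm{env}}$, and $r_i^{\mathrm{int}}$ are introduced here only for the sketch.

```latex
% Illustrative notation only; \beta, r^{\mathrm{env}}, and r_i^{\mathrm{int}}
% are assumed symbols, not taken from the paper.
\[
  r_i^{\mathrm{total}}(t) \;=\; r^{\mathrm{env}}(t) \;+\; \beta \, r_i^{\mathrm{int}}(t),
  \qquad \beta \ge 0 .
\]
```

Under this assumed form, $\beta = 0$ recovers the purely cooperative setting, and larger values of $\beta$ give the competitive term more influence on each agent's learning signal.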