This paper proposes the Convergent Meta Alignment Algorithm (COMAL), a novel approach to the language model alignment problem that uses a game-theoretic framework to capture the complexity of human preferences. It aims to overcome the limitations of existing alignment methods by finding a Nash equilibrium policy, which guarantees a win rate of at least 50% against any competing policy. COMAL is simple and integrates easily with existing preference optimization methods, and its high win rate is demonstrated by applying it to the Llama-3-8B-Instruct and Qwen2.5-7B models.
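For concreteness, the game-theoretic view can be sketched as follows; the notation here (prompt distribution $\rho$, preference oracle $\mathcal{P}$, policies $\pi, \pi'$) is illustrative rather than taken from the paper. Alignment is cast as a two-player constant-sum game in which each player proposes responses and the payoff is the probability that one response is preferred over the other:
\[
\pi^{*} \in \arg\max_{\pi}\,\min_{\pi'}\;
\mathbb{E}_{x\sim\rho,\; y\sim\pi(\cdot\mid x),\; y'\sim\pi'(\cdot\mid x)}
\bigl[\mathcal{P}(y \succ y' \mid x)\bigr].
\]
Because the game is symmetric and constant-sum, its equilibrium value is $\tfrac{1}{2}$, so a Nash policy $\pi^{*}$ satisfies
\[
\mathbb{E}_{x\sim\rho,\; y\sim\pi^{*}(\cdot\mid x),\; y'\sim\pi'(\cdot\mid x)}
\bigl[\mathcal{P}(y \succ y' \mid x)\bigr] \;\ge\; \tfrac{1}{2}
\qquad \text{for every competing policy } \pi',
\]
which is the sense in which the equilibrium policy is guaranteed a win rate of at least 50%.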