This paper analyzes bias in multi-agent systems that use large language models (LLMs) as evaluators. Specifically, we examine four types of bias—position bias, detail bias, thought-process bias, and opinion bias—in two frameworks: Multi-Agent-Debate and LLM-as-Meta-Judge. Experimental results show that the debate framework significantly amplifies bias, and that this bias persists after the initial round of debate, whereas the meta-judge approach is more resistant to bias. Furthermore, integrating PINE, a single-agent bias mitigation technique, effectively reduces bias in the debate setting but is less effective in the meta-judge setting. This work provides a comprehensive analysis of bias behavior in multi-agent LLM evaluation systems and highlights the need for targeted bias mitigation strategies in collaborative evaluation environments.