Takeaways: CO-Bench, a comprehensive benchmark covering real-world combinatorial optimization (CO) problems across diverse domains and complexity levels, enables systematic study of the CO problem-solving capabilities of LLM-based agents. Comparative evaluations against existing algorithms identify the strengths and weaknesses of LLM agents and suggest directions for future research.