As the autonomy and generalization capabilities of multimodal LLM-based agents advance, evaluations based on static datasets increasingly fail to capture their actual capabilities in dynamic environments and across diverse tasks. To address this, we propose Graph2Eval, a framework that automatically generates multimodal document understanding and web interaction tasks from a knowledge graph to comprehensively evaluate agents' reasoning, collaboration, and interaction capabilities. Using a knowledge graph constructed from external data as the task space, it transforms semantic relationships into structured multimodal tasks through subgraph sampling, task templates, and meta-paths. A multi-stage filtering pipeline based on node reachability, LLM scoring, and similarity analysis ensures the quality and feasibility of the generated tasks. Graph2Eval supports end-to-end evaluation of multiple agent types, including single agents, multi-agent systems, and web agents. We implement the framework and conduct experiments on Graph2Eval-Bench, a curated dataset of 1,319 document understanding and web interaction scenarios; the experiments differentiate agent and model performance, reveal gaps in reasoning, collaboration, and web interaction across settings, and provide a new perspective on agent evaluation.
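
To make the described pipeline concrete, the following is a minimal, self-contained Python sketch of the generate-then-filter loop (subgraph sampling, template instantiation, reachability and LLM-score filtering). It is illustrative only: all names here (`KG`, `sample_subgraph`, `instantiate_template`, `llm_score`, the threshold value) are assumptions for exposition, not the framework's actual API.

```python
# Hypothetical sketch of a Graph2Eval-style pipeline: sample a subgraph,
# instantiate a task template, then filter by node reachability and an
# LLM quality score. Names and structures are illustrative assumptions.
import random
from dataclasses import dataclass

# Toy knowledge graph: node -> list of (relation, neighbor) edges.
KG = {
    "doc:report": [("contains", "table:q3"), ("links_to", "page:pricing")],
    "table:q3": [("mentions", "entity:revenue")],
    "page:pricing": [("has_form", "form:signup")],
    "entity:revenue": [],
    "form:signup": [],
}

@dataclass
class Task:
    question: str
    nodes: list

def sample_subgraph(graph, start, depth=2):
    """Collect nodes reachable from `start` within `depth` hops."""
    nodes, frontier = {start}, [start]
    for _ in range(depth):
        frontier = [nbr for n in frontier for _, nbr in graph.get(n, [])]
        nodes.update(frontier)
    return sorted(nodes)

def instantiate_template(nodes):
    """Turn a sampled subgraph into a natural-language task (template stub)."""
    return Task(question=f"Trace the relationship among {', '.join(nodes)}.",
                nodes=nodes)

def is_reachable(graph, nodes):
    """Feasibility filter: every task node must exist in the graph."""
    known = set(graph) | {nbr for es in graph.values() for _, nbr in es}
    return all(n in known for n in nodes)

def llm_score(task):
    """Stand-in for an LLM quality judgment (here: random in [0, 1])."""
    return random.random()

def generate_tasks(graph, threshold=0.5, n=10):
    tasks = []
    for _ in range(n):
        sub = sample_subgraph(graph, random.choice(list(graph)))
        task = instantiate_template(sub)
        if is_reachable(graph, task.nodes) and llm_score(task) >= threshold:
            tasks.append(task)
    return tasks

if __name__ == "__main__":
    for t in generate_tasks(KG, n=5):
        print(t.question)
```

In the actual framework this stub's random score would be an LLM judgment and the filter would additionally include the similarity analysis mentioned above; the sketch only shows how the stages compose.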