This paper considers scene graphs as a structured, serializable environment representation for spatial reasoning with large language models (LLMs). We propose SG², an iterative, schema-based scene graph reasoning framework built on multi-agent LLMs. Each agent consists of two modules: a Reasoner, which plans the task at an abstract level and generates queries for graph information, and a Retriever, which extracts the requested graph information by writing code in response to those queries. The two modules collaborate iteratively, which enables sequential reasoning and adaptive attention to the relevant parts of the graph. A scene graph schema provided to both modules streamlines reasoning and retrieval and guides their collaboration. With the schema in hand, the full graph data never has to be passed to the LLM, which reduces the potential for hallucinations caused by irrelevant information. Experiments in several simulated environments show that the proposed framework outperforms existing LLM-based approaches and baseline single-agent, tool-based reason-while-retrieve strategies on numerical question-answering and planning tasks.
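The Reasoner/Retriever loop described above can be illustrated with a minimal Python sketch. It is not the SG² implementation: both modules are scripted stand-ins for LLM agents, the Retriever applies a fixed filter instead of writing code, and the toy graph, the schema fields, and every name (`SCENE_GRAPH`, `SCHEMA`, `reasoner`, `retriever`, `reason_while_retrieve`) are hypothetical choices made for illustration. The sketch only shows the control flow: the Reasoner sees the schema and previously retrieved facts, never the raw graph, and asks for one piece of graph information per iteration.

```python
# Toy scene graph (hypothetical data, for illustration only).
SCENE_GRAPH = {
    "nodes": {
        "kitchen": {"type": "room"},
        "office": {"type": "room"},
        "mug_1": {"type": "object", "room": "kitchen"},
        "mug_2": {"type": "object", "room": "office"},
        "laptop_1": {"type": "object", "room": "office"},
    },
    "edges": [("kitchen", "connected_to", "office")],
}

# Schema shared by both modules: it describes the graph's structure, not its data.
SCHEMA = {
    "node_fields": {"type": "room | object", "room": "containing room (objects only)"},
    "edge_format": "(source, relation, target)",
}


def retriever(query, graph, schema):
    """Stand-in for the Retriever: return only the node names matching the query.

    In the paper this module writes code; here a fixed filter plays that role.
    """
    unknown = set(query) - set(schema["node_fields"])
    if unknown:
        raise ValueError(f"query uses fields not in the schema: {sorted(unknown)}")
    return sorted(
        name
        for name, attrs in graph["nodes"].items()
        if all(attrs.get(field) == value for field, value in query.items())
    )


def reasoner(question, facts, schema):
    """Stand-in for the Reasoner: a scripted policy in place of an LLM.

    It never receives the graph, only the schema and the facts retrieved so far.
    It ignores `question` and always solves "which room has the most objects?".
    """
    if "rooms" not in facts:
        return ("query", "rooms", {"type": "room"})
    for room in facts["rooms"]:
        key = f"objects_in_{room}"
        if key not in facts:
            return ("query", key, {"type": "object", "room": room})
    counts = {room: len(facts[f"objects_in_{room}"]) for room in facts["rooms"]}
    return ("answer", max(counts, key=counts.get))


def reason_while_retrieve(question, graph, schema, max_steps=10):
    """Alternate Reasoner and Retriever until the Reasoner returns an answer."""
    facts = {}
    for _ in range(max_steps):
        step = reasoner(question, facts, schema)
        if step[0] == "answer":
            return step[1], facts
        _, key, query = step
        facts[key] = retriever(query, graph, schema)
    raise RuntimeError("no answer within the step budget")


answer, facts = reason_while_retrieve(
    "Which room contains the most objects?", SCENE_GRAPH, SCHEMA
)
print(answer)
```

In this toy run the Reasoner issues three queries (the rooms, then the objects in each room) before answering, so only the query results enter its context rather than the whole graph.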