In this paper, we study the application of Process Reward Models (PRMs) to graph reasoning problems to improve the reasoning capabilities of Large Language Models (LLMs). To address the high cost of manually annotating step-wise supervision data, we construct a large-scale graph reasoning dataset, GraphSILO, whose detailed reasoning steps and step-wise labels are generated using task-oriented trajectories and Monte Carlo Tree Search (MCTS). Based on this dataset, we train GraphPRM, the first PRM for graph reasoning problems, and evaluate its effectiveness in two settings: inference-time scaling and reinforcement learning via Direct Preference Optimization (DPO). Experimental results show that GraphPRM significantly improves LLM performance across 13 graph reasoning tasks, including a 9% gain for Qwen2.5-7B. In addition, GraphPRM transfers to unseen graph reasoning datasets and to new reasoning domains such as mathematical problem solving; performance improvements on GSM8K and MATH500 highlight the cross-domain applicability of graph-based process rewards.