This paper studies reinforcement learning (RL) with submodular reward functions. Conventional RL assumes an additive reward, but in real-world problems such as path planning and adaptive control, rewards exhibit diminishing returns and are more naturally modeled as submodular functions. We propose a graph-based pruning technique for RL problems with submodular rewards, prove that it finds an approximately optimal policy within a provable time bound, and analyze its time and space complexity together with its performance guarantee. In experiments on a benchmark environment used in prior work, the proposed technique achieves higher rewards than existing methods.
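For reference, the diminishing-returns property mentioned above is the standard definition of submodularity; the notation below is illustrative and not taken from the paper. A set function $f : 2^V \to \mathbb{R}$ over a ground set $V$ is submodular if, for all $A \subseteq B \subseteq V$ and every $v \in V \setminus B$,
$$
f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B),
$$
i.e., adding an element to a larger set yields no more marginal gain than adding it to a smaller one. In a path-planning setting, $f$ could, for instance, measure the region covered by the states visited so far, so that revisiting already-covered areas contributes little additional reward.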