In this paper, we propose a neuro-symbolic model-based reinforcement learning architecture for navigation in human-collaborative environments. Navigation that accounts for human interaction can be formulated as a partially observable Markov decision process (POMDP), which requires inferring the hidden beliefs of other agents. Inspired by Theory of Mind and epistemic planning, our architecture addresses the belief-tracking problem in partially observable environments and introduces a perspective-shifting operator for belief estimation that leverages influence-based abstraction (IBA) in a structured multi-agent setting.