This paper presents a mathematical model and method for resolving situations in reinforcement learning where an agent reaches an unknown state. For settings in which the agent reaches a state outside its aware domain, we propose the "episodic Markov decision process with growing awareness (EMDP-GA)" model. The EMDP-GA model expands the value function to the newly encountered state via the "noninformative value expansion (NIVE)" technique, which initializes the value of the new state with a noninformative belief, namely the average value over the known domain; this design reflects the absence of any prior knowledge about the value of that state. We then train the EMDP-GA model with Upper Confidence Bound Momentum Q-learning. We demonstrate that, despite encountering unknown states, the proposed approach achieves regret comparable to state-of-the-art (SOTA) methods, with computational and space complexity likewise comparable to SOTA.
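
A minimal sketch of the NIVE expansion step described above, assuming a tabular value function stored per state; the function name, dictionary representation, and default value for an empty domain are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def nive_expand(values: dict, new_state) -> dict:
    """Noninformative value expansion (NIVE) sketch: when the agent reaches a
    state outside its aware domain, initialize that state's value with a
    noninformative belief -- the average value over the currently known domain.
    (Illustrative assumption: values are kept in a state-keyed dictionary.)"""
    if new_state not in values:
        known = np.array(list(values.values()), dtype=float)
        # Fall back to 0.0 only if no states are known yet (assumed default).
        values[new_state] = float(known.mean()) if known.size else 0.0
    return values

# Usage sketch: the aware domain initially covers states "s0" and "s1".
V = {"s0": 1.0, "s1": 3.0}
V = nive_expand(V, "s2")  # "s2" is newly reached; initialized to 2.0
```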