This paper proposes a method for learning an effective reward function in real-world settings where reward signals are extremely sparse. The method performs reward learning using all transitions, including zero-reward transitions. Specifically, it combines semi-supervised learning (SSL) with a novel data augmentation technique to learn trajectory-space representations from zero-reward transitions, thereby improving the efficiency of reward learning. Experiments on Atari games and robot manipulation tasks show that the method outperforms supervised-learning-based methods in reward inference and improves agent scores. In particular, in environments where rewards are even scarcer, the method achieves best scores up to twice those of existing methods. The proposed double-entropy data augmentation technique contributes substantially, yielding best scores 15.8% higher than those of other augmentation methods.
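
The abstract does not spell out the loss structure, so the following is only a minimal illustrative sketch of one common way to combine supervised reward regression on the rare rewarded transitions with a pseudo-label consistency term on zero-reward transitions under data augmentation. Everything here is an assumption for illustration: the names (RewardNet, augment, semi_supervised_reward_loss), the Gaussian-noise augmentation, and the confidence masking are hypothetical and do not reproduce the paper's actual method or its double-entropy augmentation.

```python
# Illustrative sketch only; names, augmentation, and loss terms are assumptions,
# not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardNet(nn.Module):
    """Small MLP mapping a (state, action) transition to a scalar reward estimate."""
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def augment(obs, noise_scale=0.01):
    # Hypothetical augmentation: small Gaussian perturbation of observations.
    return obs + noise_scale * torch.randn_like(obs)

def semi_supervised_reward_loss(model, labeled, unlabeled, max_spread=0.05):
    """Supervised regression on the rare rewarded transitions plus a
    pseudo-label consistency term on zero-reward transitions, which are
    treated here as unlabeled data."""
    obs_l, act_l, rew_l = labeled
    sup_loss = F.mse_loss(model(obs_l, act_l), rew_l)

    obs_u, act_u = unlabeled
    with torch.no_grad():
        pseudo = model(obs_u, act_u)                      # pseudo-labels from the current model
        spread = (model(augment(obs_u), act_u) - pseudo).abs()
        mask = (spread < max_spread).float()              # keep only confident pseudo-labels
    pred_aug = model(augment(obs_u), act_u)               # prediction on an augmented view
    unsup_loss = (mask * (pred_aug - pseudo) ** 2).mean()

    return sup_loss + unsup_loss

# Usage with random tensors (shapes only, for illustration):
model = RewardNet(obs_dim=8, act_dim=2)
labeled = (torch.randn(4, 8), torch.randn(4, 2), torch.randn(4))
unlabeled = (torch.randn(32, 8), torch.randn(32, 2))
loss = semi_supervised_reward_loss(model, labeled, unlabeled)
loss.backward()
```

The design choice sketched here (confidence-masked pseudo-labels plus a consistency loss across augmented views) is one standard SSL recipe; the paper's specific augmentation and representation-learning objective may differ.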