This paper proposes Contrastive learning with annotated CoT-based Reinforced Fine-Tuning (\TheName{}), a novel reinforcement-learning-based fine-tuning method for improving the reasoning ability of large language models (LLMs). Existing RL-based methods suffer from unstable reasoning-path sampling and neglect annotated chains of thought (CoTs), while existing SFT approaches over-rely on the annotated CoTs. To address these problems, we learn a representation for each CoT and design novel contrastive signals to guide the fine-tuning process. \TheName{} fully exploits the annotated CoTs while incorporating unsupervised learning signals to stabilize fine-tuning. Experiments with three baseline methods, two base models, and two datasets demonstrate the significant advantages of \TheName{} in robustness, performance (up to 10.15\% improvement), and efficiency (up to 30.62\% improvement).