Klear-Reasoner is a model with long reasoning capabilities that exhibits careful deliberation during problem solving and achieves outstanding performance across multiple benchmarks. Although many high-performing reasoning models have been released, reproducing their results remains difficult because key training details are not fully disclosed. This paper analyzes the entire training process, covering data preparation, long Chain-of-Thought supervised fine-tuning (long CoT SFT), and reinforcement learning (RL). Our experiments on SFT data show that a small number of high-quality data sources is more effective than a large number of diverse sources, and that using challenging samples without accuracy filtering yields better results. Furthermore, to address two key issues with existing RL clipping mechanisms, namely that clipping suppresses critical exploration signals and ignores suboptimal trajectories, we propose Gradient-Preserving Clipping Policy Optimization (GPPO). GPPO gently backpropagates gradients from clipped tokens, which enhances the model's exploration ability and improves its learning from negative samples. Klear-Reasoner exhibits excellent reasoning ability in mathematics and programming, scoring 90.5% on AIME 2024, 83.2% on AIME 2025, 66.0% on LiveCodeBench V5, and 58.1% on LiveCodeBench V6.
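To make the idea of gradient-preserving clipping concrete, the sketch below illustrates one plausible way such a surrogate loss could be implemented in PyTorch; it is an illustrative assumption based only on the description above, not the paper's exact formulation. Standard PPO-style clipping zeroes the gradient of any token whose importance ratio exceeds the clip range, whereas here the forward value is still bounded at the clip limit while a (bounded) gradient path is kept through the ratio via a detach trick. The function name `gppo_style_loss` and the hyperparameters `eps_low`/`eps_high` are hypothetical.

```python
import torch

def gppo_style_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Hypothetical sketch of a gradient-preserving clipped surrogate loss.

    Vanilla clipping detaches clipped tokens from the gradient entirely.
    Here the forward value is still clamped to [1 - eps_low, 1 + eps_high],
    but a gradient path is preserved so clipped tokens continue to
    contribute a bounded learning signal.
    """
    ratio = torch.exp(logp_new - logp_old)                 # importance ratio r_t
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)

    # Forward value equals the clipped ratio; backward still flows through
    # `ratio`, scaled by the (detached) clip factor, instead of being zeroed.
    soft_clipped = clipped.detach() * ratio / ratio.detach()

    surrogate = torch.minimum(ratio * advantages, soft_clipped * advantages)
    return -surrogate.mean()
```

Under this reading, tokens that would otherwise be silently dropped by hard clipping (including those on negative-advantage trajectories) still receive a controlled gradient, which is consistent with the stated goals of preserving exploration signals and learning from negative samples.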