This paper proposes Think-How-to-Think (TH2T), a novel two-stage fine-tuning strategy that addresses overthinking in Large Reasoning Models (LRMs). TH2T first injects difficulty awareness into the model so that it calibrates its reasoning depth to the task, then curbs excessive reasoning by identifying and removing redundant reasoning patterns in intermediate reasoning stages. The model is trained on a dataset that mixes short and long reasoning paths, and experiments on 7B, 14B, and 32B models show that TH2T maintains performance while reducing inference cost by over 70% on easy tasks and over 40% on hard tasks.
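A minimal sketch of how such a mixed short/long reasoning-path fine-tuning set with difficulty hints might be assembled. This is not the authors' code; the `Problem` structure, the "easy"/"hard" labels, and the `<easy>`/`<hard>` hint tokens are illustrative assumptions about one plausible data format.

```python
from dataclasses import dataclass
from typing import List, Dict

@dataclass
class Problem:
    question: str
    difficulty: str   # "easy" or "hard" (assumed labeling scheme)
    short_path: str   # concise reasoning trace
    long_path: str    # full chain-of-thought trace

def build_training_examples(problems: List[Problem]) -> List[Dict[str, str]]:
    """Pair easy problems with short reasoning paths and hard problems
    with long ones, prefixing each target with an explicit difficulty
    hint so the model can learn to calibrate its reasoning depth."""
    examples = []
    for p in problems:
        if p.difficulty == "easy":
            target = "<easy> " + p.short_path
        else:
            target = "<hard> " + p.long_path
        examples.append({"prompt": p.question, "completion": target})
    return examples

demo = [
    Problem("2 + 2 = ?", "easy", "4.",
            "Step 1: add 2 and 2. Step 2: the sum is 4."),
    Problem("Prove that sqrt(2) is irrational.", "hard",
            "It is irrational.",
            "Assume sqrt(2) = p/q in lowest terms ..."),
]
out = build_training_examples(demo)
print(out[0]["completion"])  # easy problem paired with its short path
print(out[1]["completion"])  # hard problem paired with its long path
```

In this sketch the difficulty hint is baked into the supervision target, so at inference time the model emits its own difficulty judgment before choosing how deeply to reason; the actual TH2T training recipe may differ in its hint format and mixing ratio.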