In this paper, we propose ConciseHint, a framework that induces concise reasoning during generation to address the inefficiency caused by the excessively verbose reasoning processes of large reasoning models (LRMs). ConciseHint continuously injects textual hints, which are either manually designed or learned from concise data, into the token generation process to steer the model toward concise reasoning. It also adapts the hint intensity to the complexity of the query to prevent performance degradation. Experiments on state-of-the-art LRMs, including the DeepSeek-R1 and Qwen-3 series, demonstrate that ConciseHint effectively shortens the reasoning process while preserving performance. For example, with the Qwen-3 4B model, it reduces reasoning length by 65% on the GSM8K benchmark while maintaining nearly the same accuracy.
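
To make the mechanism concrete, the sketch below illustrates the core idea of continuous hint injection during decoding. It is a minimal, hypothetical rendering, not the paper's implementation: `ToyTokenizer` and `ToyModel` stand in for a real LRM stack, and the `hint_interval` heuristic is only an illustrative guess at how hint intensity might be adapted to query complexity.

```python
import random

class ToyTokenizer:
    """Stand-in tokenizer; a real system would use the LRM's tokenizer."""
    eos_id = 0
    def encode(self, text):
        return [ord(c) for c in text]
    def decode(self, ids):
        return "".join(chr(i) for i in ids)

class ToyModel:
    """Stand-in model; a real LRM would return the next token via a forward pass."""
    def next_token(self, ids):
        return random.choice(ids + [ToyTokenizer.eos_id])

def hint_interval(query, base=32):
    # Illustrative assumption: longer (presumably harder) queries get the hint
    # injected less often, i.e. a weaker hint, to avoid hurting accuracy.
    complexity = min(len(query.split()) / 64.0, 1.0)
    return int(base * (1.0 + complexity))

def generate_with_hints(model, tok, query, hint=" Be concise.", max_new=256):
    ids, hint_ids = tok.encode(query), tok.encode(hint)
    interval = hint_interval(query)
    out = []
    for step in range(1, max_new + 1):
        # Periodically splice the hint into the context so the model is
        # continually nudged toward shorter reasoning as it generates.
        if step % interval == 0:
            ids += hint_ids
        t = model.next_token(ids)
        if t == ToyTokenizer.eos_id:
            break
        ids.append(t)
        out.append(t)
    return tok.decode(out)

print(generate_with_hints(ToyModel(), ToyTokenizer(), "What is 2 + 2?"))
```

In this sketch, the hint is appended to the growing context at a fixed cadence rather than only in the initial prompt, which mirrors the "continuous injection" idea; the cadence itself is the knob corresponding to hint strength.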