Motivated by the rising inference cost of large language models (LLMs), we propose directed stochastic skill search (DS3), a novel framework that represents inference as a probabilistic search over a learned skill graph. DS3 yields analytical expressions for task success rate and computational cost under a range of inference strategies, including chain-of-thought (CoT) and tree of thoughts (ToT), enabling comparative analysis as a function of task difficulty and model capability. By extending a tripartite graph framework for LLM training to incorporate inference, and by connecting DS3 to empirical methods that characterize LLM scaling behavior, we theoretically reproduce experimentally observed patterns: linear accuracy scaling with logarithmic compute; variation of the optimal inference strategy with task difficulty and model capability; emergent behavior at inference time even when performance plateaus under parameter scaling; and best-of-N (BoN) and majority-voting behaviors captured within the same unified analysis. By explicitly characterizing training-inference interdependencies, the framework deepens theoretical understanding and supports principled algorithm design and resource allocation.
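To make the flavor of such closed-form strategy comparisons concrete, the sketch below is a minimal toy model, not the paper's actual DS3 formulation: it assumes a task decomposes into `depth` skills that must be applied in sequence, each succeeding i.i.d. with probability `p` (a crude stand-in for model capability), and all function names, the ToT-like recursion, and the full-tree cost bound are our own simplifying assumptions.

```python
def cot_success(p: float, depth: int) -> float:
    """Single sequential chain: every step must apply the right skill."""
    return p ** depth

def cot_cost(depth: int) -> int:
    """One skill application per step."""
    return depth

def bon_success(p: float, depth: int, n: int) -> float:
    """Best-of-N: N independent chains; succeed if any one succeeds."""
    return 1.0 - (1.0 - p ** depth) ** n

def bon_cost(depth: int, n: int) -> int:
    return n * depth

def tot_success(p: float, depth: int, branch: int) -> float:
    """ToT-like search: expand `branch` children per surviving node,
    prune failures; succeed if some root-to-leaf path applies the
    right skill at every level.
    Recursion: q_d = 1 - (1 - p * q_{d-1})^branch, with q_0 = 1."""
    q = 1.0
    for _ in range(depth):
        q = 1.0 - (1.0 - p * q) ** branch
    return q

def tot_cost(depth: int, branch: int) -> int:
    """Upper bound: cost of expanding the full b-ary tree (no pruning)."""
    return sum(branch ** i for i in range(1, depth + 1))

if __name__ == "__main__":
    depth = 6                      # task difficulty: skills to chain
    for p in (0.6, 0.8, 0.95):     # model capability: per-skill success
        print(f"p = {p}:")
        for name, succ, cost in [
            ("CoT",       cot_success(p, depth),    cot_cost(depth)),
            ("BoN (N=8)", bon_success(p, depth, 8), bon_cost(depth, 8)),
            ("ToT (b=2)", tot_success(p, depth, 2), tot_cost(depth, 2)),
        ]:
            print(f"  {name:10s} success={succ:.3f}  cost={cost}")
```

Even this toy version illustrates the qualitative claim above: as `p` rises, a single sequential chain approaches the success of the branching strategies at a fraction of the compute, whereas at low per-skill reliability the extra cost of BoN- or ToT-style search buys a large accuracy gain, so the cost-optimal strategy shifts with task difficulty and model capability.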