This paper proposes Decomposition of Experts (DoE), a novel framework for reducing the inference cost of large language models (LLMs). DoE defines the neurons that play a crucial role in a specific task as "experts" and dynamically identifies and activates these experts for each task to accelerate inference. Upon receiving a user request, DoE identifies the experts for that task, performs inference using only those experts, and restores the original model once the task is complete. Through this four-step process, DoE achieves up to a 1.73x inference speedup and a 65% parameter reduction while maintaining accuracy. We validate the effectiveness of DoE and the importance of its components through comparisons with alternative expert-identification methods and through ablation studies. We also analyze how batch size, token count, and layer type affect inference speed. DoE is a practical and highly scalable framework applicable to Transformer-based architectures.
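The sketch below illustrates, on a toy feed-forward block, the kind of request-time workflow the abstract describes: identify task-relevant neurons as experts, run inference with only those neurons, then revert to the full model. All names (ToyFFN, identify_experts) and the activation-based selection criterion are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a DoE-style workflow on a toy feed-forward layer.
# Hypothetical helpers only; the expert-identification criterion here
# (mean activation on a small task batch) is an assumption for illustration.
import numpy as np

class ToyFFN:
    """A single feed-forward block whose hidden units stand in for 'neurons'."""
    def __init__(self, d_model=8, d_hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.standard_normal((d_model, d_hidden))
        self.W_out = rng.standard_normal((d_hidden, d_model))

    def forward(self, x, expert_idx=None):
        # Step 3: inference; if expert_idx is given, use only those hidden
        # units (the "experts") and skip the remaining parameters.
        h = np.maximum(x @ self.W_in, 0.0)  # ReLU hidden activations
        if expert_idx is not None:
            return h[:, expert_idx] @ self.W_out[expert_idx, :]
        return h @ self.W_out

def identify_experts(ffn, task_inputs, keep_ratio=0.35):
    # Step 2 (hypothetical criterion): rank hidden units by mean activation
    # on a few task examples and keep the top fraction as that task's experts.
    h = np.maximum(task_inputs @ ffn.W_in, 0.0)
    scores = h.mean(axis=0)
    k = max(1, int(keep_ratio * scores.size))
    return np.argsort(scores)[-k:]

# Step 1: a task-specific request arrives (here, a toy batch for that task).
task_inputs = np.random.default_rng(1).standard_normal((4, 8))
ffn = ToyFFN()

experts = identify_experts(ffn, task_inputs)               # Step 2: identify experts
expert_out = ffn.forward(task_inputs, expert_idx=experts)  # Step 3: expert-only inference
full_out = ffn.forward(task_inputs)                        # Step 4: revert to the full model

print("experts kept:", experts.size, "/", ffn.W_in.shape[1])
print("max deviation from full model:", np.abs(expert_out - full_out).max())
```

In this toy setting the deviation from the full model is nonzero; the abstract's claim is that, with a suitable expert-identification method, the expert-only pass preserves task accuracy while activating far fewer parameters.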