To address the high energy cost of large language model (LLM) inference, this paper presents VoltanaLLM, an energy-efficient LLM serving system that is aware of service-level objectives (SLOs). VoltanaLLM co-designs frequency scaling and request routing in a novel prefill/decode-decoupled architecture from a control-theoretic perspective. A feedback-based frequency controller dynamically adjusts GPU frequencies for the prefill and decode phases, while a state-space router explores inter-instance routing decisions to minimize energy consumption under latency constraints. Implemented in SGLang and evaluated on several state-of-the-art LLMs and real-world datasets, VoltanaLLM achieves up to 36.3% energy savings while maintaining near-perfect SLO attainment. The source code is available on GitHub.
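
To give a rough sense of the two mechanisms the abstract names, the sketch below pairs a hysteresis-style feedback frequency controller with a greedy latency-constrained router. It is a minimal, self-contained approximation under stated assumptions, not VoltanaLLM's actual implementation: the frequency levels, the queue-based latency model, and all names (`FeedbackFrequencyController`, `route`, `Instance`) are hypothetical placeholders for illustration. A real deployment would apply the chosen frequency through a driver interface such as NVML's locked-clocks API rather than merely returning it.

```python
"""Illustrative sketch only: a feedback frequency controller plus a greedy
latency-constrained router, in the spirit of VoltanaLLM's described design.
All classes, constants, and numbers here are hypothetical assumptions."""

from dataclasses import dataclass

# Hypothetical discrete GPU frequency levels (MHz), lowest to highest.
FREQ_LEVELS_MHZ = [960, 1110, 1260, 1410]


class FeedbackFrequencyController:
    """Steps the GPU frequency up when the latency SLO is violated and
    cautiously down when there is headroom (simple hysteresis feedback)."""

    def __init__(self, slo_ms: float, headroom: float = 0.8):
        self.slo_ms = slo_ms
        self.headroom = headroom  # fraction of SLO below which we downclock
        self.level = len(FREQ_LEVELS_MHZ) - 1  # start at the max frequency

    def update(self, measured_latency_ms: float) -> int:
        """Return the frequency (MHz) to apply for the next control interval."""
        if measured_latency_ms > self.slo_ms:
            # SLO violated: raise frequency to cut latency.
            self.level = min(self.level + 1, len(FREQ_LEVELS_MHZ) - 1)
        elif measured_latency_ms < self.headroom * self.slo_ms:
            # Ample headroom: lower frequency to save energy.
            self.level = max(self.level - 1, 0)
        return FREQ_LEVELS_MHZ[self.level]


@dataclass
class Instance:
    name: str
    queue_len: int          # outstanding requests on this instance
    per_request_ms: float   # rough latency cost per queued request
    power_w: float          # rough average power draw at current frequency


def route(instances: list[Instance], slo_ms: float) -> Instance:
    """Greedy stand-in for a state-space router: among instances whose
    predicted latency stays within the SLO, pick the one whose predicted
    energy (power x predicted latency) is lowest."""
    def predicted_latency(inst: Instance) -> float:
        return (inst.queue_len + 1) * inst.per_request_ms

    feasible = [i for i in instances if predicted_latency(i) <= slo_ms]
    pool = feasible or instances  # fall back if no instance can meet the SLO
    return min(pool, key=lambda i: i.power_w * predicted_latency(i) / 1000.0)


if __name__ == "__main__":
    # Drive the controller with a made-up latency trace.
    ctrl = FeedbackFrequencyController(slo_ms=200.0)
    for latency in [150.0, 120.0, 90.0, 230.0, 210.0, 160.0]:
        print(f"measured {latency:6.1f} ms -> set {ctrl.update(latency)} MHz")

    # Route one request across two hypothetical decode instances.
    decode_pool = [
        Instance("decode-0", queue_len=4, per_request_ms=30.0, power_w=300.0),
        Instance("decode-1", queue_len=1, per_request_ms=35.0, power_w=250.0),
    ]
    print("route to:", route(decode_pool, slo_ms=200.0).name)
```

In this toy setup the router sends the request to `decode-1`: both instances are predicted to meet the 200 ms SLO, so the tie is broken by estimated energy, which mirrors the abstract's goal of minimizing energy subject to latency constraints; the paper's actual controller and router are more sophisticated than this sketch.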