Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Towards Agentic OS: An LLM Agent Framework for Linux Schedulers

Created by
  • Haebom

Authors

Yusheng Zheng, Yanpeng Hu, Wei Zhang, Andi Quinn

Outline

SchedCP is the first framework to use Large Language Model (LLM) agents to optimize the performance of operating system schedulers. Existing schedulers do not understand application-specific requirements. To address this fundamental problem, the authors propose a decoupled control plane architecture that separates the AI's semantic reasoning (what to optimize) from system execution (how to observe and act). Implemented as a Model Context Protocol (MCP) server, SchedCP provides three main services: a workload analysis engine, an evolving scheduler policy repository, and an execution verifier that checks AI-generated code and configurations through static and dynamic analysis. A multi-agent system called sched-agent autonomously analyzes workloads, synthesizes customized eBPF scheduling policies, and deploys them via the sched_ext infrastructure. Evaluation results show that SchedCP achieves up to 1.79x performance gains and 13x cost reductions compared to existing approaches, while maintaining a high success rate. This makes expert-level system optimization available without a human expert and represents a step toward a self-optimizing, application-aware operating system.
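The separation described above can be illustrated with a minimal Python sketch. It is not the SchedCP implementation and does not use any MCP SDK. All class and function names (`WorkloadProfile`, `Policy`, `PolicyRepository`, `ExecutionVerifier`, `ControlPlane`, `propose`, `deploy`) are hypothetical, and the `source` field is a plain string standing in for eBPF code. The agent only decides what to try (`propose`); the control plane decides how a proposal is verified, deployed, and stored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorkloadProfile:
    """Hypothetical output of a workload analysis step."""
    name: str
    summary: str


@dataclass
class Policy:
    """Hypothetical policy record; `source` stands in for eBPF scheduler code."""
    name: str
    source: str
    tags: List[str] = field(default_factory=list)


class PolicyRepository:
    """Keeps verified policies so later requests can reuse them."""

    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}

    def add(self, policy: Policy) -> None:
        self._policies[policy.name] = policy

    def search(self, tag: str) -> List[Policy]:
        return [p for p in self._policies.values() if tag in p.tags]


class ExecutionVerifier:
    """Runs all static checks, then one dynamic check, before approval."""

    def __init__(
        self,
        static_checks: List[Callable[[Policy], bool]],
        dynamic_check: Callable[[Policy], bool],
    ) -> None:
        self._static_checks = static_checks
        self._dynamic_check = dynamic_check

    def verify(self, policy: Policy) -> bool:
        if not all(check(policy) for check in self._static_checks):
            return False  # skip the dynamic check if static analysis fails
        return bool(self._dynamic_check(policy))


class ControlPlane:
    """Owns the 'how': verification, deployment, and storage."""

    def __init__(
        self,
        repository: PolicyRepository,
        verifier: ExecutionVerifier,
        deploy: Callable[[Policy], None],
    ) -> None:
        self.repository = repository
        self.verifier = verifier
        self.deploy = deploy

    def optimize(
        self,
        profile: WorkloadProfile,
        propose: Callable[[WorkloadProfile], Policy],
    ) -> bool:
        # The agent supplies the 'what' through `propose`.
        policy = propose(profile)
        if not self.verifier.verify(policy):
            return False  # unverified policies are never deployed
        self.deploy(policy)
        self.repository.add(policy)
        return True
```

In this sketch the agent never calls `deploy` directly, so a rejected proposal cannot reach the system. That is the property the decoupled design is meant to provide; a real verifier would do far more than the placeholder checks shown here.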

Takeaways, Limitations

Takeaways:
Presents a novel method for automatically optimizing the operating system scheduler using LLM agents.
Enables safe and efficient optimization through an architecture that separates semantic reasoning from execution.
Demonstrates performance improvements and cost reductions compared to existing approaches.
Makes expert-level system optimization more broadly available.
Suggests that a self-optimizing operating system is feasible.
Improves accessibility by releasing the work as open source.
Limitations:
Further research is needed on the stability and reliability of LLM agents.
Generalizability across diverse workloads and system environments still needs to be verified.
The implementation relies on eBPF, so it cannot be applied to systems that do not support eBPF.
Additional verification of the stability and scalability of the MCP server is required.
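
The eBPF limitation above can be checked before any deployment is attempted. The sketch below assumes the kernel exposes the sched_ext status at `/sys/kernel/sched_ext/state`, as described in the Linux kernel's sched_ext documentation. Treat that path as an assumption to confirm on the target system. The helper name `sched_ext_state` is hypothetical and is not part of SchedCP.

```python
from pathlib import Path
from typing import Optional


def sched_ext_state(sysfs_dir: str = "/sys/kernel/sched_ext") -> Optional[str]:
    """Return the contents of the sched_ext state file, or None if unavailable.

    The default path is an assumption based on the kernel's sched_ext docs.
    """
    state_file = Path(sysfs_dir) / "state"
    try:
        return state_file.read_text().strip()
    except OSError:
        # Missing file or directory: the kernel likely lacks sched_ext support.
        return None


if __name__ == "__main__":
    state = sched_ext_state()
    if state is None:
        print("sched_ext does not appear to be available on this kernel")
    else:
        print(f"sched_ext state: {state}")
```

A `None` result suggests that policies deployed through sched_ext cannot be loaded on that machine.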