Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Murakkab: Resource-Efficient Agentic Workflow Orchestration in Cloud Platforms

Created by
  • Haebom

Authors

Gohar Irfan Chaudhry, Esha Choukse, Haoran Qiu, Íñigo Goiri, Rodrigo Fonseca, Adam Belay, Ricardo Bianchini

Outline

Murakkab is a resource-efficient serving system for agentic workflows. Existing frameworks tightly couple agent logic with model and hardware selection and expose workflows as opaque sequences of model and tool calls, which leads to inefficient resource use. Murakkab introduces a declarative abstraction that separates the workflow specification from the execution configuration. A profile-driven optimizer and an adaptive runtime manage the entire stack: they orchestrate workflow components, map them to models and hardware, and dynamically reconfigure them to meet user-defined service level objectives (SLOs). By exposing the internal structure of agentic workflows, Murakkab enables cross-layer optimizations that existing frameworks and cloud schedulers cannot achieve. Evaluations on a range of workflows show that Murakkab reduces GPU usage by up to 2.8x, energy consumption by 3.7x, and cost by 4.3x, all while maintaining SLOs.
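To make the idea of separating the workflow specification from the execution configuration more concrete, here is a minimal sketch in Python. It is not Murakkab's actual API or optimizer: the names `Step`, `Profile`, and `plan`, the model and hardware labels, and all latency and cost numbers are hypothetical, and the exhaustive search over profiles merely stands in for whatever optimization the paper uses. The workflow lists only logical steps, while a separate planner picks a model and hardware for each step so that the total latency stays within an SLO at the lowest cost.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Step:
    """A logical workflow step; it names no model or hardware (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class Profile:
    """A profiled way to run one step (all values below are made up)."""
    step: str
    model: str
    hardware: str
    latency_s: float
    cost_usd: float


def plan(workflow, profiles, slo_latency_s):
    """Return the cheapest per-step assignment whose summed latency meets the SLO.

    Brute-force search over all combinations; returns None if none is feasible.
    """
    options = [[p for p in profiles if p.step == s.name] for s in workflow]
    best = None
    for combo in product(*options):
        latency = sum(p.latency_s for p in combo)
        cost = sum(p.cost_usd for p in combo)
        if latency <= slo_latency_s and (best is None or cost < best[0]):
            best = (cost, combo)
    return None if best is None else list(best[1])


# Declarative workflow: what to do, not how to run it.
WORKFLOW = [Step("transcribe"), Step("summarize")]

# Hypothetical profiling data, kept apart from the workflow itself.
PROFILES = [
    Profile("transcribe", "small-asr", "cpu", 4.0, 0.01),
    Profile("transcribe", "large-asr", "gpu", 1.0, 0.05),
    Profile("summarize", "small-llm", "gpu", 2.0, 0.02),
    Profile("summarize", "large-llm", "gpu", 5.0, 0.10),
]

if __name__ == "__main__":
    for slo in (6.0, 4.0, 2.0):
        chosen = plan(WORKFLOW, PROFILES, slo)
        summary = None if chosen is None else [(p.model, p.hardware) for p in chosen]
        print(f"SLO {slo}s -> {summary}")
```

In this toy setup, tightening the SLO from 6 to 4 seconds moves transcription from the CPU profile to the GPU profile without touching the workflow definition, which is the kind of reconfiguration the declarative split is meant to allow.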

Takeaways, Limitations

Takeaways:
Murakkab can significantly improve the resource efficiency of agentic workflows (reducing GPU usage, energy consumption, and cost).
Declarative abstraction separates workflow specifications from execution configurations, increasing flexibility and manageability.
Cross-layer optimization enables more efficient resource management than existing systems.
The evaluation shows that resource usage can be reduced while still satisfying SLOs.
Limitations:
Murakkab's performance improvements may vary depending on specific workflows and hardware environments.
Further research is needed to explore generalizability across different types of agent workflows.
The complexity of declarative abstractions can be challenging for certain users.
Long-term stability and scalability in actual operating environments require verification.