Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy

Created by
  • Haebom

Authors

Ya Shen, Gang Chen, Hui Ma, Mengjie Zhang

Outline

This paper addresses the Cost-Aware Dynamic Workflow Scheduling (CADWS) problem: efficiently scheduling workflows that arrive dynamically in a cloud computing environment. The key challenge is designing an effective scheduling policy that assigns the tasks of each workflow, represented as a Directed Acyclic Graph (DAG), to appropriate virtual machines (VMs). Existing Deep Reinforcement Learning (DRL)-based methods are limited by their heavy reliance on problem-specific policy network design, hyperparameters, and reward feedback. The paper proposes GATES, a novel DRL method that combines a policy network based on graph attention networks (GATs) with an evolution strategy. GATES learns the topological relationships between tasks within a DAG to capture the impact of the current scheduling decision on subsequent tasks, and it assesses the importance of each VM in order to adapt to dynamically changing VM resources. It also leverages the robustness and exploratory power of the evolution strategy, as well as its tolerance for delayed rewards, to achieve stable policy learning. Experimental results show that GATES outperforms existing state-of-the-art algorithms. The source code is available on GitHub.
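
To make the graph-attention idea more concrete, here is a minimal Python sketch of one attention step over a task's successors in a DAG. It is not the authors' implementation. The scalar task feature (a workload value), the fixed scoring weights `a_self` and `a_succ`, and the names `attend_successors`, `workload`, and `successors` are all assumptions made for illustration; an actual graph attention layer learns its weights and operates on feature vectors.

```python
import math

def leaky_relu(x, slope=0.2):
    # Nonlinearity commonly applied to raw attention scores in graph attention layers.
    return x if x > 0 else slope * x

def softmax(scores):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_successors(features, successors, node, a_self=1.0, a_succ=1.0):
    """Return (embedding, weights) for `node`, attending over its direct successors.

    a_self and a_succ are hypothetical fixed weights standing in for learned ones.
    """
    succ = successors.get(node, [])
    if not succ:
        # An exit task has no successors, so its embedding is its own feature.
        return features[node], []
    scores = [leaky_relu(a_self * features[node] + a_succ * features[s]) for s in succ]
    weights = softmax(scores)
    # The embedding mixes the task's own feature with the weighted successor features.
    embedding = features[node] + sum(w * features[s] for w, s in zip(weights, succ))
    return embedding, weights

# Hypothetical four-task workflow: A precedes B and C, which both precede D.
workload = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 1.0}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
embedding_a, weights_a = attend_successors(workload, successors, "A")
```

In this toy setup the heavier successor C receives a larger attention weight than B, so the embedding of A reflects how much downstream work depends on scheduling A. This is one simple way to read the summary's claim that the policy captures the impact of the current task on subsequent tasks.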

Takeaways, Limitations

Takeaways:
Shows that accounting for the topological relationships between tasks in a DAG enables more efficient workflow scheduling.
Proposes a scheduling strategy that adapts to dynamically changing VM resources.
Improves the stability and performance of DRL by leveraging an evolution strategy.
Experimentally demonstrates better performance than existing state-of-the-art algorithms.
Limitations:
Further research is needed to evaluate the scalability and generalization performance of the proposed GATES algorithm in real cloud environments.
The method may be biased toward certain types of workflows or VM resource environments.
The computational cost of the evolution strategy can be relatively high.
Additional evaluation of generalization performance for workflows of varying size and complexity is needed.
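
The computational-cost point above can be illustrated with a minimal Python sketch of a generic evolution strategy loop. It assumes a simple Gaussian-perturbation variant with standardized rewards, which may differ from the variant used in the paper. The function names `evolve` and `toy_fitness`, the target vector, and all hyperparameter values are hypothetical; `toy_fitness` merely stands in for the reward of a full scheduling episode (for example, negative total cost).

```python
import random

def evolve(fitness, theta, generations=200, population=20, sigma=0.1, lr=0.01, seed=0):
    """Gaussian-perturbation evolution strategy; returns (parameters, fitness evaluations)."""
    rng = random.Random(seed)
    theta = list(theta)
    evaluations = 0
    for _ in range(generations):
        # Sample one noise vector per population member and score each perturbed policy.
        noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(population)]
        rewards = [fitness([t + sigma * n for t, n in zip(theta, noise)]) for noise in noises]
        evaluations += population
        # Standardize rewards so the update does not depend on the reward scale.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5 or 1.0
        advantages = [(r - mean) / std for r in rewards]
        # Move the parameters toward perturbations that scored above average.
        theta = [
            t + lr / (population * sigma) * sum(a * noise[i] for a, noise in zip(advantages, noises))
            for i, t in enumerate(theta)
        ]
    return theta, evaluations

# Toy stand-in for an episode reward: higher is better, with the maximum at `target`.
target = [0.5, -0.3]

def toy_fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))

start = [0.0, 0.0]
best, evaluations = evolve(toy_fitness, start)
```

Each generation requires `population` full fitness evaluations, so the total number of evaluations is `generations * population`. When every evaluation is a simulated scheduling episode rather than a cheap formula, this product is where the relatively high computational cost comes from. On the other hand, only the final episode reward is needed, which is consistent with the tolerance for delayed rewards mentioned in the outline.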