This paper addresses the Cost-Aware Dynamic Workflow Scheduling (CADWS) problem, which requires efficiently scheduling dynamically arriving workflow tasks in a cloud computing environment. A key challenge is designing an effective scheduling policy that assigns tasks, represented as a Directed Acyclic Graph (DAG), to appropriate virtual machines (VMs). Existing Deep Reinforcement Learning (DRL)-based methods are limited by their heavy reliance on problem-specific policy network designs, hyperparameters, and reward feedback. In this paper, we propose GATES, a novel DRL method that combines a policy network based on graph attention networks (GATs) with an evolution strategy (ES). GATES learns the topological relationships among tasks within a DAG to capture how scheduling the current task affects subsequent tasks, assesses the importance of each VM to adapt to dynamically changing VM resources, and exploits the robustness, exploratory power, and tolerance for delayed rewards of the evolution strategy to achieve stable policy learning. Experimental results demonstrate that GATES outperforms existing state-of-the-art algorithms. The source code is available on GitHub.
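The combination described above, a graph-attention policy whose parameters are trained by an evolution strategy rather than gradient-based RL, can be illustrated with a minimal sketch. Everything here is a simplified stand-in, not the GATES implementation: the toy DAG, the per-task features, and the reward (a fixed target embedding instead of the negative workflow execution cost) are all hypothetical, and a single attention head over each task's predecessors stands in for the full policy network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy DAG of 5 tasks; adj[i, j] = True means task j depends on task i.
adj = np.zeros((5, 5), dtype=bool)
adj[0, 2] = adj[1, 2] = adj[2, 3] = adj[2, 4] = True
feats = rng.normal(size=(5, 4))          # hypothetical per-task features

D_IN, D_OUT = 4, 3
N_PARAMS = D_IN * D_OUT + 2 * D_OUT      # weight matrix W plus attention vector a

def unpack(theta):
    W = theta[:D_IN * D_OUT].reshape(D_IN, D_OUT)
    a = theta[D_IN * D_OUT:]
    return W, a

def gat_embed(theta):
    """One graph-attention layer: each task attends over its predecessors."""
    W, a = unpack(theta)
    h = feats @ W
    out = h.copy()
    for j in range(len(h)):
        preds = np.where(adj[:, j])[0]
        if len(preds) == 0:
            continue                      # source tasks keep their own embedding
        logits = np.array(
            [np.tanh(np.concatenate([h[j], h[p]]) @ a) for p in preds]
        )
        w = np.exp(logits - logits.max()) # softmax attention over predecessors
        w /= w.sum()
        out[j] = (w[:, None] * h[preds]).sum(axis=0)
    return out

# Stand-in reward: negative distance to an arbitrary target embedding.
# (In the paper's setting this would be the negative scheduling cost.)
target = rng.normal(size=(5, D_OUT))
def reward(theta):
    return -np.linalg.norm(gat_embed(theta) - target)

# Basic evolution strategy: perturb parameters, evaluate the scalar episode
# reward, and step along the reward-weighted average of the noise. Only the
# final return matters, which is why ES tolerates delayed rewards.
theta = rng.normal(size=N_PARAMS) * 0.1
sigma, lr, npop = 0.1, 0.03, 50
r0 = reward(theta)
for _ in range(200):
    eps = rng.normal(size=(npop, N_PARAMS))
    rs = np.array([reward(theta + sigma * e) for e in eps])
    rs = (rs - rs.mean()) / (rs.std() + 1e-8)  # normalize returns
    theta += lr / (npop * sigma) * eps.T @ rs
```

Note that the ES update never backpropagates through the attention layer; it only needs the scalar reward of each perturbed policy, which is what makes the approach robust to sparse or delayed feedback.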