{"title":"Learning Workflow Scheduling on Multi-Resource Clusters","authors":"Yang Hu, C. D. Laat, Zhiming Zhao","doi":"10.1109/NAS.2019.8834720","DOIUrl":null,"url":null,"abstract":"Workflow scheduling is one of the key issues in the management of workflow execution. Typically, a workflow application can be modeled as a Directed-Acyclic Graph (DAG). In this paper, we present GoDAG, an approach that can learn to well schedule workflows on multi-resource clusters. GoDAG directly learns the scheduling policy from experience through deep reinforcement learning. In order to adapt deep reinforcement learning methods, we propose a novel state representation, a practical action space and a corresponding reward definition for workflow scheduling problem. We implement a GoDAG prototype and a simulator to simulate task running on multi-resource clusters. In the evaluation, we compare the GoDAG with three state-of-the-art heuristics. The results show that GoDAG outperforms the baseline heuristics, leading to less average makespan to different workflow structures.","PeriodicalId":230796,"journal":{"name":"2019 IEEE International Conference on Networking, Architecture and Storage (NAS)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Conference on Networking, Architecture and Storage (NAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NAS.2019.8834720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 9
Abstract
Workflow scheduling is one of the key issues in the management of workflow execution. Typically, a workflow application can be modeled as a Directed Acyclic Graph (DAG). In this paper, we present GoDAG, an approach that learns to schedule workflows well on multi-resource clusters. GoDAG learns the scheduling policy directly from experience through deep reinforcement learning. To adapt deep reinforcement learning methods, we propose a novel state representation, a practical action space, and a corresponding reward definition for the workflow scheduling problem. We implement a GoDAG prototype and a simulator that simulates task execution on multi-resource clusters. In the evaluation, we compare GoDAG with three state-of-the-art heuristics. The results show that GoDAG outperforms the baseline heuristics, achieving lower average makespan across different workflow structures.
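To make the state/action/reward framing in the abstract concrete, the sketch below shows one minimal way to pose multi-resource DAG scheduling as a reinforcement-learning environment: the state combines free resource capacity with the set of ready tasks, an action either starts a ready task or lets time advance, and a reward of -1 per elapsed time step means that maximizing return minimizes makespan. This is an illustrative assumption-laden sketch, not the paper's actual GoDAG design; the task shapes, resource units, and the random rollout policy are all invented for the example.

```python
# Minimal multi-resource DAG scheduling environment (illustrative sketch only;
# not the paper's GoDAG implementation). State = free resources + ready tasks,
# action = start a ready task or advance time, reward = -1 per time step.
import random
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration: int                              # time steps the task runs
    demand: tuple                              # (cpu, mem) units while running
    deps: list = field(default_factory=list)   # names of predecessor tasks


class DagSchedulingEnv:
    def __init__(self, tasks, capacity=(4, 8)):
        self.tasks = {t.name: t for t in tasks}
        self.capacity = capacity
        self.reset()

    def reset(self):
        self.time = 0
        self.done = set()      # finished task names
        self.running = {}      # task name -> remaining duration
        self.free = list(self.capacity)
        return self._state()

    def _ready(self):
        # Tasks whose dependencies are finished and whose demand fits right now.
        return [n for n, t in self.tasks.items()
                if n not in self.done and n not in self.running
                and all(d in self.done for d in t.deps)
                and all(r <= f for r, f in zip(t.demand, self.free))]

    def _state(self):
        # Flat state vector: free resource capacities plus ready-task indicators.
        ready = set(self._ready())
        return tuple(self.free) + tuple(int(n in ready) for n in self.tasks)

    def step(self, action):
        """action: name of a ready task to start, or None to advance time."""
        if action is not None:
            t = self.tasks[action]
            self.running[action] = t.duration
            self.free = [f - r for f, r in zip(self.free, t.demand)]
            return self._state(), 0.0, False   # starting a task costs no time
        # Advance the clock one step; -1 reward pushes toward short makespan.
        self.time += 1
        for name in list(self.running):
            self.running[name] -= 1
            if self.running[name] == 0:
                del self.running[name]
                self.done.add(name)
                self.free = [f + r for f, r in
                             zip(self.free, self.tasks[name].demand)]
        finished = len(self.done) == len(self.tasks)
        return self._state(), -1.0, finished


# Tiny diamond-shaped workflow and a random-policy rollout as a usage example.
workflow = [
    Task("a", 2, (2, 2)),
    Task("b", 3, (1, 4), deps=["a"]),
    Task("c", 2, (2, 2), deps=["a"]),
    Task("d", 1, (1, 1), deps=["b", "c"]),
]
env = DagSchedulingEnv(workflow)
state, total_reward, finished = env.reset(), 0.0, False
while not finished:
    ready = env._ready()
    action = random.choice(ready) if ready else None
    state, reward, finished = env.step(action)
    total_reward += reward
print("makespan:", env.time, "return:", total_reward)
```

In a learned scheduler, the random choice above would be replaced by a policy network that maps the state vector to a distribution over ready tasks (plus a wait action) and is trained with a deep reinforcement learning algorithm; the environment interface itself stays the same.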