{"title":"A Deep Reinforcement Learning Approach for Cost Optimized Workflow Scheduling in Cloud Computing Environments","authors":"Amanda Jayanetti, Saman Halgamuge, Rajkumar Buyya","doi":"arxiv-2408.02926","DOIUrl":null,"url":null,"abstract":"Cost optimization is a common goal of workflow schedulers operating in cloud\ncomputing environments. The use of spot instances is a potential means of\nachieving this goal, as they are offered by cloud providers at discounted\nprices compared to their on-demand counterparts in exchange for reduced\nreliability. This is due to the fact that spot instances are subjected to\ninterruptions when spare computing capacity used for provisioning them is\nneeded back owing to demand variations. Also, the prices of spot instances are\nnot fixed as pricing is dependent on long term supply and demand. The\npossibility of interruptions and pricing variations associated with spot\ninstances adds a layer of uncertainty to the general problem of workflow\nscheduling across cloud computing environments. These challenges need to be\nefficiently addressed for enjoying the cost savings achievable with the use of\nspot instances without compromising the underlying business requirements. To\nthis end, in this paper we use Deep Reinforcement Learning for developing an\nautonomous agent capable of scheduling workflows in a cost efficient manner by\nusing an intelligent mix of spot and on-demand instances. The proposed solution\nis implemented in the open source container native Argo workflow engine that is\nwidely used for executing industrial workflows. 
The results of the experiments\ndemonstrate that the proposed scheduling method is capable of outperforming the\ncurrent benchmarks.","PeriodicalId":501422,"journal":{"name":"arXiv - CS - Distributed, Parallel, and Cluster Computing","volume":"67 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Distributed, Parallel, and Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.02926","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Cost optimization is a common goal of workflow schedulers operating in cloud
computing environments. The use of spot instances is a potential means of
achieving this goal, as they are offered by cloud providers at discounted
prices compared to their on-demand counterparts in exchange for reduced
reliability. This is because spot instances are subject to interruption when
the spare computing capacity used to provision them is reclaimed in response
to demand variations. Moreover, spot instance prices are not fixed; pricing
depends on long-term supply and demand. The
possibility of interruptions and pricing variations associated with spot
instances adds a layer of uncertainty to the general problem of workflow
scheduling across cloud computing environments. These challenges must be
addressed efficiently to realize the cost savings achievable with spot
instances without compromising the underlying business requirements. To
this end, in this paper we use Deep Reinforcement Learning to develop an
autonomous agent capable of scheduling workflows cost-efficiently by using an
intelligent mix of spot and on-demand instances. The proposed solution is
implemented in the open-source, container-native Argo workflow engine, which
is widely used for executing industrial workflows. The results of the experiments
demonstrate that the proposed scheduling method outperforms the current
benchmarks.
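The core trade-off the abstract describes (cheaper spot instances versus the cost of interruptions) can be illustrated with a minimal expected-cost comparison. This is a hedged sketch for intuition only, not the paper's DRL method: the function names (`expected_cost`, `pick_instance`), the fixed interruption probability, and the restart-penalty model are all assumptions introduced here for illustration.

```python
def expected_cost(price_per_hour, hours, interruption_prob=0.0,
                  restart_penalty_hours=0.0):
    """Expected cost of a task, inflating runtime by the expected
    rework caused by interruptions (toy model: one restart penalty
    weighted by the interruption probability)."""
    expected_hours = hours + interruption_prob * restart_penalty_hours
    return price_per_hour * expected_hours


def pick_instance(spot_price, on_demand_price, hours,
                  interruption_prob, restart_penalty_hours):
    """Choose spot or on-demand by comparing expected costs."""
    spot = expected_cost(spot_price, hours,
                         interruption_prob, restart_penalty_hours)
    on_demand = expected_cost(on_demand_price, hours)
    return ("spot", spot) if spot <= on_demand else ("on-demand", on_demand)


# Cheap spot with modest interruption risk: spot wins.
choice, cost = pick_instance(spot_price=0.03, on_demand_price=0.10,
                             hours=2.0, interruption_prob=0.2,
                             restart_penalty_hours=1.0)
```

A DRL agent like the one proposed in the paper would, in effect, learn such a decision policy from experience rather than from a hand-coded cost model, adapting to pricing variations and observed interruption behavior.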