RARE: Renewable Energy Aware Resource Management in Datacenters
V. Venkataswamy, J. Grigsby, A. Grimshaw, Yanjun Qi
Job Scheduling Strategies for Parallel Processing
Pub Date: 2022-11-10, DOI: 10.48550/arXiv.2211.05346
The exponential growth in demand for digital services drives massive datacenter energy consumption and negative environmental impacts. Promoting sustainable solutions to pressing energy and digital infrastructure challenges is crucial. Several hyperscale cloud providers have announced plans to power their datacenters using renewable energy. However, integrating renewables to power the datacenters is challenging because the power generation is intermittent, necessitating approaches to tackle power supply variability. Hand engineering domain-specific heuristics-based schedulers to meet specific objective functions in such complex dynamic green datacenter environments is time-consuming, expensive, and requires extensive tuning by domain experts. The green datacenters need smart systems and system software to employ multiple renewable energy sources (wind and solar) by intelligently adapting computing to renewable energy generation. We present RARE (Renewable energy Aware REsource management), a Deep Reinforcement Learning (DRL) job scheduler that automatically learns effective job scheduling policies while continually adapting to datacenters' complex dynamic environment. The resulting DRL scheduler performs better than heuristic scheduling policies with different workloads and adapts to the intermittent power supply from renewables. We demonstrate DRL scheduler system design parameters that, when tuned correctly, produce better performance. Finally, we demonstrate that the DRL scheduler can learn from and improve upon existing heuristic policies using Offline Learning.

Improving Resource Isolation of Critical Tasks in a Workload
Meghana Thiyyakat, Subramaniam Kalambur, D. Sitaram
Job Scheduling Strategies for Parallel Processing
Pub Date: 2020-05-22, DOI: 10.1007/978-3-030-63171-0_3

Analysis of Job Metadata for Enhanced Wall Time Prediction
Mehmet Soysal, M. Berghoff, A. Streit
Job Scheduling Strategies for Parallel Processing
Pub Date: 2018-05-25, DOI: 10.1007/978-3-030-10632-4_1

Stochastic Programming Approach for Resource Selection Under Demand Uncertainty
T. H. Bhuiyan, M. Halappanavar, Ryan D. Friese, Hugh R. Medal, L. D. L. Torre, A. Sathanur, Nathan R. Tallent
Job Scheduling Strategies for Parallel Processing
Pub Date: 2018-05-25, DOI: 10.1007/978-3-030-10632-4_6

Evaluating the Impact of Soft Walltimes on Job Scheduling Performance
D. Klusáček, Václav Chlumský
Job Scheduling Strategies for Parallel Processing
Pub Date: 2018-05-25, DOI: 10.1007/978-3-030-10632-4_2

Reducing the Human-in-the-Loop Component of the Scheduling of Large HTC Workloads
Frédéric Azevedo, Lucas Gombert, F. Suter
Job Scheduling Strategies for Parallel Processing
Pub Date: 2018-05-25, DOI: 10.1007/978-3-030-10632-4_3

Experience and Practice of Batch Scheduling on Leadership Supercomputers at Argonne
W. Allcock, P. Rich, Yuping Fan, Z. Lan
Job Scheduling Strategies for Parallel Processing
Pub Date: 2017-06-02, DOI: 10.1007/978-3-319-77398-8_1