A Scheduler-Level Incentive Mechanism for Energy Efficiency in HPC

2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. Pub Date: 2015-05-04. DOI: 10.1109/CCGrid.2015.101

Yiannis Georgiou, David Glesser, K. Rządca, D. Trystram

Pages: 617-626

Citations: 27

Abstract

Energy consumption has become one of the most important factors in High Performance Computing platforms. However, while there are various algorithmic and programming techniques to save energy, users currently have no incentive to employ them, as they might result in worse performance. We propose to manage the energy budget of a supercomputer through EnergyFairShare (EFS), a FairShare-like scheduling algorithm. FairShare is a classic scheduling rule that prioritizes jobs belonging to users who were assigned a small amount of CPU-seconds in the past. Similarly, EFS keeps track of users' consumption of Watt-seconds and prioritizes those whose jobs consumed less energy. EFS therefore incentivizes users to optimize their code for energy efficiency: with higher priority, their jobs have shorter queuing times and, thus, shorter turn-around times. To validate this principle, we implemented EFS in a scheduling simulator and processed workloads from various HPC centers. The results show that, by reducing their energy consumption, a user will reduce their stretch (slowdown), compared to increasing their energy consumption. To validate the general feasibility of our approach, we also implemented EFS as an extension for SLURM, a popular HPC resource and job management system. We validated our plugin both by emulating a large-scale platform and by experiments on a real cluster with monitored energy consumption. We observed shorter waiting times for energy-efficient users.
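The abstract does not give the EFS priority formula, so the following Python sketch is only an illustration of the idea: it ranks queued jobs by their owner's accumulated Watt-seconds instead of CPU-seconds. The decay form `2 ** (-usage / share)` is borrowed from the classic fair-share factor and is an assumption here, not the authors' stated method; the names `Job`, `efs_factor`, `order_queue`, and `stretch` are hypothetical. Stretch is computed with the usual definition, turn-around time divided by run time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    submit_time: float  # seconds since some reference point


def efs_factor(energy_used_ws: float, total_energy_ws: float, share: float) -> float:
    """Hypothetical priority factor in (0, 1]; higher means scheduled earlier.

    Assumption: a fair-share style exponential decay applied to the user's
    fraction of all consumed Watt-seconds, relative to their allotted share.
    """
    if total_energy_ws <= 0:
        return 1.0  # nothing consumed yet: every user starts at top priority
    normalized_usage = energy_used_ws / total_energy_ws
    return 2.0 ** (-normalized_usage / share)


def order_queue(jobs, energy_by_user, shares):
    """Sort jobs by decreasing factor; ties are broken by submission time."""
    total = sum(energy_by_user.values())

    def key(job):
        used = energy_by_user.get(job.user, 0.0)
        return (-efs_factor(used, total, shares[job.user]), job.submit_time)

    return sorted(jobs, key=key)


def stretch(wait_time: float, run_time: float) -> float:
    """Stretch (slowdown): turn-around time divided by run time."""
    return (wait_time + run_time) / run_time


# Illustrative numbers only: bob has consumed three times alice's energy.
energy = {"alice": 1000.0, "bob": 3000.0}
shares = {"alice": 0.5, "bob": 0.5}
queue = [Job("bob", 0.0), Job("alice", 10.0)]
ordered = order_queue(queue, energy, shares)
```

With these made-up numbers, alice's job is placed ahead of bob's even though it was submitted later, which is the incentive the abstract describes: lower past energy consumption leads to shorter waiting time and therefore a lower stretch.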

