Soft Cluster Powercap at SuperMUC-NG with EAR

J. Corbalán, Lluis Alonso, C. Navarrete, Carla Guillén
Published in: 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)
Publication date: 2022-10-24
DOI: 10.1109/IGSC55832.2022.9969360 (https://doi.org/10.1109/IGSC55832.2022.9969360)
Citations: 0

Abstract

This paper describes the Soft Cluster Powercap management system implemented and evaluated on SuperMUC-NG using the EAR software. SuperMUC-NG is one of the biggest supercomputers in Europe, with 6480 Intel Skylake Xeon Platinum 8174 nodes, and EAR is the system software it uses for energy management. SuperMUC-NG has a power limit with a certain degree of tolerance: the limit may be exceeded for a short time, as long as the average power over a longer period stays under the hard limit. Otherwise, the data center incurs a cost penalty. We call this use case Soft Cluster Powercap, to distinguish it from the traditional Hard Cluster Powercap, where the power limit can never be exceeded. This paper presents the design of the EAR node powercap and the Soft Cluster Powercap, and evaluates both. The evaluation is limited to CPU-only kernels and applications for the node powercap, and to one island of SuperMUC-NG (792 nodes) for the soft cluster powercap. The solution is currently deployed on the whole cluster.
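The difference between the two powercap policies described in the abstract can be illustrated with a minimal sketch. It is not EAR's implementation: the function names, the sliding-window averaging scheme, the window length, and the sample values are all assumptions made for illustration, since the abstract does not say how the long-period average is computed.

```python
from collections import deque


def hard_cap_ok(samples_w, limit_w):
    """Hard powercap: no single power sample may exceed the limit."""
    return all(p <= limit_w for p in samples_w)


def soft_cap_ok(samples_w, limit_w, window):
    """Soft powercap (assumed form): every full sliding window of
    `window` consecutive samples must average at or below the limit.
    Individual samples may exceed it. Returns True if fewer than
    `window` samples are available."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque()
    total = 0.0
    for p in samples_w:
        recent.append(p)
        total += p
        if len(recent) > window:
            total -= recent.popleft()
        if len(recent) == window and total / window > limit_w:
            return False
    return True


# Hypothetical power readings (arbitrary units) against a limit of 100.
short_spike = [90, 120, 80, 95, 85]   # one brief excursion above the limit
sustained = [110, 115, 120, 90]       # stays above the limit too long

spike_hard = hard_cap_ok(short_spike, 100)
spike_soft = soft_cap_ok(short_spike, 100, window=3)
sustained_soft = soft_cap_ok(sustained, 100, window=3)
```

With these made-up readings, the brief spike violates the hard policy but satisfies the soft one, while the sustained excursion violates both. A real deployment would choose the averaging period to match the data center's contractual terms.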