J. Corbalán, Lluis Alonso, C. Navarrete, Carla Guillén
Soft Cluster Powercap at SuperMUC-NG with EAR
2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC), 2022-10-24
DOI: https://doi.org/10.1109/IGSC55832.2022.9969360
This paper describes the Soft Cluster Powercap management system implemented and evaluated on SuperMUC-NG using the EAR software. SuperMUC-NG is one of the biggest supercomputers in Europe, with 6480 Intel Skylake Xeon Platinum 8174 nodes, and EAR is the system software it uses for energy management. SuperMUC-NG has a power limit with a certain degree of tolerance: the limit may be exceeded for a short time, as long as the average power over a longer period stays under the hard limit. Otherwise, the data center incurs a cost penalty. We call this use case Soft Cluster Powercap, since it differs from the traditional Hard Cluster Powercap, where the power limit can never be exceeded. This paper presents the design of the EAR node powercap and the Soft Cluster Powercap, together with an evaluation of both. The evaluation is limited to CPU-only kernels and applications for the node powercap, and to one island of SuperMUC-NG (792 nodes) for the soft cluster powercap. The solution is currently deployed on the whole cluster.
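The difference between a hard and a soft cluster powercap can be illustrated with a minimal sketch. This is not EAR's implementation: the class name `SoftPowercapMonitor`, the sample-count window, and the wattage values are all hypothetical, chosen only to show that a soft cap judges the average over a window while a hard cap judges every sample.

```python
from collections import deque


class SoftPowercapMonitor:
    """Illustrative only: tracks cluster power samples over a sliding window.

    All names and parameters here are assumptions for the sketch,
    not part of EAR or of the SuperMUC-NG configuration.
    """

    def __init__(self, limit_watts: float, window_samples: int) -> None:
        if window_samples <= 0:
            raise ValueError("window_samples must be positive")
        self.limit_watts = limit_watts
        # Oldest samples are dropped automatically once the window is full.
        self.samples = deque(maxlen=window_samples)

    def add_sample(self, power_watts: float) -> None:
        self.samples.append(power_watts)

    def average(self) -> float:
        # Average power over the samples currently in the window.
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def hard_violation(self) -> bool:
        # Hard cap: the most recent sample alone must not exceed the limit.
        return bool(self.samples) and self.samples[-1] > self.limit_watts

    def soft_violation(self) -> bool:
        # Soft cap: only the windowed average must stay under the limit.
        return self.average() > self.limit_watts


monitor = SoftPowercapMonitor(limit_watts=100.0, window_samples=4)
for power in (90.0, 90.0, 120.0):
    monitor.add_sample(power)
# The 120 W sample breaks a hard cap, but the window average is still at the limit.
spike_is_hard_violation = monitor.hard_violation()
spike_is_soft_violation = monitor.soft_violation()
```

In this sketch a brief spike is tolerated under the soft policy, while a sustained run above the limit eventually pushes the windowed average over it and is flagged.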