Hardware and application aware performance, power and energy models for modern HPC servers with DVFS
Georges Da Costa
Sustainable Computing-Informatics & Systems, volume 46, Article 101106. Published 2025-03-08. DOI: 10.1016/j.suscom.2025.101106
Impact factor: 3.8. JCR: Q1 (Computer Science, Hardware & Architecture).
https://www.sciencedirect.com/science/article/pii/S2210537925000265
Citation count: 0
Abstract
Energy usage and its ecological impact are now major concerns in High Performance Computing (HPC). To optimize supercomputer efficiency, researchers rely on models, as accessing actual platforms is complex and costly. Adjusting DVFS (Dynamic Voltage and Frequency Scaling) is the most studied method, but it affects power, performance and energy in a complex way.
We propose to bridge the gap between the theoretical and the practical approaches. We propose a multi-cluster, multi-application model that accurately describes, from a theoretical point of view, the power and performance of applications subject to DVFS. We show how to use it in a runtime system with minimal overhead, using only a few hardware performance counters and RAPL (Running Average Power Limit).
We validate our models using an extensive dataset, obtained on 18 different clusters running 9 benchmarks. We also show how such a model can be used to optimize the energy-to-solution of HPC workloads.
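To make the idea of optimizing energy-to-solution under DVFS concrete, here is a minimal sketch. It is not the model proposed in the paper: it assumes a generic textbook form in which power is a static term plus a cubic dynamic term, and runtime splits into a CPU-bound part that scales with 1/f and a part that does not depend on frequency. All names (`ToyNode`, `cpu_bound_fraction`, the coefficients) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ToyNode:
    """Hypothetical node description; coefficients are assumed, not measured."""
    p_static_w: float        # frequency-independent power, in watts
    k_dyn_w_per_ghz3: float  # dynamic coefficient, assuming P_dyn = k * f^3


def power_w(node: ToyNode, f_ghz: float) -> float:
    """Assumed power model: static power plus a cubic dynamic term."""
    return node.p_static_w + node.k_dyn_w_per_ghz3 * f_ghz ** 3


def runtime_s(t_ref_s: float, f_ref_ghz: float, f_ghz: float,
              cpu_bound_fraction: float) -> float:
    """Assumed performance model: the CPU-bound share of the reference
    runtime scales with f_ref / f, the rest is frequency-independent."""
    scaled = cpu_bound_fraction * f_ref_ghz / f_ghz
    fixed = 1.0 - cpu_bound_fraction
    return t_ref_s * (scaled + fixed)


def energy_j(node: ToyNode, t_ref_s: float, f_ref_ghz: float, f_ghz: float,
             cpu_bound_fraction: float) -> float:
    """Energy-to-solution is modelled power multiplied by modelled runtime."""
    return power_w(node, f_ghz) * runtime_s(
        t_ref_s, f_ref_ghz, f_ghz, cpu_bound_fraction)


def best_frequency(node: ToyNode, freqs_ghz: Iterable[float], t_ref_s: float,
                   f_ref_ghz: float, cpu_bound_fraction: float) -> float:
    """Return the candidate frequency with the lowest modelled energy."""
    return min(
        freqs_ghz,
        key=lambda f: energy_j(node, t_ref_s, f_ref_ghz, f, cpu_bound_fraction),
    )


if __name__ == "__main__":
    node = ToyNode(p_static_w=100.0, k_dyn_w_per_ghz3=2.0)
    freqs = [1.0, 1.5, 2.0, 2.5, 3.0]
    for fraction in (0.0, 0.5, 1.0):
        f = best_frequency(node, freqs, 100.0, 3.0, fraction)
        print(f"cpu_bound_fraction={fraction}: best frequency {f} GHz")
```

Under these assumptions, an application whose runtime does not depend on frequency is best run at the lowest frequency, while a fully CPU-bound application on a node with high static power can be better off at a high frequency, because finishing sooner saves more static energy than the higher dynamic power costs. This is one illustration of why the effect of DVFS on energy depends on both the hardware and the application.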
About the journal:
Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated, peer-reviewed papers.