Precise Energy Consumption Measurements of Heterogeneous Artificial Intelligence Workloads

R. Caspart, Sebastian Ziegler, Arvid Weyrauch, Holger Obermaier, Simon Raffeiner, Leonie Schuhmacher, J. Scholtyssek, D. Trofimova, M. Nolden, I. Reinartz, Fabian Isensee, Markus Goetz, C. Debus
DOI: 10.48550/arXiv.2212.01698 · ISC Workshops · Published 2022-12-03 · Citations: 3

Abstract

With the rise of artificial intelligence (AI) in recent years and the accompanying increase in the complexity of the applied models, the growing demand for computational resources is starting to pose a significant challenge. The need for higher compute power is being met with increasingly potent accelerator hardware as well as large and powerful compute clusters. However, the gain in prediction accuracy from large models trained on distributed and accelerated systems ultimately comes at the price of a substantial increase in energy demand, and researchers have started questioning the environmental friendliness of such AI methods at scale. Consequently, awareness of energy efficiency plays an important role for AI model developers and hardware infrastructure operators alike. The energy consumption of an AI workload depends both on the model implementation and on the composition of the utilized hardware. Accurate measurements of the power draw of AI workflows on different types of compute nodes are therefore key to algorithmic improvements and to the design of future compute clusters and hardware. Towards this end, we present measurements of the energy consumption of two typical deep learning applications on different types of heterogeneous compute nodes. Our results indicate that (1) contrary to common practice, deriving energy consumption directly from runtime is not accurate; instead, the composition of the compute node must be taken into account; (2) neglecting accelerator hardware on mixed nodes results in disproportionate inefficiency in energy consumption; and (3) the energy consumption of model training and inference should be considered separately: while training on GPUs outperforms all other node types in both runtime and energy consumption, inference on CPU nodes can be comparably efficient. One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer, not just those with administrator rights, enabling easy transfer to other workloads while raising user awareness of energy consumption.
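The abstract's first finding, that energy consumption cannot be accurately derived from runtime alone, can be illustrated with a minimal sketch. The paper itself does not publish this code; the functions, sample values, and the 500 W nominal node power below are purely hypothetical, chosen only to show how integrating measured node power over time diverges from a naive runtime-times-wattage estimate when the power draw varies across a job's phases.

```python
def energy_from_samples(timestamps, power_watts):
    """Trapezoidal integration of sampled node power -> energy in joules."""
    energy = 0.0
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        energy += 0.5 * (power_watts[i] + power_watts[i - 1]) * dt
    return energy

def energy_from_runtime(runtime_s, nominal_watts):
    """Naive estimate: runtime scaled by a fixed nominal node power."""
    return runtime_s * nominal_watts

# Illustrative samples: power ramps up during the compute phase and
# drops during I/O and startup/teardown, so the two estimates differ.
t = [0, 10, 20, 30, 40]                    # sample times in seconds
p = [120.0, 480.0, 500.0, 470.0, 130.0]    # measured node power in watts

measured = energy_from_samples(t, p)                    # 15750.0 J
naive = energy_from_runtime(t[-1] - t[0], 500.0)        # 20000.0 J
print(measured, naive)
```

With these illustrative numbers the runtime-based estimate overstates the energy use by roughly a quarter; on real heterogeneous nodes the error additionally depends on which components (CPUs, GPUs, memory) the workload actually exercises.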