Power and Performance Characterization and Modeling of GPU-Accelerated Systems

2014 IEEE 28th International Parallel and Distributed Processing Symposium Pub Date : 2014-05-19 DOI:10.1109/IPDPS.2014.23

Yukitaka Abe, Hiroshi Sasaki, S. Kato, Koji Inoue, M. Edahiro, M. Peres

{"title":"Power and Performance Characterization and Modeling of GPU-Accelerated Systems","authors":"Yukitaka Abe, Hiroshi Sasaki, S. Kato, Koji Inoue, M. Edahiro, M. Peres","doi":"10.1109/IPDPS.2014.23","DOIUrl":null,"url":null,"abstract":"Graphics processing units (GPUs) provide an order-of-magnitude improvement on peak performance and performance-per-watt as compared to traditional multicore CPUs. However, GPU-accelerated systems currently lack a generalized method of power and performance prediction, which prevents system designers from an ultimate goal of dynamic power and performance optimization. This is due to the fact that their power and performance characteristics are not well captured across architectures, and as a result, existing power and performance modeling approaches are only available for a limited range of particular GPUs. In this paper, we present power and performance characterization and modeling of GPU-accelerated systems across multiple generations of architectures. Characterization and modeling both play a vital role in optimization and prediction of GPU-accelerated systems. We quantify the impact of voltage and frequency scaling on each architecture with a particularly intriguing result that a cutting-edge Kepler-based GPU achieves energy saving of 75% by lowering GPU clocks in the best scenario, while Fermi- and Tesla-based GPUs achieve no greater than 40% and 13%, respectively. Considering these characteristics, we provide statistical power and performance modeling of GPU-accelerated systems simplified enough to be applicable for multiple generations of architectures. One of our findings is that even simplified statistical models are able to predict power and performance of cutting-edge GPUs within errors of 20% to 30% for any set of voltage and frequency pair.","PeriodicalId":309291,"journal":{"name":"2014 IEEE 28th International Parallel and Distributed Processing Symposium","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"60","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 28th International Parallel and Distributed Processing Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2014.23","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 60

Abstract

Graphics processing units (GPUs) provide an order-of-magnitude improvement on peak performance and performance-per-watt as compared to traditional multicore CPUs. However, GPU-accelerated systems currently lack a generalized method of power and performance prediction, which prevents system designers from an ultimate goal of dynamic power and performance optimization. This is due to the fact that their power and performance characteristics are not well captured across architectures, and as a result, existing power and performance modeling approaches are only available for a limited range of particular GPUs. In this paper, we present power and performance characterization and modeling of GPU-accelerated systems across multiple generations of architectures. Characterization and modeling both play a vital role in optimization and prediction of GPU-accelerated systems. We quantify the impact of voltage and frequency scaling on each architecture with a particularly intriguing result that a cutting-edge Kepler-based GPU achieves energy saving of 75% by lowering GPU clocks in the best scenario, while Fermi- and Tesla-based GPUs achieve no greater than 40% and 13%, respectively. Considering these characteristics, we provide statistical power and performance modeling of GPU-accelerated systems simplified enough to be applicable for multiple generations of architectures. One of our findings is that even simplified statistical models are able to predict power and performance of cutting-edge GPUs within errors of 20% to 30% for any set of voltage and frequency pair.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

gpu加速系统的功率和性能表征与建模

与传统的多核cpu相比，图形处理单元(gpu)在峰值性能和每瓦特性能方面提供了数量级的改进。然而，目前gpu加速系统缺乏一种通用的功率和性能预测方法，这阻碍了系统设计者实现动态功率和性能优化的最终目标。这是由于它们的功率和性能特征不能很好地跨架构捕获，因此，现有的功率和性能建模方法仅适用于有限范围的特定gpu。在本文中，我们介绍了跨多代架构的gpu加速系统的功率和性能表征和建模。表征和建模在gpu加速系统的优化和预测中都起着至关重要的作用。我们量化了电压和频率缩放对每种架构的影响，得出了一个特别有趣的结果:在最佳情况下，基于开普勒的尖端GPU通过降低GPU时钟实现了75%的节能，而基于费米和特斯拉的GPU分别实现了不超过40%和13%的节能。考虑到这些特点，我们提供了gpu加速系统的统计能力和性能建模，简化到足以适用于多代架构。我们的发现之一是，即使是简化的统计模型也能够在任何电压和频率对的误差范围内预测先进gpu的功率和性能，误差在20%到30%之间。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

2014 IEEE 28th International Parallel and Distributed Processing Symposium

自引率

0.00%

发文量