GPGPU performance estimation for frequency scaling using cross-benchmarking

Qiang Wang, Chengjian Liu, X. Chu
{"title":"使用交叉基准测试的GPGPU频率缩放性能估计","authors":"Qiang Wang, Chengjian Liu, X. Chu","doi":"10.1145/3366428.3380767","DOIUrl":null,"url":null,"abstract":"Dynamic Voltage and Frequency Scaling (D VFS) on General-Purpose Graphics Processing Units (GPGPUs) is now becoming one of the most significant techniques to balance computational performance and energy consumption. However, there are still few fast and accurate models for predicting GPU kernel execution time under different core and memory frequency settings, which is important to determine the best frequency configuration for energy saving. Accordingly, a novel GPGPU performance estimation model with both core and memory frequency scaling is herein proposed. We design a cross-benchmarking suite, which simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. Then we apply two different machine learning algorithms, Support Vector Regression (SVR) and Gradient Boosting Decision Tree (GBDT), to study the correlation between kernel performance counters and kernel performance. The models trained only with our cross-benchmarking suite achieve satisfying accuracy (16%~22% mean absolute error) on 24 unseen real application kernels. Validated on three modern GPUs with a wide frequency scaling range, by using a collection of 24 real application kernels, the proposed model is able to achieve accurate results (5.1%, 2.8%, 6.5% mean absolute error) for the target GPUs (GTX 980, Titan X Pascal and Tesla P100).","PeriodicalId":266831,"journal":{"name":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"GPGPU performance estimation for frequency scaling using cross-benchmarking\",\"authors\":\"Qiang Wang, Chengjian Liu, X. Chu\",\"doi\":\"10.1145/3366428.3380767\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Dynamic Voltage and Frequency Scaling (D VFS) on General-Purpose Graphics Processing Units (GPGPUs) is now becoming one of the most significant techniques to balance computational performance and energy consumption. However, there are still few fast and accurate models for predicting GPU kernel execution time under different core and memory frequency settings, which is important to determine the best frequency configuration for energy saving. Accordingly, a novel GPGPU performance estimation model with both core and memory frequency scaling is herein proposed. We design a cross-benchmarking suite, which simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. Then we apply two different machine learning algorithms, Support Vector Regression (SVR) and Gradient Boosting Decision Tree (GBDT), to study the correlation between kernel performance counters and kernel performance. The models trained only with our cross-benchmarking suite achieve satisfying accuracy (16%~22% mean absolute error) on 24 unseen real application kernels. 
Validated on three modern GPUs with a wide frequency scaling range, by using a collection of 24 real application kernels, the proposed model is able to achieve accurate results (5.1%, 2.8%, 6.5% mean absolute error) for the target GPUs (GTX 980, Titan X Pascal and Tesla P100).\",\"PeriodicalId\":266831,\"journal\":{\"name\":\"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit\",\"volume\":\"29 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-02-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3366428.3380767\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366428.3380767","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Dynamic Voltage and Frequency Scaling (DVFS) on General-Purpose Graphics Processing Units (GPGPUs) has become one of the most important techniques for balancing computational performance and energy consumption. However, there are still few fast and accurate models for predicting GPU kernel execution time under different core and memory frequency settings, which is essential for determining the best frequency configuration for energy saving. Accordingly, a novel GPGPU performance estimation model with both core and memory frequency scaling is proposed. We design a cross-benchmarking suite that simulates kernels with a wide range of instruction distributions. The synthetic kernels generated by this suite can be used for model pre-training or as supplementary training samples. We then apply two machine learning algorithms, Support Vector Regression (SVR) and Gradient Boosting Decision Tree (GBDT), to study the correlation between kernel performance counters and kernel performance. Models trained only with our cross-benchmarking suite achieve satisfactory accuracy (16%-22% mean absolute error) on 24 unseen real application kernels. Validated on three modern GPUs with wide frequency scaling ranges, using a collection of 24 real application kernels, the proposed model achieves accurate results (5.1%, 2.8%, and 6.5% mean absolute error) on the target GPUs (GTX 980, Titan X Pascal, and Tesla P100).
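The abstract describes training regressors that map kernel performance counters, together with the core and memory frequency settings, to kernel execution time, using SVR and GBDT. The sketch below illustrates that style of model, not the authors' implementation: the feature set, the synthetic training data, and the use of scikit-learn's SVR and GradientBoostingRegressor are assumptions made purely for a runnable example.

```python
# Minimal sketch (not the authors' code): predict GPU kernel execution time
# from performance-counter-style features plus core/memory frequency,
# using the two regressors named in the abstract (SVR and GBDT).
# Feature names and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_samples = 200  # hypothetical (kernel, frequency-setting) pairs

# Hypothetical features: instruction-mix ratios and occupancy taken from
# profiling counters, plus the core and memory frequencies to evaluate.
X = np.column_stack([
    rng.uniform(0.0, 1.0, n_samples),    # fraction of compute instructions
    rng.uniform(0.0, 1.0, n_samples),    # fraction of global-memory instructions
    rng.uniform(0.1, 1.0, n_samples),    # achieved occupancy
    rng.uniform(500, 1500, n_samples),   # core frequency (MHz)
    rng.uniform(2000, 5000, n_samples),  # memory frequency (MHz)
])
# Synthetic target: execution time dominated by whichever of the compute or
# memory pipelines is the bottleneck at the chosen frequencies, plus noise.
compute_time = X[:, 0] / X[:, 3]
memory_time = X[:, 1] / X[:, 4]
y = 1e3 * np.maximum(compute_time, memory_time) * (1 + 0.05 * rng.standard_normal(n_samples))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01)),
    "GBDT": GradientBoostingRegressor(n_estimators=300, max_depth=3, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mape = mean_absolute_percentage_error(y_test, model.predict(X_test))
    print(f"{name}: mean absolute percentage error = {mape:.1%}")
```

In the workflow described in the abstract, the training samples would come from the cross-benchmarking suite's synthetic kernels and the evaluation set from real application kernels measured at different frequency settings; here both are generated randomly so the example is self-contained.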