Bounding the Effect of Partition Camping in GPU Kernels

Ashwin M. Aji, Mayank Daga, Wu-chun Feng
ACM International Conference on Computing Frontiers, published 2011-05-03. DOI: 10.1145/2016604.2016637 (https://doi.org/10.1145/2016604.2016637)
Citations: 27

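Abstract

Current GPU tools and performance models provide some common architectural insights that guide programmers in writing optimal code. We challenge and complement these performance models and tools by modeling and analyzing a lesser-known but very severe performance pitfall in NVIDIA GPUs, called Partition Camping. Partition camping is caused by memory accesses that are skewed towards a subset of the available memory partitions, and it may degrade the performance of GPU kernels by up to seven-fold. No existing tool can detect the partition camping effect in GPU kernels.

Unlike traditional performance modeling approaches, we predict a performance range that bounds the partition camping effect in the GPU kernel. Predicting a performance range, instead of the exact performance, is more realistic because of the large performance variations induced by partition camping. We design and develop the prediction model by first characterizing the effects of partition camping with our own suite of micro-benchmarks. We then apply rigorous statistical regression techniques to the micro-benchmark data to predict the performance bounds of real GPU kernels, with and without the partition camping effect. We test the accuracy of our performance model by analyzing three real applications with known memory access patterns and partition camping effects. Our results show that the geometric mean of errors in our performance range prediction model is within 12% of the actual execution times.

We also develop and present a very easy-to-use spreadsheet-based tool called CampProf, which is a visual front-end to our performance range prediction model and can be used to gain insight into the degree of partition camping in GPU kernels. Lastly, we demonstrate how CampProf can be used to visually monitor the performance improvements in kernels as the partition camping effect is removed.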
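The idea of accesses being "skewed towards a subset of the available memory partitions" can be made concrete with a small sketch. The abstract does not state how many partitions there are or how addresses map to them, so the values below are assumptions taken from NVIDIA's description of GT200-class hardware (8 partitions, each 256 bytes wide, interleaved round-robin). The `camping_ratio` metric and the two access patterns are hypothetical illustrations, not the authors' prediction model and not what CampProf computes.

```python
from collections import Counter

# Assumed values, not given in the abstract: GT200-class GPUs are described
# as having 8 global-memory partitions, each 256 bytes wide.
NUM_PARTITIONS = 8
PARTITION_WIDTH = 256  # bytes


def partition_of(byte_address: int) -> int:
    """Map a byte address to a partition, assuming round-robin interleaving."""
    return (byte_address // PARTITION_WIDTH) % NUM_PARTITIONS


def partition_histogram(addresses):
    """Count how many accesses land in each partition."""
    counts = Counter(partition_of(a) for a in addresses)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


def camping_ratio(addresses):
    """Load on the busiest partition divided by the ideal even share.

    1.0 means perfectly balanced; NUM_PARTITIONS means every access
    hits a single partition. This is an illustrative metric only.
    """
    hist = partition_histogram(addresses)
    total = sum(hist)
    return max(hist) / (total / NUM_PARTITIONS)


# Hypothetical patterns: 64 concurrently active warps, one address each.
stride = NUM_PARTITIONS * PARTITION_WIDTH  # 2048 bytes wraps back to partition 0
camped = [w * stride for w in range(64)]           # all warps hit partition 0
spread = [w * PARTITION_WIDTH for w in range(64)]  # warps rotate over partitions

print(partition_histogram(camped), camping_ratio(camped))
print(partition_histogram(spread), camping_ratio(spread))
```

Under these assumptions, the strided pattern sends all 64 accesses to one partition while the contiguous pattern spreads them evenly. The ratio describes only how unevenly requests are distributed; it makes no claim about the resulting slowdown, which the paper bounds with its regression-based model.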