On the Feasibility of Advanced Cache Indexing for High-Performance and Energy-Efficient GPGPU Computing

Kyu Yeun Kim, Seunghoe Kim, Woongki Baek
Published in: Proceedings of the 3rd International Workshop on Many-core Embedded Systems, 2015-06-13
DOI: https://doi.org/10.1145/2768177.2768179
Citations: 1

Abstract

To achieve higher performance and energy efficiency, GPGPU architectures have recently begun to employ hardware caches. Adding hardware caches to GPGPUs, however, does not automatically guarantee improved performance and energy efficiency due to the thrashing in small hardware caches shared by thousands of threads. While prior work has proposed warp scheduling and cache bypassing techniques to address this issue, relatively little work has been done in the context of advanced cache indexing. To bridge this gap, this work investigates the feasibility of advanced cache indexing for high-performance and energy-efficient GPGPU computing. We first discuss the design and implementation of static and adaptive cache indexing schemes for GPGPUs. We then quantify the effectiveness of the advanced indexing schemes using GPGPU benchmarks. Our quantitative evaluation demonstrates that the advanced cache indexing schemes are promising in that they significantly outperform the conventional cache indexing scheme. In addition, for a subset of cache-sensitive benchmarks, the adaptive indexing scheme substantially outperforms the static indexing scheme by effectively identifying and utilizing high-quality indexing bits based on runtime information. Finally, our evaluation shows that the effectiveness of advanced cache indexing is sensitive to different warp schedulers, motivating further research on coordinated cache indexing and warp scheduling techniques.
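
The abstract does not spell out the indexing functions themselves, so the following is only a minimal sketch of the general idea, not the authors' design. It contrasts a conventional index (the address bits directly above the block offset) with XOR-based hashing, a commonly used static alternative that is assumed here for illustration. The line size, set count, and all function names are hypothetical.

```python
# Assumed cache geometry for illustration only (not taken from the paper).
LINE_BYTES = 128                             # bytes per cache line
NUM_SETS = 32                                # number of sets, a power of two
OFFSET_BITS = LINE_BYTES.bit_length() - 1    # 7 block-offset bits
INDEX_BITS = NUM_SETS.bit_length() - 1       # 5 index bits


def conventional_index(addr):
    """Set index taken from the bits directly above the block offset."""
    return (addr >> OFFSET_BITS) & (NUM_SETS - 1)


def xor_index(addr):
    """Static XOR hashing (an assumed scheme): fold the lowest
    INDEX_BITS tag bits into the conventional index."""
    block = addr >> OFFSET_BITS
    low = block & (NUM_SETS - 1)
    high = (block >> INDEX_BITS) & (NUM_SETS - 1)
    return low ^ high


def sets_used(index_fn, addrs):
    """Number of distinct sets touched by a sequence of addresses."""
    return len({index_fn(a) for a in addrs})


# A strided pattern whose stride equals NUM_SETS * LINE_BYTES: every
# address has identical conventional index bits.
stride = NUM_SETS * LINE_BYTES
addrs = [i * stride for i in range(NUM_SETS)]

print("conventional:", sets_used(conventional_index, addrs), "set(s)")
print("xor-hashed:  ", sets_used(xor_index, addrs), "set(s)")
```

With this access pattern the conventional index sends every line to a single set, which is the kind of conflict that produces thrashing, while the XOR variant spreads the same lines across all sets. An adaptive scheme in the sense of the abstract would choose which bits to use at runtime rather than fixing the hash in advance.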