Runahead Cache Misses Using Bloom Filter

Xi Tao, Qi Zeng, J. Peir, Shih-Lien Lu
{"title":"使用布隆过滤器运行提前缓存失败","authors":"Xi Tao, Qi Zeng, J. Peir, Shih-Lien Lu","doi":"10.1109/PDCAT.2016.017","DOIUrl":null,"url":null,"abstract":"In order to hide long memory latency and alleviate memory bandwidth requirement, a fourth-level cache (L4) is introduced in modern high-performance multi-core systems for supporting parallel computation. However, additional cache level causes higher cache miss penalty since a request needs to go through all levels of caches to reach to the main memory. In this paper, we introduce a new way of using a Bloom Filter (BF) to predict cache misses at any cache level in a multicore system. These misses can runahead to access lower-level caches or memory to reduce the miss penalty. The proposed hashing scheme extends the cache index of the target set and uses it for accessing the BF array to avoid counters in the BF array. Performance evaluation using a set of SPEC2006 benchmarks on 8-core systems with 4-level cache hierarchy shows that using a BF for the third-level (L3) cache to filter and runahead L3 misses, the IPCs can be improved by 4-20% with an average 10.5%. In comparison with the delay-recalibration scheme, the improvement is 3.5-4.8%.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Runahead Cache Misses Using Bloom Filter\",\"authors\":\"Xi Tao, Qi Zeng, J. Peir, Shih-Lien Lu\",\"doi\":\"10.1109/PDCAT.2016.017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to hide long memory latency and alleviate memory bandwidth requirement, a fourth-level cache (L4) is introduced in modern high-performance multi-core systems for supporting parallel computation. However, additional cache level causes higher cache miss penalty since a request needs to go through all levels of caches to reach to the main memory. In this paper, we introduce a new way of using a Bloom Filter (BF) to predict cache misses at any cache level in a multicore system. These misses can runahead to access lower-level caches or memory to reduce the miss penalty. The proposed hashing scheme extends the cache index of the target set and uses it for accessing the BF array to avoid counters in the BF array. Performance evaluation using a set of SPEC2006 benchmarks on 8-core systems with 4-level cache hierarchy shows that using a BF for the third-level (L3) cache to filter and runahead L3 misses, the IPCs can be improved by 4-20% with an average 10.5%. 
In comparison with the delay-recalibration scheme, the improvement is 3.5-4.8%.\",\"PeriodicalId\":203925,\"journal\":{\"name\":\"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)\",\"volume\":\"87 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PDCAT.2016.017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDCAT.2016.017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

To hide long memory latency and reduce memory bandwidth requirements, modern high-performance multi-core systems introduce a fourth-level cache (L4) to support parallel computation. However, an additional cache level raises the cache miss penalty, since a request must traverse every level of the cache hierarchy before reaching main memory. In this paper, we introduce a new way of using a Bloom Filter (BF) to predict cache misses at any cache level in a multicore system. Predicted misses can run ahead to access lower-level caches or memory, reducing the miss penalty. The proposed hashing scheme extends the cache index of the target set and uses it to access the BF array, avoiding the need for counters in the BF array. Performance evaluation with a set of SPEC2006 benchmarks on 8-core systems with a 4-level cache hierarchy shows that using a BF at the third-level (L3) cache to filter and run ahead of L3 misses improves IPC by 4-20%, with an average of 10.5%. Compared with the delay-recalibration scheme, the improvement is 3.5-4.8%.
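As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below models a counter-free Bloom filter whose bit array is addressed by the target set's cache index extended with a few extra hash bits, so each cache set owns a small bit partition that can be rebuilt from the set's resident blocks instead of maintaining counters. The class name, array sizes, single hash function, and callback interface are all illustrative assumptions.

```python
# Illustrative sketch only: a counter-free Bloom filter for predicting
# cache misses. The BF index is the cache set index extended with extra
# hash bits, so each set has its own bit partition. All parameters and
# names are assumptions, not the paper's configuration.

class MissPredictorBF:
    def __init__(self, num_sets=8192, extra_bits=4, block_bits=6):
        self.num_sets = num_sets        # sets in the cache being filtered
        self.extra_bits = extra_bits    # extension of the set index into the BF
        self.block_bits = block_bits    # log2(block size), e.g. 64-byte blocks
        # BF array: one (2**extra_bits)-bit partition per cache set.
        self.bits = [[False] * (1 << extra_bits) for _ in range(num_sets)]

    def _index(self, addr):
        block = addr >> self.block_bits
        set_idx = block % self.num_sets
        tag = block // self.num_sets
        extra = tag & ((1 << self.extra_bits) - 1)  # simple hash of the tag
        return set_idx, extra

    def may_hit(self, addr):
        """False means the block is definitely absent, so the request can
        run ahead to the next cache level or memory without waiting for
        the tag lookup; True means 'maybe present' (normal access path)."""
        set_idx, extra = self._index(addr)
        return self.bits[set_idx][extra]

    def on_fill(self, addr):
        """Set the corresponding bit when a block is installed in the cache."""
        set_idx, extra = self._index(addr)
        self.bits[set_idx][extra] = True

    def on_evict(self, set_idx, resident_addrs):
        """Counter-free deletion: after an eviction, rebuild the set's
        partition from the block addresses still resident in that set."""
        self.bits[set_idx] = [False] * (1 << self.extra_bits)
        for addr in resident_addrs:
            s, extra = self._index(addr)
            assert s == set_idx
            self.bits[set_idx][extra] = True
```

In this sketch, a clear bit guarantees a miss, so running ahead is always safe; a set bit that turns out to be a false positive only costs the normal tag lookup rather than a wrong speculative request, which matches the one-sided error property that makes a Bloom filter suitable for miss prediction.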