A framework for data prefetching using off-line training of Markovian predictors

Jinwoo Kim, K. Palem, W. Wong
Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors, 2002-09-16
DOI: 10.1109/ICCD.2002.1106792 (https://doi.org/10.1109/ICCD.2002.1106792)
Citations: 15

Abstract

Data prefetching is an important technique for alleviating the memory bottleneck. Proposed solutions range from purely software approaches, which insert prefetch instructions through program analysis, to purely hardware mechanisms. Their degree of success depends on the nature of the application. The need for innovative approaches is growing rapidly with the introduction of applications, such as object-oriented applications, that show dynamically changing memory access behavior. In this paper, we propose a novel framework for data prefetchers that are trained off-line using smart learning algorithms to produce prediction models that capture hidden memory access patterns. Once built, these prediction models are loaded into a data prefetching unit in the CPU at the appropriate point during runtime to drive the prefetching. With a table size of about 8 KB, we achieved an average prediction accuracy of about 68% using our proposed learning method, and performance improved by about 37% on average on the benchmarks we tested. Furthermore, we believe the proposed framework is amenable to other predictors and can be implemented as a phase of a profiling-optimizing compiler.
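The two-phase idea in the abstract (train a predictor off-line, then load it as a table that drives prefetching at runtime) can be illustrated with a minimal sketch. This is not the authors' learning method, which the abstract does not describe. It assumes a plain first-order Markov table built from a profiled address trace, and the function names, the `max_entries` budget, and the sample addresses are all hypothetical.

```python
from collections import Counter, defaultdict


def train_markov_table(trace, max_entries=1024):
    """Off-line phase (assumed): count address-to-address transitions in a trace."""
    counts = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        counts[current][following] += 1
    # Keep the most frequently observed source addresses so the table
    # respects a fixed size budget (a stand-in for a hardware table limit).
    ranked = sorted(counts.items(), key=lambda kv: -sum(kv[1].values()))
    # For each kept address, store only its single most likely successor.
    return {addr: succ.most_common(1)[0][0] for addr, succ in ranked[:max_entries]}


def prediction_accuracy(table, trace):
    """Runtime phase (simulated): fraction of issued predictions that were correct."""
    hits = 0
    predictions = 0
    for current, following in zip(trace, trace[1:]):
        predicted = table.get(current)
        if predicted is None:
            continue  # no table entry, so no prefetch would be issued
        predictions += 1
        hits += predicted == following
    return hits / predictions if predictions else 0.0


# Hypothetical profiled trace used for training, and a later trace for evaluation.
training_trace = [0x100, 0x340, 0x220, 0x100, 0x340, 0x220, 0x100, 0x340, 0x580]
table = train_markov_table(training_trace)
test_trace = [0x100, 0x340, 0x220, 0x100, 0x340]
accuracy = prediction_accuracy(table, test_trace)
```

Here accuracy is measured only over accesses that have a table entry, which is one possible definition; the abstract does not state how its 68% figure is computed.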