A framework for data prefetching using off-line training of Markovian predictors
Jinwoo Kim, K. Palem, W. Wong
Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors, 2002-09-16
DOI: 10.1109/ICCD.2002.1106792
Citations: 15
Abstract
Data prefetching is an important technique for alleviating the memory bottleneck. Proposed solutions range from purely software approaches, which insert prefetch instructions based on program analysis, to purely hardware mechanisms, and their degree of success depends on the nature of the application. The need for innovative approaches is growing rapidly with the introduction of applications, such as object-oriented programs, that exhibit dynamically changing memory access behavior. In this paper, we propose a novel framework in which data prefetchers are trained off-line using smart learning algorithms to produce prediction models that capture hidden memory access patterns. Once built, these prediction models are loaded into a data prefetching unit in the CPU at the appropriate point during runtime to drive the prefetching. Using a table of about 8KB, our proposed learning method achieved a prediction accuracy of about 68%, and performance improved by about 37% on average across the benchmarks we tested. Furthermore, we believe the proposed framework is amenable to other predictors and can be incorporated as a phase of a profiling-optimizing compiler.
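To illustrate the idea of an off-line-trained Markovian predictor, the sketch below builds a first-order Markov model from a recorded address trace and queries it for the most likely next address. This is a minimal hypothetical sketch, not the paper's actual learning method or table format; the class name, trace, and addresses are invented for illustration.

```python
from collections import defaultdict, Counter

class MarkovPrefetcher:
    """First-order Markov predictor over memory addresses.

    Trained off-line on an address trace; the resulting transition
    table would then be queried at runtime to pick prefetch targets.
    (Hypothetical sketch; the paper's own method may differ.)
    """

    def __init__(self):
        # transitions[a] counts which addresses followed address a
        self.transitions = defaultdict(Counter)

    def train(self, trace):
        """Off-line training pass over a recorded address trace."""
        for prev, nxt in zip(trace, trace[1:]):
            self.transitions[prev][nxt] += 1

    def predict(self, addr):
        """Return the most frequent successor of addr, or None if unseen."""
        followers = self.transitions.get(addr)
        if not followers:
            return None
        return followers.most_common(1)[0][0]

# Example: a trace with a recurring pointer-chasing pattern
trace = [0x100, 0x200, 0x300, 0x100, 0x200, 0x300, 0x100, 0x200]
p = MarkovPrefetcher()
p.train(trace)
print(hex(p.predict(0x100)))  # → 0x200, the most frequent successor
```

In a full implementation along the lines of the framework, the trained transition table (capped at roughly 8KB, per the abstract) would be serialized by the profiling compiler and loaded into the hardware prefetching unit before the corresponding program phase runs.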