Tomoki Ikegaya, Tomoaki Tsumura, H. Matsuo, Y. Nakashima
{"title":"集体重用连续迭代的自动记忆处理器加速技术","authors":"Tomoki Ikegaya, Tomoaki Tsumura, H. Matsuo, Y. Nakashima","doi":"10.1109/IC-NC.2010.46","DOIUrl":null,"url":null,"abstract":"We have proposed an auto-memoization processor based on computation reuse, and merged it with speculative multithreading based on value prediction into a parallel early computation. In the past model, the parallel early computation detects each iteration of loops as a reusable block. This paper proposes a new parallel early computation model, which integrates multiple continuous iterations into a reusable block automatically and dynamically without modifing executable binaries. We also propose a model for automatically detecting how many iterations should be integrated into one reusable block. Our model reduces the overhead of computation reuse, and further exploits reuse tables. The result of the experiment with SPEC CPU95 FP suite benchmarks shows that the new model improves the maximum speedup from 40.5% to 57.6%, and the average speedup from 15.0% to 26.0%.","PeriodicalId":375145,"journal":{"name":"2010 First International Conference on Networking and Computing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"A Speed-Up Technique for an Auto-Memoization Processor by Collectively Reusing Continuous Iterations\",\"authors\":\"Tomoki Ikegaya, Tomoaki Tsumura, H. Matsuo, Y. Nakashima\",\"doi\":\"10.1109/IC-NC.2010.46\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We have proposed an auto-memoization processor based on computation reuse, and merged it with speculative multithreading based on value prediction into a parallel early computation. In the past model, the parallel early computation detects each iteration of loops as a reusable block. This paper proposes a new parallel early computation model, which integrates multiple continuous iterations into a reusable block automatically and dynamically without modifing executable binaries. We also propose a model for automatically detecting how many iterations should be integrated into one reusable block. Our model reduces the overhead of computation reuse, and further exploits reuse tables. 
The result of the experiment with SPEC CPU95 FP suite benchmarks shows that the new model improves the maximum speedup from 40.5% to 57.6%, and the average speedup from 15.0% to 26.0%.\",\"PeriodicalId\":375145,\"journal\":{\"name\":\"2010 First International Conference on Networking and Computing\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-11-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 First International Conference on Networking and Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IC-NC.2010.46\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 First International Conference on Networking and Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IC-NC.2010.46","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Speed-Up Technique for an Auto-Memoization Processor by Collectively Reusing Continuous Iterations
We have proposed an auto-memoization processor based on computation reuse and have merged it with speculative multithreading based on value prediction into a parallel early computation model. In the previous model, parallel early computation detects each loop iteration as a reusable block. This paper proposes a new parallel early computation model that integrates multiple continuous iterations into a single reusable block automatically and dynamically, without modifying executable binaries. We also propose a model for automatically determining how many iterations should be integrated into one reusable block. Our model reduces the overhead of computation reuse and makes fuller use of the reuse tables. Experiments with the SPEC CPU95 FP benchmark suite show that the new model improves the maximum speedup from 40.5% to 57.6% and the average speedup from 15.0% to 26.0%.
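The mechanism itself is implemented in hardware reuse tables, but the core idea can be sketched in software terms. The C sketch below is a minimal illustration only, not the paper's implementation: the names (reuse_entry, block_with_reuse, body), the block size K, and the toy hash are all assumptions introduced for this example. It contrasts per-iteration memoization with memoizing K continuous iterations as one unit: a single lookup keyed on the block's live-in values either skips all K iterations on a hit or executes and records them on a miss.

```c
/*
 * Illustrative software analogy of collectively reusing continuous
 * iterations. All names and parameters here are hypothetical; the
 * paper's mechanism lives in hardware reuse tables and works on
 * unmodified binaries.
 */
#include <stdio.h>
#include <string.h>

#define K      4    /* iterations fused into one reusable block (assumed) */
#define SLOTS  64   /* number of reuse-table entries (assumed)            */

typedef struct {
    int    valid;
    double in[K];   /* live-in values of the K fused iterations */
    double out;     /* live-out value produced by the block     */
} reuse_entry;

static reuse_entry reuse_table[SLOTS];

/* Stand-in for the body of one loop iteration. */
static double body(double x) { return x * x + 1.0; }

/* Execute K continuous iterations as one unit. */
static double run_block(const double *in)
{
    double acc = 0.0;
    for (int i = 0; i < K; i++)
        acc += body(in[i]);
    return acc;
}

/* One reuse-table lookup now covers K iterations instead of one. */
static double block_with_reuse(const double *in)
{
    unsigned h = 0;
    for (int i = 0; i < K; i++)
        h = h * 31u + (unsigned)in[i];        /* toy hash of the live-ins */
    reuse_entry *e = &reuse_table[h % SLOTS];

    if (e->valid && memcmp(e->in, in, sizeof e->in) == 0)
        return e->out;                        /* hit: skip all K iterations */

    e->out = run_block(in);                   /* miss: execute and record   */
    memcpy(e->in, in, sizeof e->in);
    e->valid = 1;
    return e->out;
}

int main(void)
{
    /* Repeating inputs so fused blocks recur and can actually be reused. */
    double data[16] = { 1, 2, 3, 4,  1, 2, 3, 4,  5, 6, 7, 8,  1, 2, 3, 4 };
    double sum = 0.0;
    for (int i = 0; i < 16; i += K)           /* loop advances K iterations at a time */
        sum += block_with_reuse(&data[i]);
    printf("sum = %f\n", sum);
    return 0;
}
```

Under this reading, coarsening the reusable unit means fewer lookups per loop (lower reuse overhead) and one table entry covering K iterations' worth of work (better use of limited reuse-table capacity), which is consistent with the speedup improvements the abstract reports.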