Speculative DMA for architecturally visible storage in instruction set extensions

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI:10.1145/1450135.1450191

Theo Kluter, P. Brisk, P. Ienne, E. Charbon

{"title":"Speculative DMA for architecturally visible storage in instruction set extensions","authors":"Theo Kluter, P. Brisk, P. Ienne, E. Charbon","doi":"10.1145/1450135.1450191","DOIUrl":null,"url":null,"abstract":"Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expanded to include Architecturally Visible Storage (AVS) - compiler-controlled memories, similar to scratchpads, that are accessible only to ISEs. To achieve a speedup using AVS, Direct Memory Access (DMA) transfers are required to move data from the main memory to the AVS; unfortunately, this creates coherence problems between the AVS and the cache, which previous methods for ISEs with AVS failed to address; additionally, these methods need to leave many conservative DMA transfers in place, whose execution significantly limits the achievable speedup. This paper presents a memory coherence scheme for ISEs with AVS, which can ensure execution correctness and memory consistency with minimal area overhead. We also present a method that speculatively removes redundant DMA transfers. Cycle-accurate experimental results were obtained using an FPGA-emulation platform. These results show that the application-specific instruction-set extended processors with speculative DMA-enhanced AVS gain significantly over previous techniques, despite the overhead of the coherence mechanism.","PeriodicalId":300268,"journal":{"name":"International Conference on Hardware/Software Codesign and System Synthesis","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Hardware/Software Codesign and System Synthesis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1450135.1450191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 18

Abstract

Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expanded to include Architecturally Visible Storage (AVS) - compiler-controlled memories, similar to scratchpads, that are accessible only to ISEs. To achieve a speedup using AVS, Direct Memory Access (DMA) transfers are required to move data from the main memory to the AVS; unfortunately, this creates coherence problems between the AVS and the cache, which previous methods for ISEs with AVS failed to address; additionally, these methods need to leave many conservative DMA transfers in place, whose execution significantly limits the achievable speedup. This paper presents a memory coherence scheme for ISEs with AVS, which can ensure execution correctness and memory consistency with minimal area overhead. We also present a method that speculatively removes redundant DMA transfers. Cycle-accurate experimental results were obtained using an FPGA-emulation platform. These results show that the application-specific instruction-set extended processors with speculative DMA-enhanced AVS gain significantly over previous techniques, despite the overhead of the coherence mechanism.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

指令集扩展中架构可见存储的推测性DMA

指令集扩展(ISEs)可以提高嵌入式处理器的性能。许多生成ISE的算法已经显示出良好的潜力;其中一些最近被扩展到包括架构可见存储(AVS)——编译器控制的存储器，类似于刮擦板，只有ise可以访问。为了使用AVS实现加速，需要直接内存访问(DMA)传输将数据从主存储器移动到AVS;不幸的是，这造成了AVS和缓存之间的一致性问题，这是以前的AVS ise方法未能解决的问题;此外，这些方法需要保留许多保守的DMA传输，这些传输的执行会极大地限制可实现的加速。本文提出了一种具有AVS的ISEs的内存一致性方案，该方案可以在最小的空间开销下保证执行正确性和内存一致性。我们还提出了一种推测性地去除冗余DMA传输的方法。在fpga仿真平台上获得了周期精度的实验结果。这些结果表明，尽管相干机制的开销很大，但具有推测性dma增强AVS的特定应用指令集扩展处理器比以前的技术获得了显着的增益。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

International Conference on Hardware/Software Codesign and System Synthesis

自引率

0.00%

发文量