指令集扩展中架构可见存储的推测性DMA

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI:10.1145/1450135.1450191

Theo Kluter, P. Brisk, P. Ienne, E. Charbon

{"title":"指令集扩展中架构可见存储的推测性DMA","authors":"Theo Kluter, P. Brisk, P. Ienne, E. Charbon","doi":"10.1145/1450135.1450191","DOIUrl":null,"url":null,"abstract":"Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expanded to include Architecturally Visible Storage (AVS) - compiler-controlled memories, similar to scratchpads, that are accessible only to ISEs. To achieve a speedup using AVS, Direct Memory Access (DMA) transfers are required to move data from the main memory to the AVS; unfortunately, this creates coherence problems between the AVS and the cache, which previous methods for ISEs with AVS failed to address; additionally, these methods need to leave many conservative DMA transfers in place, whose execution significantly limits the achievable speedup. This paper presents a memory coherence scheme for ISEs with AVS, which can ensure execution correctness and memory consistency with minimal area overhead. We also present a method that speculatively removes redundant DMA transfers. Cycle-accurate experimental results were obtained using an FPGA-emulation platform. These results show that the application-specific instruction-set extended processors with speculative DMA-enhanced AVS gain significantly over previous techniques, despite the overhead of the coherence mechanism.","PeriodicalId":300268,"journal":{"name":"International Conference on Hardware/Software Codesign and System Synthesis","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Speculative DMA for architecturally visible storage in instruction set extensions\",\"authors\":\"Theo Kluter, P. Brisk, P. Ienne, E. Charbon\",\"doi\":\"10.1145/1450135.1450191\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expanded to include Architecturally Visible Storage (AVS) - compiler-controlled memories, similar to scratchpads, that are accessible only to ISEs. To achieve a speedup using AVS, Direct Memory Access (DMA) transfers are required to move data from the main memory to the AVS; unfortunately, this creates coherence problems between the AVS and the cache, which previous methods for ISEs with AVS failed to address; additionally, these methods need to leave many conservative DMA transfers in place, whose execution significantly limits the achievable speedup. This paper presents a memory coherence scheme for ISEs with AVS, which can ensure execution correctness and memory consistency with minimal area overhead. We also present a method that speculatively removes redundant DMA transfers. Cycle-accurate experimental results were obtained using an FPGA-emulation platform. These results show that the application-specific instruction-set extended processors with speculative DMA-enhanced AVS gain significantly over previous techniques, despite the overhead of the coherence mechanism.\",\"PeriodicalId\":300268,\"journal\":{\"name\":\"International Conference on Hardware/Software Codesign and System Synthesis\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-10-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Hardware/Software Codesign and System Synthesis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1450135.1450191\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Hardware/Software Codesign and System Synthesis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1450135.1450191","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 18

摘要

指令集扩展(ISEs)可以提高嵌入式处理器的性能。许多生成ISE的算法已经显示出良好的潜力;其中一些最近被扩展到包括架构可见存储(AVS)——编译器控制的存储器，类似于刮擦板，只有ise可以访问。为了使用AVS实现加速，需要直接内存访问(DMA)传输将数据从主存储器移动到AVS;不幸的是，这造成了AVS和缓存之间的一致性问题，这是以前的AVS ise方法未能解决的问题;此外，这些方法需要保留许多保守的DMA传输，这些传输的执行会极大地限制可实现的加速。本文提出了一种具有AVS的ISEs的内存一致性方案，该方案可以在最小的空间开销下保证执行正确性和内存一致性。我们还提出了一种推测性地去除冗余DMA传输的方法。在fpga仿真平台上获得了周期精度的实验结果。这些结果表明，尽管相干机制的开销很大，但具有推测性dma增强AVS的特定应用指令集扩展处理器比以前的技术获得了显着的增益。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Speculative DMA for architecturally visible storage in instruction set extensions

Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expanded to include Architecturally Visible Storage (AVS) - compiler-controlled memories, similar to scratchpads, that are accessible only to ISEs. To achieve a speedup using AVS, Direct Memory Access (DMA) transfers are required to move data from the main memory to the AVS; unfortunately, this creates coherence problems between the AVS and the cache, which previous methods for ISEs with AVS failed to address; additionally, these methods need to leave many conservative DMA transfers in place, whose execution significantly limits the achievable speedup. This paper presents a memory coherence scheme for ISEs with AVS, which can ensure execution correctness and memory consistency with minimal area overhead. We also present a method that speculatively removes redundant DMA transfers. Cycle-accurate experimental results were obtained using an FPGA-emulation platform. These results show that the application-specific instruction-set extended processors with speculative DMA-enhanced AVS gain significantly over previous techniques, despite the overhead of the coherence mechanism.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Conference on Hardware/Software Codesign and System Synthesis

自引率

0.00%

发文量