Introduction of local memory elements in instruction set extensions

P. Biswas, Vinay Choudhary, K. Atasu, L. Pozzi, P. Ienne, N. Dutt
{"title":"Introduction of local memory elements in instruction set extensions","authors":"P. Biswas, Vinay Choudhary, K. Atasu, L. Pozzi, P. Ienne, N. Dutt","doi":"10.1145/996566.996765","DOIUrl":null,"url":null,"abstract":"Automatic generation of Instruction Set Extensions (ISEs), to be executed on a custom processing unit or a coprocessor is an important step towards processor customization. A typical goal of a manual designer is to combine a large number of atomic instructions into an ISE satisfying microarchitectural constraints. However, memory operations pose a challenge for previous ISE approaches by limiting the size of the resulting instruction. In this paper, we introduce memory elements into custom units which result in ISEs closer to those sought after by the designers. We consider two kinds of memory elements for mapping to the specialized hardware: small hardware tables and architecturally-visible state registers. We devised a genetic algorithm to specifically exploit opportunities of introducing memory elements during ISE generation. Finally, we demonstrate the effectiveness of our approach by a detailed study of the variation in performance, area and energy in the presence of the generated ISEs, on a number of MediaBench, EEMBC and cryptographic applications. With the introduction of memory, the average speedup varied from 2.7X to 5X depending on the architectural configuration with a nominal area overhead. Moreover, we obtained an average energy reduction of 26% with respect to a 32-KB cache.","PeriodicalId":115059,"journal":{"name":"Proceedings. 41st Design Automation Conference, 2004.","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"70","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. 41st Design Automation Conference, 2004.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/996566.996765","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 70

Abstract

Automatic generation of Instruction Set Extensions (ISEs), to be executed on a custom processing unit or a coprocessor is an important step towards processor customization. A typical goal of a manual designer is to combine a large number of atomic instructions into an ISE satisfying microarchitectural constraints. However, memory operations pose a challenge for previous ISE approaches by limiting the size of the resulting instruction. In this paper, we introduce memory elements into custom units which result in ISEs closer to those sought after by the designers. We consider two kinds of memory elements for mapping to the specialized hardware: small hardware tables and architecturally-visible state registers. We devised a genetic algorithm to specifically exploit opportunities of introducing memory elements during ISE generation. Finally, we demonstrate the effectiveness of our approach by a detailed study of the variation in performance, area and energy in the presence of the generated ISEs, on a number of MediaBench, EEMBC and cryptographic applications. With the introduction of memory, the average speedup varied from 2.7X to 5X depending on the architectural configuration with a nominal area overhead. Moreover, we obtained an average energy reduction of 26% with respect to a 32-KB cache.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在指令集扩展中引入本地内存元素
指令集扩展(ISEs)的自动生成,将在自定义处理单元或协处理器上执行,是处理器自定义的重要一步。手动设计器的典型目标是将大量原子指令组合到满足微架构约束的ISE中。然而,内存操作通过限制结果指令的大小,对以前的ISE方法提出了挑战。在本文中,我们在定制单元中引入了存储元素,从而使ise更接近设计者所追求的目标。我们考虑两种用于映射到专用硬件的内存元素:小型硬件表和体系结构可见的状态寄存器。我们设计了一种遗传算法,专门利用在ISE生成过程中引入存储元素的机会。最后,我们通过在mediabbench、EEMBC和加密应用程序上对生成的ISEs存在的性能、面积和能量变化的详细研究来证明我们方法的有效性。随着内存的引入,平均加速从2.7倍到5倍不等,这取决于具有标称面积开销的体系结构配置。此外,相对于32kb的缓存,我们获得了26%的平均能耗降低。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
STAC: statistical timing analysis with correlation Large-scale placement by grid-warping Security as a new dimension in embedded system design An integrated hardware/software approach for run-time scratchpad management Reliability-driven layout decompaction for electromigration failure avoidance in complex mixed-signal IC designs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1