Exhaustive Enumeration of Legal Custom Instructions for Extensible Processors

N. Pothineni, Anshul Kumar, K. Paul
Published in: 21st International Conference on VLSI Design (VLSID 2008)
Publication date: 2008-01-04
DOI: 10.1109/VLSI.2008.93 (https://doi.org/10.1109/VLSI.2008.93)
Citations: 11

Abstract

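Today's customizable processors allow the designer to augment the base processor with custom accelerators. By choosing an appropriate set of accelerators, the designer can significantly improve the performance and power of an application. Because of the large number of accelerator choices and the complex trade-offs among reuse, gain, and area, manually deciding the optimal combination of accelerators is cumbersome and time-consuming. This calls for CAD tools that select the optimal combination of accelerators by thoroughly searching the entire design space. The term "pattern" is commonly used to represent the computation performed by a custom accelerator. In this paper, we propose an algorithm for rapidly enumerating all legal patterns, taking into account several constraints posed by a typical micro-architecture. The proposed algorithm achieves a significant reduction in run-time by (a) enumerating the patterns in increasing order of size and (b) relating the characteristics of a (k + 1)-node pattern to the characteristics of its k-node subgraphs. In scenarios where I/O is not a bottleneck, the designer can optionally relax the I/O constraint, and the algorithm then efficiently enumerates all legal I/O-unbound patterns. The experimental evidence indicates an order of two run-time speedup over state-of-the-art techniques.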
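To make the enumeration problem concrete, the following is a minimal sketch of a naive baseline, not the algorithm proposed in the paper. It grows connected subgraphs of a data-flow graph one node at a time, so patterns come out in increasing order of size, and it checks each candidate from scratch instead of reusing the characteristics of its k-node subgraphs. The abstract does not list the legality constraints; the ones used here (convexity, no forbidden nodes, and optional limits on inputs and outputs) are assumptions borrowed from common practice in custom-instruction work. The function name `enumerate_patterns`, its parameters, and the example graph are hypothetical.

```python
def enumerate_patterns(succ, forbidden=frozenset(), max_in=None, max_out=None):
    """Yield legal patterns of a DAG as frozensets, smallest first.

    succ maps each node to its successors. Legality (an assumption, see
    lead-in): convex, no forbidden node, at most max_in inputs and
    max_out outputs. Passing None for a limit relaxes that I/O constraint.
    """
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    succ = {v: set(succ.get(v, ())) for v in nodes}
    pred = {v: {u for u in nodes if v in succ[u]} for v in nodes}

    # Transitive descendants and ancestors, needed for the convexity test.
    desc = {}

    def descendants(v):
        if v not in desc:
            desc[v] = set()
            for w in succ[v]:
                desc[v] |= {w} | descendants(w)
        return desc[v]

    anc = {v: set() for v in nodes}
    for u in nodes:
        for w in descendants(u):
            anc[w].add(u)

    def is_legal(p):
        # Convex: no outside node lies on a path between two pattern nodes.
        convex = not any(anc[w] & p and desc[w] & p for w in nodes - p)
        # Inputs: distinct outside producers feeding the pattern.
        n_in = len({u for v in p for u in pred[v]} - p)
        # Outputs: pattern nodes used outside, or with no consumer at all
        # (assumed live-out).
        n_out = sum(1 for v in p if not succ[v] or succ[v] - p)
        return (convex
                and (max_in is None or n_in <= max_in)
                and (max_out is None or n_out <= max_out))

    allowed = nodes - set(forbidden)
    level = {frozenset([v]) for v in allowed}  # all 1-node candidates
    while level:
        for p in sorted(level, key=sorted):    # deterministic order
            if is_legal(p):
                yield p
        # Every connected (k + 1)-node set extends some connected k-node set
        # by one neighbouring node, so this growth step misses nothing.
        level = {p | {v} for p in level for u in p
                 for v in (pred[u] | succ[u]) & allowed if v not in p}


# Hypothetical data-flow graph: x and y feed n1; n1 fans out to n2 and n3,
# which rejoin at n4. x and y are treated as forbidden (e.g. memory loads).
dfg = {"x": ["n1"], "y": ["n1"], "n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"]}
io_bound = list(enumerate_patterns(dfg, forbidden={"x", "y"}, max_in=2, max_out=1))
io_unbound = list(enumerate_patterns(dfg, forbidden={"x", "y"}))
```

In this example the set {n1, n2, n4} is rejected in both runs because the path through n3 leaves the pattern and re-enters it. Note that legality is not monotone in size: {n1, n2} violates the one-output limit, yet its superset {n1, n2, n3, n4} is legal. The sketch therefore has to grow every connected candidate, legal or not, which is the kind of wasted work a faster enumerator would aim to avoid.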