Cyclist: Accelerating hardware development

J. Bachrach, Albert Magyar, D. Dabbelt, Patrick Li, Richard Lin, K. Asanović
Published in: 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017-11-13
DOI: 10.1109/ICCAD.2017.8203892 (https://doi.org/10.1109/ICCAD.2017.8203892)
Citations: 1

Abstract

The end of Dennard scaling has led to an increase in demand for energy-efficient custom hardware accelerators, but current hardware design is slow and laborious, partly because each iteration of the compile-run-debug cycle can take hours or even days with existing simulation and emulation platforms. Cyclist is a new emulation platform designed specifically to shorten the total compile-run-debug cycle. The Cyclist toolflow converts a Chisel RTL design to a parallel dataflow graph, which is then mapped to the Cyclist hardware architecture, consisting of a tiled array of custom parallel emulation engines. Cyclist provides cycle-accurate/bit-accurate RTL emulation at speeds approaching FPGA emulation, but with compile time closer to software simulation. Cyclist provides full visibility and debuggability of the hardware design, including moving forwards and backwards in simulation time while searching for trigger events. The snapshot facility used for debugging is also used to provide a “pay-as-you-go” mapping strategy, which allows emulation to begin execution with a low-effort placement, while higher-quality emulation placements are optimized in the background and swapped into a running emulation. The Cyclist ASIC design requires 0.069 mm² per tile and runs at 2 GHz in a 45 nm CMOS process. Our evaluation demonstrates that Cyclist outperforms FPGA emulation, VCS, and C++ simulation on combined compile and run time for runs of up to a billion cycles on a set of real-world hardware benchmarks.
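
The "combined compile and run time" metric in the abstract can be illustrated with a simple cost model: total time is compile time plus cycle count divided by execution rate. The sketch below is only an illustration of that trade-off. The platform names, compile times, and rates are hypothetical placeholders chosen for the example, not measurements from the paper, and the helper names (`Platform`, `fastest`, `break_even_cycles`) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Platform:
    name: str
    compile_s: float     # hypothetical compile time, in seconds
    cycles_per_s: float  # hypothetical execution rate, in target cycles per second

    def total_time(self, cycles: int) -> float:
        """Combined compile and run time for a run of `cycles` target cycles."""
        return self.compile_s + cycles / self.cycles_per_s


def fastest(platforms: Sequence[Platform], cycles: int) -> Platform:
    """Return the platform with the lowest combined time for this run length."""
    return min(platforms, key=lambda p: p.total_time(cycles))


def break_even_cycles(a: Platform, b: Platform) -> Optional[float]:
    """Cycle count at which a and b have equal combined time (None if rates match)."""
    per_cycle_diff = 1.0 / a.cycles_per_s - 1.0 / b.cycles_per_s
    if per_cycle_diff == 0:
        return None
    return (b.compile_s - a.compile_s) / per_cycle_diff


# Hypothetical numbers, for illustration only (NOT taken from the paper).
PLATFORMS = [
    Platform("software_sim", compile_s=60.0, cycles_per_s=1e4),
    Platform("fast_compile_emulator", compile_s=120.0, cycles_per_s=1e6),
    Platform("fpga", compile_s=3600.0, cycles_per_s=1e7),
]

if __name__ == "__main__":
    for n in (10**5, 10**9, 10**10):
        best = fastest(PLATFORMS, n)
        print(f"{n:>14,d} cycles: {best.name} ({best.total_time(n):.1f} s)")
```

Under these placeholder numbers, short runs favor the platform with the cheapest compile, very long runs favor the platform with the highest rate, and a platform with moderate compile time and near-FPGA speed wins in between. That is the general shape of the argument the abstract makes, though the actual crossover points depend on measured values.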