快速和有效的布局和路由定向高级合成的fpga

Hongbin Zheng, S. Gurumani, K. Rupnow, Deming Chen
{"title":"快速和有效的布局和路由定向高级合成的fpga","authors":"Hongbin Zheng, S. Gurumani, K. Rupnow, Deming Chen","doi":"10.1145/2554688.2554775","DOIUrl":null,"url":null,"abstract":"Achievable frequency (fmax) is a widely used input constraint for designs targeting Field-Programmable Gate Arrays (FPGA), because of its impact on design latency and throughput. Fmax is limited by critical path delay, which is highly influenced by lower-level details of the circuit implementation such as technology mapping, placement and routing. However, for high-level synthesis~(HLS) design flows, it is challenging to evaluate the real critical delay at the behavioral level. Current HLS flows typically use module pre-characterization for delay estimates. However, we will demonstrate that such delay estimates are not sufficient to obtain high fmax and also minimize total execution latency. In this paper, we introduce a new HLS flow that integrates with Altera's Quartus synthesis and fast placement and routing (PAR) tool to obtain realistic post-PAR delay estimates. This integration enables an iterative flow that improves the performance of the design with both behavioral-level and circuit-level optimizations using realistic delay information. We demonstrate our HLS flow produces up to 24% (on average 20%) improvement in fmax and upto 22% (on average 20%) improvement in execution latency. Furthermore, results demonstrate that our flow is able to achieve from 65% to 91% of the theoretical fmax on Stratix IV devices (550MHz).","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"38","resultStr":"{\"title\":\"Fast and effective placement and routing directed high-level synthesis for FPGAs\",\"authors\":\"Hongbin Zheng, S. Gurumani, K. Rupnow, Deming Chen\",\"doi\":\"10.1145/2554688.2554775\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Achievable frequency (fmax) is a widely used input constraint for designs targeting Field-Programmable Gate Arrays (FPGA), because of its impact on design latency and throughput. Fmax is limited by critical path delay, which is highly influenced by lower-level details of the circuit implementation such as technology mapping, placement and routing. However, for high-level synthesis~(HLS) design flows, it is challenging to evaluate the real critical delay at the behavioral level. Current HLS flows typically use module pre-characterization for delay estimates. However, we will demonstrate that such delay estimates are not sufficient to obtain high fmax and also minimize total execution latency. In this paper, we introduce a new HLS flow that integrates with Altera's Quartus synthesis and fast placement and routing (PAR) tool to obtain realistic post-PAR delay estimates. This integration enables an iterative flow that improves the performance of the design with both behavioral-level and circuit-level optimizations using realistic delay information. We demonstrate our HLS flow produces up to 24% (on average 20%) improvement in fmax and upto 22% (on average 20%) improvement in execution latency. Furthermore, results demonstrate that our flow is able to achieve from 65% to 91% of the theoretical fmax on Stratix IV devices (550MHz).\",\"PeriodicalId\":390562,\"journal\":{\"name\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-02-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"38\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2554688.2554775\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2554688.2554775","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 38

摘要

可实现频率(fmax)是一种广泛应用于现场可编程门阵列(FPGA)设计的输入约束,因为它会影响设计延迟和吞吐量。Fmax受关键路径延迟的限制,关键路径延迟受电路实现的低级细节(如技术映射、放置和路由)的高度影响。然而,对于高层次的综合设计流程,在行为层面评估真正的临界延迟是一项挑战。当前的HLS流通常使用模块预表征进行延迟估计。然而,我们将证明这样的延迟估计不足以获得高fmax和最小化总执行延迟。在本文中,我们介绍了一个新的HLS流,它集成了Altera的Quartus合成和快速放置和路由(PAR)工具,以获得现实的PAR后延迟估计。这种集成实现了迭代流程,通过使用实际延迟信息进行行为级和电路级优化,提高了设计的性能。我们证明了我们的HLS流在fmax方面提高了24%(平均20%),在执行延迟方面提高了22%(平均20%)。此外,结果表明,我们的流量能够在Stratix IV器件(550MHz)上达到理论fmax的65%至91%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Fast and effective placement and routing directed high-level synthesis for FPGAs
Achievable frequency (fmax) is a widely used input constraint for designs targeting Field-Programmable Gate Arrays (FPGA), because of its impact on design latency and throughput. Fmax is limited by critical path delay, which is highly influenced by lower-level details of the circuit implementation such as technology mapping, placement and routing. However, for high-level synthesis~(HLS) design flows, it is challenging to evaluate the real critical delay at the behavioral level. Current HLS flows typically use module pre-characterization for delay estimates. However, we will demonstrate that such delay estimates are not sufficient to obtain high fmax and also minimize total execution latency. In this paper, we introduce a new HLS flow that integrates with Altera's Quartus synthesis and fast placement and routing (PAR) tool to obtain realistic post-PAR delay estimates. This integration enables an iterative flow that improves the performance of the design with both behavioral-level and circuit-level optimizations using realistic delay information. We demonstrate our HLS flow produces up to 24% (on average 20%) improvement in fmax and upto 22% (on average 20%) improvement in execution latency. Furthermore, results demonstrate that our flow is able to achieve from 65% to 91% of the theoretical fmax on Stratix IV devices (550MHz).
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Energy-efficient multiplier-less discrete convolver through probabilistic domain transformation Revisiting and-inverter cones Pushing the performance boundary of linear projection designs through device specific optimisations (abstract only) MORP: makespan optimization for processors with an embedded reconfigurable fabric Co-processing with dynamic reconfiguration on heterogeneous MPSoC: practices and design tradeoffs (abstract only)
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1