Operation scheduling and architecture co-synthesis for energy-efficient dataflow computations on FPGAs (abstract only)

C. Y. Lin, N. Wong, Hayden Kwok-Hay So
{"title":"Operation scheduling and architecture co-synthesis for energy-efficient dataflow computations on FPGAs (abstract only)","authors":"C. Y. Lin, N. Wong, Hayden Kwok-Hay So","doi":"10.1145/2145694.2145757","DOIUrl":null,"url":null,"abstract":"Compiling high-level user applications for execution on FPGAs often involves synthesizing dataflow graphs beyond the size of the available on-chip computational resources. One way to address this is by folding the execution of the given dataflow graphs onto an array of directly connected simple configurable processing elements (CPEs). Under this scenario, the performance and energy-efficiency of the resulting system depends not only on the mapping schedule of the compute operations on the CPEs, but also on the topology of the interconnect array that connects the CPEs. This paper presents a framework in which the operation scheduler and the underlying CPE interconnect network topology are co-optimized on a per-application basis for energy-efficient FPGA computation. Given the same application, more than 2.5x difference in energy-efficiency was achievable by the use of different common regular array topologies to connect the CPEs. Moreover, by using irregular application-specific interconnect topologies derived from a genetic algorithm, up to 50% improvement in energy-delay-product was achievable when compared to the use of even the best regular topology. The use of such framework is anticipated to serve as part of a rapid high-level FPGA application compiler since minimum hardware place-and-route is needed to generate the optimal schedule and topology.","PeriodicalId":87257,"journal":{"name":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","volume":"38 8","pages":"270"},"PeriodicalIF":0.0000,"publicationDate":"2012-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2145694.2145757","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Compiling high-level user applications for execution on FPGAs often involves synthesizing dataflow graphs beyond the size of the available on-chip computational resources. One way to address this is by folding the execution of the given dataflow graphs onto an array of directly connected simple configurable processing elements (CPEs). Under this scenario, the performance and energy-efficiency of the resulting system depends not only on the mapping schedule of the compute operations on the CPEs, but also on the topology of the interconnect array that connects the CPEs. This paper presents a framework in which the operation scheduler and the underlying CPE interconnect network topology are co-optimized on a per-application basis for energy-efficient FPGA computation. Given the same application, more than 2.5x difference in energy-efficiency was achievable by the use of different common regular array topologies to connect the CPEs. Moreover, by using irregular application-specific interconnect topologies derived from a genetic algorithm, up to 50% improvement in energy-delay-product was achievable when compared to the use of even the best regular topology. The use of such framework is anticipated to serve as part of a rapid high-level FPGA application compiler since minimum hardware place-and-route is needed to generate the optimal schedule and topology.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
fpga上高能效数据流计算的操作调度与架构协同合成(仅摘要)
编译在fpga上执行的高级用户应用程序通常涉及合成超出可用片上计算资源大小的数据流图。解决这个问题的一种方法是将给定数据流图的执行折叠到直接连接的简单可配置处理元素(cpe)数组中。在这种情况下,最终系统的性能和能效不仅取决于cpe上计算操作的映射调度,还取决于连接cpe的互连阵列的拓扑结构。本文提出了一个框架,在该框架中,操作调度程序和底层CPE互连网络拓扑在每个应用的基础上共同优化,以实现节能的FPGA计算。对于相同的应用程序,通过使用不同的通用规则阵列拓扑来连接cpe,可以实现2.5倍以上的能效差异。此外,通过使用源自遗传算法的不规则应用特定互连拓扑,与使用最好的规则拓扑相比,可以实现高达50%的能量延迟积改进。这种框架的使用预计将作为快速高级FPGA应用程序编译器的一部分,因为生成最佳调度和拓扑需要最少的硬件放置和路由。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022 - 1 March 2022 HBM Connect: High-Performance HLS Interconnect for FPGA HBM. AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs. FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28 - March 2, 2021 FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Seaside, CA, USA, February 23-25, 2020
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1