High-level synthesis of dynamic data structures: A case study using Vivado HLS

F. Winterstein, Samuel Bayliss, G. Constantinides
{"title":"High-level synthesis of dynamic data structures: A case study using Vivado HLS","authors":"F. Winterstein, Samuel Bayliss, G. Constantinides","doi":"10.1109/FPT.2013.6718388","DOIUrl":null,"url":null,"abstract":"High-level synthesis promises a significant shortening of the FPGA design cycle when compared with design entry using register transfer level (RTL) languages. Recent evaluations report that C-to-RTL flows can produce results with a quality close to hand-crafted designs [1]. Algorithms which use dynamic, pointer-based data structures, which are common in software, remain difficult to implement well. In this paper, we describe a comparative case study using Xilinx Vivado HLS as an exemplary state-of-the-art high-level synthesis tool. Our test cases are two alternative algorithms for the same compute-intensive machine learning technique (clustering) with significantly different computational properties. We compare a data-flow centric implementation to a recursive tree traversal implementation which incorporates complex data-dependent control flow and makes use of pointer-linked data structures and dynamic memory allocation. The outcome of this case study is twofold: We confirm similar performance between the hand-written and automatically generated RTL designs for the first test case. The second case reveals a degradation in latency by a factor greater than 30× if the source code is not altered prior to high-level synthesis. We identify the reasons for this shortcoming and present code transformations that narrow the performance gap to a factor of four. We generalise our source-to-source transformations whose automation motivates research directions to improve high-level synthesis of dynamic data structures in the future.","PeriodicalId":344469,"journal":{"name":"2013 International Conference on Field-Programmable Technology (FPT)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"113","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Conference on Field-Programmable Technology (FPT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FPT.2013.6718388","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 113

Abstract

High-level synthesis promises a significant shortening of the FPGA design cycle when compared with design entry using register transfer level (RTL) languages. Recent evaluations report that C-to-RTL flows can produce results with a quality close to hand-crafted designs [1]. Algorithms which use dynamic, pointer-based data structures, which are common in software, remain difficult to implement well. In this paper, we describe a comparative case study using Xilinx Vivado HLS as an exemplary state-of-the-art high-level synthesis tool. Our test cases are two alternative algorithms for the same compute-intensive machine learning technique (clustering) with significantly different computational properties. We compare a data-flow centric implementation to a recursive tree traversal implementation which incorporates complex data-dependent control flow and makes use of pointer-linked data structures and dynamic memory allocation. The outcome of this case study is twofold: We confirm similar performance between the hand-written and automatically generated RTL designs for the first test case. The second case reveals a degradation in latency by a factor greater than 30× if the source code is not altered prior to high-level synthesis. We identify the reasons for this shortcoming and present code transformations that narrow the performance gap to a factor of four. We generalise our source-to-source transformations whose automation motivates research directions to improve high-level synthesis of dynamic data structures in the future.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
动态数据结构的高级综合:使用Vivado HLS的案例研究
与使用寄存器传输级(RTL)语言的设计入口相比,高级综合承诺显著缩短FPGA设计周期。最近的评估报告称,C-to-RTL流程可以产生接近手工设计的质量结果[1]。使用动态的、基于指针的数据结构的算法在软件中很常见,但很难很好地实现。在本文中,我们描述了使用Xilinx Vivado HLS作为示范性的最先进的高级合成工具的比较案例研究。我们的测试用例是同一种计算密集型机器学习技术(聚类)的两种可选算法,它们具有显著不同的计算特性。我们将以数据流为中心的实现与递归树遍历实现进行比较,后者结合了复杂的依赖数据的控制流,并利用指针链接的数据结构和动态内存分配。这个案例研究的结果是双重的:对于第一个测试用例,我们确认了手写和自动生成的RTL设计之间的相似性能。第二种情况表明,如果在高级合成之前未更改源代码,则延迟的降低幅度将大于30倍。我们确定了这个缺点的原因,并给出了将性能差距缩小到四倍的代码转换。我们概括了源到源的转换,这些转换的自动化激励了研究方向,以提高未来动态数据结构的高级综合。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Design and optimization of heterogeneous tree-based FPGA using 3D technology Mobile GPU shader processor based on non-blocking Coarse Grained Reconfigurable Arrays architecture An FPGA-cluster-accelerated match engine for content-based image retrieval A non-intrusive portable fault injection framework to assess reliability of FPGA-based designs Quantum FPGA architecture design
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1