Architectural synthesis of computational pipelines with decoupled memory access

Shaoyi Cheng, J. Wawrzynek
2014 International Conference on Field-Programmable Technology (FPT), pp. 83-90, published 2014-12-01
DOI: 10.1109/FPT.2014.7082758 (https://doi.org/10.1109/FPT.2014.7082758)
Citations: 9

Abstract

As high-level synthesis (HLS) moves toward mainstream adoption among FPGA designers, it has proven to be an effective method for rapid hardware generation. However, when compute-intensive software kernels are offloaded to FPGA accelerators, current HLS tools do not always take full advantage of the hardware platform. In this paper, we present an automatic flow that refactors and restructures processor-centric software implementations, making them better suited to FPGA platforms. The methodology generates pipelines that decouple memory operations and data access from computation. The resulting pipelines achieve much better throughput because they use memory bandwidth efficiently and tolerate data access latency better. The methodology complements existing work in high-level synthesis, easing the creation of heterogeneous systems that combine high-performance accelerators with general-purpose processors. With this approach, for a set of non-regular algorithm kernels written in C, a performance improvement of 3.3x to 9.1x is observed over direct C-to-hardware mapping using a state-of-the-art HLS tool.
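
The idea of decoupling data access from computation can be illustrated with a small software model. The sketch below is not the paper's generated hardware or its tool flow; it is a minimal C illustration, assuming a hypothetical irregular kernel (an indirect gather followed by a multiply-accumulate) and a hypothetical buffer depth `FIFO_DEPTH`. The names `fifo_t`, `gather_dot_coupled`, and `gather_dot_decoupled` are invented for this example. In the decoupled version, an access stage issues loads and runs ahead until the buffers fill, while the compute stage reads only from the buffers and never touches memory directly.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_DEPTH 8 /* hypothetical buffer depth between the two stages */

/* Bounded FIFO carrying loaded operands from the access stage to the compute stage. */
typedef struct {
    double data[FIFO_DEPTH];
    size_t head, tail, count;
} fifo_t;

static int fifo_full(const fifo_t *f)  { return f->count == FIFO_DEPTH; }
static int fifo_empty(const fifo_t *f) { return f->count == 0; }

static void fifo_push(fifo_t *f, double v) {
    assert(!fifo_full(f));
    f->data[f->tail] = v;
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->count++;
}

static double fifo_pop(fifo_t *f) {
    assert(!fifo_empty(f));
    double v = f->data[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return v;
}

/* Processor-centric form: the indirect load and the arithmetic share one loop body,
 * so every iteration waits on its own memory access. */
double gather_dot_coupled(const double *val, const int *idx,
                          const double *x, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += val[i] * x[idx[i]];
    return sum;
}

/* Decoupled form: the access stage fills the FIFOs as far ahead as their depth
 * allows; the compute stage consumes operands only from the FIFOs. */
double gather_dot_decoupled(const double *val, const int *idx,
                            const double *x, size_t n) {
    fifo_t val_q = {0}, x_q = {0};
    size_t issued = 0, consumed = 0;
    double sum = 0.0;

    while (consumed < n) {
        /* Access stage: both queues are filled in lockstep, so checking one suffices. */
        while (issued < n && !fifo_full(&val_q)) {
            fifo_push(&val_q, val[issued]);
            fifo_push(&x_q, x[idx[issued]]);
            issued++;
        }
        /* Compute stage: one multiply-accumulate per step, no memory access. */
        if (!fifo_empty(&val_q)) {
            sum += fifo_pop(&val_q) * fifo_pop(&x_q);
            consumed++;
        }
    }
    return sum;
}
```

Both functions accumulate in the same order, so they return the same value for the same inputs. In this sequential model the two stages merely take turns. In hardware they would run concurrently, which is where the tolerance to access latency described in the abstract would come from.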