MPack: global memory optimization for stream applications in high-level synthesis

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays Pub Date : 2014-02-26 DOI:10.1145/2554688.2554761

Jasmina Vasiljevic, P. Chow

{"title":"MPack: global memory optimization for stream applications in high-level synthesis","authors":"Jasmina Vasiljevic, P. Chow","doi":"10.1145/2554688.2554761","DOIUrl":null,"url":null,"abstract":"One of the challenges in designing high-performance FPGA applications is fine-tuning the use of limited on-chip memory storage among many buffers in an application. To achieve desired performance the designer faces the burden of packaging such buffers into on-chip memories and manually optimizing the utilization of each memory and the throughput of each buffer. In addition, the application memories may not match the word width or depth of the physical on-chip memories available on the FPGA. This process is time consuming and non-trivial, particularly with a large number of buffers of various depths and bit widths. We propose a tool, MPack, which globally optimizes on-chip memory use across all buffers for stream applications. The goal is to speed up development time by providing rapid design space exploration and relieving the designer of lengthy low-level iterations. We introduce new high-level pragmas allowing the user to specify global memory requirements, such as an application's on-chip memory budget and data throughput. We allow the user to quickly generate a large number of memory solutions and explore the trade-off between memory usage and achievable throughput. To demonstrate the effectiveness of our tool, we apply the new high-level pragmas to an image processing benchmark. MPack effectively explores the design space and is able to produce a large number of memory solutions ranging from 10 to 100% in throughput, and from 12 to 100% in on-chip memory usage.","PeriodicalId":390562,"journal":{"name":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","volume":"110 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2554688.2554761","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

Abstract

One of the challenges in designing high-performance FPGA applications is fine-tuning the use of limited on-chip memory storage among many buffers in an application. To achieve desired performance the designer faces the burden of packaging such buffers into on-chip memories and manually optimizing the utilization of each memory and the throughput of each buffer. In addition, the application memories may not match the word width or depth of the physical on-chip memories available on the FPGA. This process is time consuming and non-trivial, particularly with a large number of buffers of various depths and bit widths. We propose a tool, MPack, which globally optimizes on-chip memory use across all buffers for stream applications. The goal is to speed up development time by providing rapid design space exploration and relieving the designer of lengthy low-level iterations. We introduce new high-level pragmas allowing the user to specify global memory requirements, such as an application's on-chip memory budget and data throughput. We allow the user to quickly generate a large number of memory solutions and explore the trade-off between memory usage and achievable throughput. To demonstrate the effectiveness of our tool, we apply the new high-level pragmas to an image processing benchmark. MPack effectively explores the design space and is able to produce a large number of memory solutions ranging from 10 to 100% in throughput, and from 12 to 100% in on-chip memory usage.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

MPack:高级合成流应用的全局内存优化

设计高性能FPGA应用程序的挑战之一是在应用程序中的许多缓冲区中微调有限的片上存储器存储的使用。为了达到期望的性能，设计人员面临着将这些缓冲区封装到片上存储器中并手动优化每个存储器的利用率和每个缓冲区的吞吐量的负担。此外，应用程序存储器可能与FPGA上可用的物理片上存储器的字宽或深度不匹配。这个过程非常耗时，而且非常重要，特别是有大量不同深度和位宽度的缓冲区时。我们提出了一个工具，MPack，它可以全局优化流应用程序中所有缓冲区的片上内存使用。其目标是通过提供快速的设计空间探索和减轻设计人员冗长的低级迭代来加快开发时间。我们引入了新的高级编程，允许用户指定全局内存需求，例如应用程序的片上内存预算和数据吞吐量。我们允许用户快速生成大量内存解决方案，并探索内存使用和可实现吞吐量之间的权衡。为了证明我们的工具的有效性，我们将新的高级实用程序应用于图像处理基准。MPack有效地探索了设计空间，能够产生大量的内存解决方案，从10%到100%的吞吐量，从12%到100%的片上内存使用率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

自引率

0.00%

发文量