A New Approach to Automatic Memory Banking using Trace-Based Address Mining

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays Pub Date : 2017-02-22 DOI:10.1145/3020078.3021734

Yuan Zhou, Khalid Al-Hawaj, Zhiru Zhang

{"title":"A New Approach to Automatic Memory Banking using Trace-Based Address Mining","authors":"Yuan Zhou, Khalid Al-Hawaj, Zhiru Zhang","doi":"10.1145/3020078.3021734","DOIUrl":null,"url":null,"abstract":"Recent years have seen an increased deployment of FPGAs as programmable accelerators for improving the performance and energy efficiency of compute-intensive applications. A well-known \"secret sauce\" of achieving highly efficient FPGA acceleration is to create application-specific memory architecture that fully exploits the vast amounts of on-chip memory bandwidth provided by the reconfigurable fabric. In particular, memory banking is widely employed when multiple parallel memory accesses are needed to meet a demanding throughput constraint. In this paper we propose TraceBanking, a novel and flexible trace-driven address mining algorithm that can automatically generate efficient memory banking schemes by analyzing a stream of memory address bits. Unlike mainstream memory partitioning techniques that are based on static compile-time analysis, TraceBanking only relies on simple source-level instrumentation to provide the memory trace of interest without enforcing any coding restrictions. More importantly, our technique can effectively handle memory traces that exhibit either affine or non-affine access patterns, and produce efficient banking solutions with a reasonable runtime. Furthermore, TraceBanking can be used to process a reduced memory trace with the aid of an SMT prover to verify if the resulting banking scheme is indeed conflict free. Our experiments on Xilinx FPGAs show that TraceBanking achieves competitive performance and resource usage compared to the state-of-the-art across a set of real-life benchmarks with affine memory accesses. We also perform a case study on a face detection algorithm to show that TraceBanking is capable of generating a highly area-efficient memory partitioning based on a sequence of addresses without any obvious access patterns.","PeriodicalId":252039,"journal":{"name":"Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3020078.3021734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 28

Abstract

Recent years have seen an increased deployment of FPGAs as programmable accelerators for improving the performance and energy efficiency of compute-intensive applications. A well-known "secret sauce" of achieving highly efficient FPGA acceleration is to create application-specific memory architecture that fully exploits the vast amounts of on-chip memory bandwidth provided by the reconfigurable fabric. In particular, memory banking is widely employed when multiple parallel memory accesses are needed to meet a demanding throughput constraint. In this paper we propose TraceBanking, a novel and flexible trace-driven address mining algorithm that can automatically generate efficient memory banking schemes by analyzing a stream of memory address bits. Unlike mainstream memory partitioning techniques that are based on static compile-time analysis, TraceBanking only relies on simple source-level instrumentation to provide the memory trace of interest without enforcing any coding restrictions. More importantly, our technique can effectively handle memory traces that exhibit either affine or non-affine access patterns, and produce efficient banking solutions with a reasonable runtime. Furthermore, TraceBanking can be used to process a reduced memory trace with the aid of an SMT prover to verify if the resulting banking scheme is indeed conflict free. Our experiments on Xilinx FPGAs show that TraceBanking achieves competitive performance and resource usage compared to the state-of-the-art across a set of real-life benchmarks with affine memory accesses. We also perform a case study on a face detection algorithm to show that TraceBanking is capable of generating a highly area-efficient memory partitioning based on a sequence of addresses without any obvious access patterns.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于跟踪地址挖掘的自动内存存储新方法

近年来，fpga作为可编程加速器的部署越来越多，用于提高计算密集型应用的性能和能源效率。实现高效FPGA加速的一个众所周知的“秘诀”是创建特定于应用程序的内存架构，充分利用可重构结构提供的大量片上内存带宽。特别是，当需要多个并行内存访问以满足苛刻的吞吐量约束时，内存银行被广泛使用。在本文中，我们提出了一种新颖而灵活的跟踪驱动地址挖掘算法TraceBanking，它可以通过分析内存地址位流来自动生成高效的内存银行方案。与基于静态编译时分析的主流内存分区技术不同，TraceBanking仅依赖于简单的源代码级插装来提供感兴趣的内存跟踪，而不强制任何编码限制。更重要的是，我们的技术可以有效地处理显示仿射或非仿射访问模式的内存跟踪，并产生具有合理运行时的高效银行解决方案。此外，可以使用TraceBanking在SMT证明程序的帮助下处理减少的内存跟踪，以验证生成的银行方案是否确实没有冲突。我们在Xilinx fpga上的实验表明，与一组具有仿射内存访问的实际基准测试相比，TraceBanking实现了具有竞争力的性能和资源使用。我们还对人脸检测算法进行了案例研究，以表明TraceBanking能够基于地址序列生成高度区域高效的内存分区，而无需任何明显的访问模式。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

自引率

0.00%

发文量