Traversal caches: a first step towards FPGA acceleration of pointer-based data structures

International Conference on Hardware/Software Codesign and System Synthesis. Pub Date: 2008-10-19. DOI: 10.1145/1450135.1450150

G. Stitt, Gaurav Chaudhari, J. Coole

Citations: 15

Abstract

Field-programmable gate arrays (FPGAs) often achieve order-of-magnitude speedups compared to microprocessors, but typically have been unable to improve the performance of applications with irregular memory access patterns, such as traversals of pointer-based data structures. Because these data structures are so common, the applicability and widespread success of FPGAs have been limited. In this paper, we introduce the traversal cache framework, a first step towards improving the performance of FPGA applications that utilize pointer-based data structures. The traversal cache is a local FPGA memory that stores repeated traversals of pointer-based data structures, allowing these traversals to be streamed efficiently into the FPGA. Although the cache is generally limited to improving applications that exhibit repeated traversals, we show that many applications in fact have this characteristic. Furthermore, we show that few repetitions are needed to achieve performance improvements. We present experimental results showing that FPGA implementations using the traversal cache framework achieve speedups ranging from 7x to 29x compared to pointer-based software on a 3.2 GHz Xeon.
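The idea of storing a repeated traversal so that later repetitions can be streamed may be easier to see in a small software analogy. The sketch below is not the authors' hardware design; it only illustrates the principle under stated assumptions. The names `Node`, `build_list`, `TraversalCache`, `pointer_hops`, and `invalidate` are hypothetical, a traversal is assumed to be identified by its starting node, and the structure is assumed not to change between repetitions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One element of a singly linked list (a pointer-based structure)."""
    value: int
    next: Optional["Node"] = None


def build_list(values: List[int]) -> Optional[Node]:
    """Build a linked list holding `values` in order."""
    head: Optional[Node] = None
    for v in reversed(values):
        head = Node(v, head)
    return head


class TraversalCache:
    """Hypothetical software stand-in for a local memory of recorded traversals."""

    def __init__(self) -> None:
        self._store: Dict[int, List[int]] = {}
        self.pointer_hops = 0  # counts irregular, pointer-chasing accesses

    def traverse(self, head: Optional[Node]) -> List[int]:
        # Assumption: a traversal is identified by its start node, which the
        # caller keeps alive for as long as the cache entry is in use.
        key = id(head)
        if key not in self._store:
            # First traversal: follow the pointers once and record the values
            # contiguously, so repetitions need no further pointer chasing.
            values: List[int] = []
            node = head
            while node is not None:
                values.append(node.value)
                self.pointer_hops += 1
                node = node.next
            self._store[key] = values
        # Repeated traversal: a sequential read of the recorded buffer.
        return self._store[key]

    def invalidate(self, head: Optional[Node]) -> None:
        """Drop a recorded traversal, e.g. after the list has been modified."""
        self._store.pop(id(head), None)


head = build_list([3, 1, 4, 1, 5])
cache = TraversalCache()
totals = [sum(cache.traverse(head)) for _ in range(4)]
```

In this sketch the four traversals cost only one pass of pointer chasing; the remaining three read a contiguous buffer, which is the kind of access pattern that can be streamed. It also suggests why only a few repetitions may be needed for a benefit: the cost of recording is paid once, on the first traversal.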

Source journal

International Conference on Hardware/Software Codesign and System Synthesis

Self-citation rate

0.00%

Number of publications