具有索引访问的流寄存器文件

10th International Symposium on High Performance Computer Architecture (HPCA'04) Pub Date : 2004-02-14 DOI:10.1109/HPCA.2004.10007

N. Jayasena, M. Erez, Jung Ho Ahn, W. Dally

{"title":"具有索引访问的流寄存器文件","authors":"N. Jayasena, M. Erez, Jung Ho Ahn, W. Dally","doi":"10.1109/HPCA.2004.10007","DOIUrl":null,"url":null,"abstract":"Many current programmable architecture designed to exploit data parallelism require computation to be structured to operate on sequentially accessed vectors or streams of data. Applications with less regular data access patterns perform sub-optimally on such architectures. We present a register file for streams (SRF) that allows arbitrary, indexed accesses. Compared to sequential SRF access, indexed access captures more temporal locality, reduces data replication in the SRF, and provides efficient support for certain types of complex access patterns. Our simulations show that indexed SRF access provides speedups of 1.03x to 4.1x and memory bandwidth reductions of up to 95% over sequential SRF access for a set of benchmarks representative of data-parallel applications with irregular accesses. Indexed SRF access also provides greater speedups than caches for a number of application classes despite significantly lower hardware costs. The area overhead of our indexed SRF implementation is 11%-22% over a sequentially accessed SRF, which corresponds to a modest 1.5%-3% increase in the total die area of a typical stream processor.","PeriodicalId":145009,"journal":{"name":"10th International Symposium on High Performance Computer Architecture (HPCA'04)","volume":"451 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"53","resultStr":"{\"title\":\"Stream register files with indexed access\",\"authors\":\"N. Jayasena, M. Erez, Jung Ho Ahn, W. Dally\",\"doi\":\"10.1109/HPCA.2004.10007\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Many current programmable architecture designed to exploit data parallelism require computation to be structured to operate on sequentially accessed vectors or streams of data. Applications with less regular data access patterns perform sub-optimally on such architectures. We present a register file for streams (SRF) that allows arbitrary, indexed accesses. Compared to sequential SRF access, indexed access captures more temporal locality, reduces data replication in the SRF, and provides efficient support for certain types of complex access patterns. Our simulations show that indexed SRF access provides speedups of 1.03x to 4.1x and memory bandwidth reductions of up to 95% over sequential SRF access for a set of benchmarks representative of data-parallel applications with irregular accesses. Indexed SRF access also provides greater speedups than caches for a number of application classes despite significantly lower hardware costs. The area overhead of our indexed SRF implementation is 11%-22% over a sequentially accessed SRF, which corresponds to a modest 1.5%-3% increase in the total die area of a typical stream processor.\",\"PeriodicalId\":145009,\"journal\":{\"name\":\"10th International Symposium on High Performance Computer Architecture (HPCA'04)\",\"volume\":\"451 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2004-02-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"53\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"10th International Symposium on High Performance Computer Architecture (HPCA'04)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2004.10007\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"10th International Symposium on High Performance Computer Architecture (HPCA'04)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2004.10007","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 53

摘要

当前许多设计用于开发数据并行性的可编程架构都要求将计算结构化，以便对顺序访问的向量或数据流进行操作。数据访问模式不太规则的应用程序在这种体系结构上的性能不是最优的。我们提出了一个允许任意索引访问的流寄存器文件(SRF)。与顺序SRF访问相比，索引访问捕获更多的时间局部性，减少SRF中的数据复制，并为某些类型的复杂访问模式提供有效支持。我们的模拟表明，对于具有不规则访问的数据并行应用程序的一组基准测试，与顺序SRF访问相比，索引SRF访问提供了1.93倍到4.1倍的速度和高达95%的内存带宽减少。对于许多应用程序类，尽管硬件成本显著降低，但与缓存相比，索引SRF访问也提供了更高的速度。与顺序访问的SRF相比，我们的索引SRF实现的面积开销是11%-22%，这相当于典型流处理器的总模具面积增加了1.5%-3%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Stream register files with indexed access

Many current programmable architecture designed to exploit data parallelism require computation to be structured to operate on sequentially accessed vectors or streams of data. Applications with less regular data access patterns perform sub-optimally on such architectures. We present a register file for streams (SRF) that allows arbitrary, indexed accesses. Compared to sequential SRF access, indexed access captures more temporal locality, reduces data replication in the SRF, and provides efficient support for certain types of complex access patterns. Our simulations show that indexed SRF access provides speedups of 1.03x to 4.1x and memory bandwidth reductions of up to 95% over sequential SRF access for a set of benchmarks representative of data-parallel applications with irregular accesses. Indexed SRF access also provides greater speedups than caches for a number of application classes despite significantly lower hardware costs. The area overhead of our indexed SRF implementation is 11%-22% over a sequentially accessed SRF, which corresponds to a modest 1.5%-3% increase in the total die area of a typical stream processor.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

10th International Symposium on High Performance Computer Architecture (HPCA'04)

自引率

0.00%

发文量