ReSpar: Reordering Algorithm for ReRAM-based Sparse Matrix-Vector Multiplication Accelerator

Yi-Jou Hsiao, Chin-Fu Nien, Hsiang-Yun Cheng
{"title":"ReSpar: Reordering Algorithm for ReRAM-based Sparse Matrix-Vector Multiplication Accelerator","authors":"Yi-Jou Hsiao, Chin-Fu Nien, Hsiang-Yun Cheng","doi":"10.1109/ICCD53106.2021.00050","DOIUrl":null,"url":null,"abstract":"Sparse matrix-vector multiplication (SpMV) serves as a crucial operation for several key application domains, such as graph analytics and scientific computing, in the era of big data. The performance of SpMV is bounded by the data transmissions across memory channels in conventional von Neumann systems. Emerging metal-oxide resistive random access memory (ReRAM) has shown its potential to address this memory wall challenge through performing SpMV directly within its crossbar arrays. However, due to the tightly coupled crossbar structure, it is unlikely to skip all redundant data loading and computations with zero-valued entries of the sparse matrix in such ReRAM-based processing-in-memory architecture. These unnecessary ReRAM writes and computations hurt the energy efficiency. As only the crossbar-sized sub-matrices with full-zero entries can be skipped, prior studies have proposed some matrix reordering methods to aggregate non-zero entries to few crossbar arrays, such that more full-zero crossbar arrays can be skipped. Nevertheless, the effectiveness of prior reordering methods is constrained by the original ordering of matrix rows. In this paper, we show that the amount of full-zero sub-matrices derived by these prior studies are less than a theoretical lower bound in some cases, indicating that there are still rooms for improvement. Hence, we propose a novel reordering algorithm, ReSpar, that aims to aggregate matrix rows with similar non-zero column entries together and concentrates the non-zeros columns to increase the zero-skipping opportunities. 
Results show that ReSpar achieves 1.68× and 1.37× more energy savings, while reducing the required number of crossbar loads by 40.4% and 27.2% on average.","PeriodicalId":154014,"journal":{"name":"2021 IEEE 39th International Conference on Computer Design (ICCD)","volume":"104 9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 39th International Conference on Computer Design (ICCD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCD53106.2021.00050","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Sparse matrix-vector multiplication (SpMV) serves as a crucial operation for several key application domains, such as graph analytics and scientific computing, in the era of big data. The performance of SpMV is bounded by data transmissions across memory channels in conventional von Neumann systems. Emerging metal-oxide resistive random access memory (ReRAM) has shown its potential to address this memory wall challenge by performing SpMV directly within its crossbar arrays. However, due to the tightly coupled crossbar structure, it is infeasible to skip all redundant data loading and computations involving zero-valued entries of the sparse matrix in such a ReRAM-based processing-in-memory architecture. These unnecessary ReRAM writes and computations hurt energy efficiency. As only crossbar-sized sub-matrices whose entries are all zero can be skipped, prior studies have proposed matrix reordering methods that aggregate non-zero entries into few crossbar arrays, so that more all-zero crossbar arrays can be skipped. Nevertheless, the effectiveness of prior reordering methods is constrained by the original ordering of matrix rows. In this paper, we show that the number of all-zero sub-matrices derived by these prior studies is less than a theoretical lower bound in some cases, indicating that there is still room for improvement. Hence, we propose a novel reordering algorithm, ReSpar, which aggregates matrix rows with similar non-zero column entries and concentrates the non-zero columns to increase the zero-skipping opportunities. Results show that ReSpar achieves 1.68× and 1.37× more energy savings, while reducing the required number of crossbar loads by 40.4% and 27.2% on average.
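The core idea in the abstract, reordering matrix rows so that non-zeros cluster into a few crossbar-sized blocks while whole all-zero blocks can be skipped, can be illustrated with a toy sketch. The greedy pattern-similarity heuristic below is a generic illustration, not the actual ReSpar algorithm; the block size, the example matrix, and all function names are invented for this sketch:

```python
import numpy as np

def count_zero_blocks(mat, bs):
    """Count crossbar-sized (bs x bs) sub-matrices that are entirely zero
    and could therefore be skipped by a ReRAM crossbar accelerator."""
    rows, cols = mat.shape
    count = 0
    for i in range(0, rows, bs):
        for j in range(0, cols, bs):
            if not mat[i:i + bs, j:j + bs].any():
                count += 1
    return count

def greedy_row_reorder(mat):
    """Greedily chain rows by shared non-zero column patterns
    (a simple similarity heuristic, NOT the ReSpar algorithm itself)."""
    patterns = [set(np.flatnonzero(row)) for row in mat]
    remaining = list(range(mat.shape[0]))
    order = [remaining.pop(0)]
    while remaining:
        last = patterns[order[-1]]
        # Pick the row sharing the most non-zero columns with the last one.
        best = max(remaining, key=lambda r: len(last & patterns[r]))
        remaining.remove(best)
        order.append(best)
    return mat[order]

# Toy sparse matrix: rows 0 and 2 share columns {0,1}; rows 1 and 3 share {2,3}.
m = np.zeros((4, 4), dtype=int)
m[0, :2] = 1; m[2, :2] = 1
m[1, 2:] = 1; m[3, 2:] = 1

before = count_zero_blocks(m, 2)                     # interleaved rows: 0 skippable blocks
after = count_zero_blocks(greedy_row_reorder(m), 2)  # grouped rows: 2 skippable blocks
print(before, after)
```

With the rows interleaved, every 2×2 block contains a non-zero, so nothing can be skipped; after grouping similar rows, two of the four blocks become all-zero and their crossbar loads and computations can be elided, which is the effect the reordering aims to maximize.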