A Reconfigurable Computing-in-Memory Accelerator With Dynamic Group-Based Dataflow and Dual-Input Macro Designs

Impact factor: 4.9 | Zone 2, Engineering & Technology | JCR Q2, Engineering, Electrical & Electronic | IEEE Transactions on Circuits and Systems II: Express Briefs | Pub Date: 2024-08-13 | DOI: 10.1109/TCSII.2024.3442873

Pufan Xu;Xing Mou;Bin Gao;Qiumeng Wei;Peng Yao;Jianshi Tang;He Qian;Huaqiang Wu

Volume 71, Issue 12, Pages 4849-4853. Full text: https://ieeexplore.ieee.org/document/10634884/

Citations: 0

Abstract

Non-volatile memory-based computing-in-memory (nvCIM) is a promising candidate for accelerating deep neural networks (DNNs) at the edge. However, current nvCIMs adopt fully-pipelined (FP) or layer-serial (LS) dataflows for all DNN layers, suffering poor area and energy efficiency for the layer-wise-varied workloads. Furthermore, their fixed macro structure results in resource under-utilization, as it is unable to adapt to varying weight sizes. To address these issues, this brief proposes a reconfigurable nvCIM with dynamic dataflow. First, it contains a dynamic inter-pipelined-intra-serial (IPIS) dataflow with group partition mechanism, adapting to the diverse workloads for high area and energy efficiency. Second, it has a dual-input block-reconfigurable (DIBR) macro structure, allowing finer granularity input selection to improve macro utilization and achieve input data reuse. When applied to four well-known networks, the proposed design attains $2.27\sim 11.92\times$ area efficiency gains and $2.21\sim 14.43\times$ energy efficiency gains over nvCIM baselines.
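The under-utilization that the abstract attributes to a fixed macro structure can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the macro size (256 rows by 64 columns), the three layer shapes, and the helper name `macro_utilization`. The sketch only counts how many cells of fixed-size macros hold weights; it does not model the DIBR macro or the IPIS dataflow.

```python
import math


def macro_utilization(rows_needed, cols_needed, macro_rows, macro_cols):
    """Fraction of allocated macro cells that actually store weights.

    A weight matrix of rows_needed x cols_needed is tiled onto whole
    macros of fixed size macro_rows x macro_cols (hypothetical model).
    """
    macros_down = math.ceil(rows_needed / macro_rows)
    macros_across = math.ceil(cols_needed / macro_cols)
    allocated_cells = macros_down * macros_across * macro_rows * macro_cols
    return rows_needed * cols_needed / allocated_cells


# Assumed fixed macro size (illustrative only, not from the paper).
MACRO_ROWS, MACRO_COLS = 256, 64

# Hypothetical layers: (name, unrolled input rows, output columns).
layers = [
    ("conv3x3, 16 -> 32 channels", 3 * 3 * 16, 32),
    ("conv3x3, 64 -> 64 channels", 3 * 3 * 64, 64),
    ("fc, 512 -> 10", 512, 10),
]

for name, rows, cols in layers:
    util = macro_utilization(rows, cols, MACRO_ROWS, MACRO_COLS)
    print(f"{name}: {util:.1%} of allocated cells used")
```

Under these assumed shapes, utilization differs widely from layer to layer, which is the kind of mismatch a reconfigurable macro is meant to reduce.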


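Source journal: IEEE Transactions on Circuits and Systems II: Express Briefs (Engineering & Technology; Engineering, Electrical & Electronic)

CiteScore: 7.90

Self-citation rate: 20.50%

Articles published: 883

Review time: 3.0 months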

Journal description: TCAS II publishes brief papers on the theory, analysis, design, and practical implementation of circuits, and on the application of circuit techniques to systems and to signal processing. Coverage spans the whole spectrum from basic scientific theory to industrial applications. Fields of interest include: analog, digital, and mixed-signal circuits and systems; nonlinear circuits and systems; integrated sensors, MEMS, and systems on chip; nanoscale circuits and systems; optoelectronic circuits and systems; power electronics and systems; software for analog-and-logic circuits and systems; and control aspects of circuits and systems.
