A Novel Transpose 2T-DRAM based Computing-in-Memory Architecture for On-chip DNN Training and Inference

Yuansheng Zhao, Zixuan Shen, Jiarui Xu, K. Chai, Yanqing Wu, Chao Wang
{"title":"A Novel Transpose 2T-DRAM based Computing-in-Memory Architecture for On-chip DNN Training and Inference","authors":"Yuansheng Zhao, Zixuan Shen, Jiarui Xu, K. Chai, Yanqing Wu, Chao Wang","doi":"10.1109/AICAS57966.2023.10168641","DOIUrl":null,"url":null,"abstract":"Recently, DRAM-based Computing-in-Memory (CIM) has emerged as one of the potential CIM solutions due to its unique advantages of high bit-cell density, large memory capacity and CMOS compatibility. This paper proposes a 2T-DRAM based CIM architecture, which can perform both CIM inference and training for deep neural networks (DNNs) efficiently. The proposed CIM architecture employs 2T-DRAM based transpose circuitry to implement transpose weight memory array and uses digital logic in the array peripheral to implement digital DNN computation in memory. A novel mapping method is proposed to map the convolutional and full-connection computation of the forward propagation and back propagation process into the transpose 2T-DRAM CIM array to achieve digital weight multiplexing and parallel computing. Simulation results show that the computing power of proposed transpose 2T-DRAM based CIM architecture is estimated to 11.26 GOPS by a 16K DRAM array to accelerate 4CONV+3FC @100 MHz and has an 82.15% accuracy on CIFAR-10 dataset, which are much higher than the state-of-the-art DRAM-based CIM accelerators without CIM learning capability. Preliminary evaluation of retention time in DRAM CIM also shows that a refresh-less training-inference process of lightweight networks can be realized by a suitable scale of CIM array through the proposed mapping strategy with negligible refresh-induced performance loss or power increase.","PeriodicalId":296649,"journal":{"name":"2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICAS57966.2023.10168641","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recently, DRAM-based Computing-in-Memory (CIM) has emerged as a promising CIM solution owing to its unique advantages of high bit-cell density, large memory capacity, and CMOS compatibility. This paper proposes a 2T-DRAM-based CIM architecture that can efficiently perform both CIM inference and training for deep neural networks (DNNs). The proposed architecture employs 2T-DRAM-based transpose circuitry to implement a transpose weight-memory array and uses digital logic in the array periphery to carry out digital DNN computation in memory. A novel mapping method maps the convolutional and fully-connected computations of the forward- and back-propagation passes onto the transpose 2T-DRAM CIM array, achieving digital weight multiplexing and parallel computing. Simulation results show that the computing power of the proposed transpose 2T-DRAM-based CIM architecture is estimated at 11.26 GOPS for a 16K DRAM array accelerating a 4CONV+3FC network at 100 MHz, with 82.15% accuracy on the CIFAR-10 dataset; both figures are much higher than those of state-of-the-art DRAM-based CIM accelerators that lack CIM learning capability. A preliminary evaluation of retention time in the DRAM CIM also shows that a refresh-free training-and-inference process for lightweight networks can be realized with a suitably sized CIM array through the proposed mapping strategy, with negligible refresh-induced performance loss or power increase.
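
The following is a minimal sketch (not from the paper) of why a transpose weight array allows forward and backward passes to reuse the same stored weights: an FC layer's forward pass computes y = W·x, while back-propagation of the input error needs Wᵀ·δy, so an array readable along both rows and columns avoids keeping a second, transposed weight copy. The function names and integer bit-widths below are illustrative assumptions, not the paper's actual bit-serial circuit behavior.

```python
# Illustrative sketch of "digital weight multiplexing" on a transpose weight array.
# Assumed names: cim_fc_forward, cim_fc_backward; weight/activation widths are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

# Quantized integer weights, as a digital CIM array would store them bit by bit.
W = rng.integers(-8, 8, size=(4, 6))   # 4 outputs x 6 inputs, e.g. 4-bit signed weights
x = rng.integers(0, 2, size=6)         # binarized / bit-streamed input activations
dy = rng.integers(-4, 4, size=4)       # error term arriving from the next layer

def cim_fc_forward(W, x):
    # Row-wise read of the array: each row accumulates one output neuron.
    return W @ x

def cim_fc_backward(W, dy):
    # Column-wise (transposed) read of the *same* physical array for back-propagation.
    return W.T @ dy

y = cim_fc_forward(W, x)
dx = cim_fc_backward(W, dy)

# Sanity check: the transposed read matches an explicit software transpose.
assert np.array_equal(dx, W.T @ dy)
print("forward y :", y)
print("backward dx:", dx)
```

In hardware, the same sharing applies to the convolutional layers once they are unrolled onto the array by the paper's mapping method; the sketch only captures the matrix-level idea, not the 2T-DRAM bitcell or peripheral logic.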