Energy-Efficient In-SRAM Accumulation for CMOS-based CNN Accelerators

Wanqian Li, Yinhe Han, Xiaoming Chen
Published in: Proceedings of the Great Lakes Symposium on VLSI 2022
Publication date: 2022-06-06
DOI: 10.1145/3526241.3530319 (https://doi.org/10.1145/3526241.3530319)
Citations: 0

Abstract

State-of-the-art convolutional neural network (CNN) accelerators are typically communication-dominated architectures. To reduce the energy consumption of data accesses while maintaining high performance, researchers have adopted large amounts of on-chip register resources and proposed various methods to concentrate communication on on-chip register accesses. As a result, on-chip register accesses have become the energy bottleneck. To further reduce energy consumption, in this work we propose an in-SRAM accumulation architecture that replaces the conventional register files and digital accumulators in the processing elements of CNN accelerators. Compared with existing in-SRAM computing approaches (which may not be targeted at CNN accelerators), the presented in-SRAM computing architecture not only realizes in-memory accumulation, but also solves the structure contention problem that occurs frequently when embedding in-memory architectures into CNN accelerators. HSPICE simulation results based on a 45 nm technology demonstrate that, with the proposed in-SRAM accumulator, the overall energy efficiency of a state-of-the-art communication-optimal CNN accelerator is increased by 29% on average.
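To make the register-access bottleneck concrete, the following is a toy Python sketch that is not taken from the paper. It models a conventional processing element in which partial sums live in a register file and a separate digital adder accumulates into them, and it simply counts register-file reads and writes. The class name `ConventionalPE`, the function `conv1d_on_pe`, the 1-D convolution mapping, and the access counters are all assumptions made for illustration; the sketch models no circuit behavior and no energy numbers.

```python
from dataclasses import dataclass, field


@dataclass
class ConventionalPE:
    """Hypothetical PE: partial sums in a register file, digital accumulation."""
    num_psums: int
    regfile: list = field(init=False)
    reads: int = 0
    writes: int = 0

    def __post_init__(self):
        # One register-file entry per partial sum held by this PE.
        self.regfile = [0] * self.num_psums

    def mac(self, idx, activation, weight):
        psum = self.regfile[idx]        # register-file read
        self.reads += 1
        psum += activation * weight     # multiply, then digital accumulate
        self.regfile[idx] = psum        # register-file write-back
        self.writes += 1


def conv1d_on_pe(inputs, weights):
    """Map a 1-D convolution (no padding, stride 1) onto a single PE."""
    out_len = len(inputs) - len(weights) + 1
    pe = ConventionalPE(num_psums=out_len)
    for o in range(out_len):
        for k, w in enumerate(weights):
            pe.mac(o, inputs[o + k], w)
    return pe


pe = conv1d_on_pe([1, 2, 3, 4, 5], [1, 0, -1])
print(pe.regfile, pe.reads, pe.writes)  # [-2, -2, -2] 9 9
```

In this toy model every multiply-accumulate costs one register-file read and one write, so three outputs with three filter taps each produce nine reads and nine writes. An accumulator that updates the stored partial sum in place, which is the role the abstract assigns to the in-SRAM accumulator, would target exactly this read-modify-write traffic.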