Analog Weights in ReRAM DNN Accelerators

J. Eshraghian, S. Kang, Seungbum Baek, G. Orchard, H. Iu, W. Lei
Published in: 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2019-03-01
DOI: 10.1109/AICAS.2019.8771550 (https://doi.org/10.1109/AICAS.2019.8771550)
Citations: 33

Abstract

Artificial neural networks have become ubiquitous in modern life, which has triggered the emergence of a new class of application-specific integrated circuits for their acceleration. ReRAM-based accelerators have gained significant traction due to their ability to leverage in-memory computation. In a crossbar structure, they can perform multiply-and-accumulate operations more efficiently than standard CMOS logic. Because they are resistive switches, ReRAM devices can reliably store only one of two states. This severely limits the range of values in a computational kernel. This paper presents a novel scheme for alleviating the single-bit-per-device restriction by exploiting the frequency dependence of v-i plane hysteresis, assigning kernel information not only to the device conductance but also partially to the frequency of a time-varying input. We show that this approach reduces average power consumption for a single crossbar convolution by up to a factor of 16 for an unsigned 8-bit input image, where each convolutional process consumes 1.1 mW in the worst case, and reduces area by a factor of 8, without reducing accuracy to the level of binarized neural networks. This represents a large saving in computing cost when many in-situ multiply-and-accumulate processes occur simultaneously across different crossbars.
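To make the single-bit-per-device restriction concrete, the following is a minimal sketch of an idealized crossbar multiply-and-accumulate in which each device holds only an on or off conductance, so an unsigned 8-bit weight is spread across eight binary devices and recombined by shift-and-add. This is the conventional bit-sliced baseline, not the frequency-based scheme proposed in the paper. The conductance values, input voltages, kernel weights, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: conductances, inputs, and helper names are
# assumptions, not values or methods taken from the paper.

G_ON = 1e-4   # assumed low-resistance-state conductance (siemens)
G_OFF = 1e-6  # assumed high-resistance-state conductance (siemens)


def crossbar_mac(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G[i][j]."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def bit_slice(weight, n_bits=8):
    """Split an unsigned integer weight into n_bits binary states, LSB first."""
    return [(weight >> b) & 1 for b in range(n_bits)]


def weights_to_conductances(weights, n_bits=8):
    """One row per input; each weight occupies n_bits binary devices."""
    return [
        [G_ON if bit else G_OFF for bit in bit_slice(w, n_bits)]
        for w in weights
    ]


def recombine(column_currents, total_voltage):
    """Shift-and-add the per-bit column currents into one weighted sum."""
    acc = 0.0
    for b, current in enumerate(column_currents):
        # Subtract the off-state leakage, then normalise by the on/off step.
        bit_sum = (current - G_OFF * total_voltage) / (G_ON - G_OFF)
        acc += bit_sum * (1 << b)
    return acc


inputs = [0.2, 0.5, 0.1]   # hypothetical read voltages (V)
kernel = [3, 200, 17]      # hypothetical unsigned 8-bit weights

G = weights_to_conductances(kernel)
currents = crossbar_mac(inputs, G)
result = recombine(currents, sum(inputs))
expected = sum(v * w for v, w in zip(inputs, kernel))
```

In this baseline every 8-bit weight costs eight devices and eight column reads, which is the kind of per-weight overhead that a scheme storing more than one bit of kernel information per device would aim to reduce.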