Error Resilient In-Memory Computing Architecture for CNN Inference on the Edge

Marco Rios, Flavio Ponzina, Giovanni Ansaloni, Alexandre Levisse, David Atienza Alonso
{"title":"Error Resilient In-Memory Computing Architecture for CNN Inference on the Edge","authors":"M. Rios, Flavio Ponzina, G. Ansaloni, A. Levisse, David Atienza Alonso","doi":"10.1145/3526241.3530351","DOIUrl":null,"url":null,"abstract":"The growing popularity of edge computing has fostered the development of diverse solutions to support Artificial Intelligence (AI) in energy-constrained devices. Nonetheless, comparatively few efforts have focused on the resiliency exhibited by AI workloads (such as Convolutional Neural Networks, CNNs) as an avenue towards increasing their run-time efficiency, and even fewer have proposed strategies to increase such resiliency. We herein address this challenge in the context of Bit-line Computing architectures, an embodiment of the in-memory computing paradigm tailored towards CNN applications. We show that little additional hardware is required to add highly effective error detection and mitigation in such platforms. In turn, our proposed scheme can cope with high error rates when performing memory accesses with no impact on CNNs accuracy, allowing for very aggressive voltage scaling. Complementary, we also show that CNN resiliency can be increased by algorithmic optimizations in addition to architectural ones, adopting a combined ensembling and pruning strategy that increases robustness while not inflating workload requirements. Experiments on different quantized CNN models reveal that our combined hardware/software approach enables the supply voltage to be reduced to just 650mV, decreasing the energy per inference up to 51.3%, without affecting the baseline CNN classification accuracy.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"69 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Great Lakes Symposium on VLSI 2022","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3526241.3530351","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

The growing popularity of edge computing has fostered the development of diverse solutions to support Artificial Intelligence (AI) in energy-constrained devices. Nonetheless, comparatively few efforts have focused on the resiliency exhibited by AI workloads (such as Convolutional Neural Networks, CNNs) as an avenue towards increasing their run-time efficiency, and even fewer have proposed strategies to increase such resiliency. We herein address this challenge in the context of Bit-line Computing architectures, an embodiment of the in-memory computing paradigm tailored towards CNN applications. We show that little additional hardware is required to add highly effective error detection and mitigation to such platforms. In turn, our proposed scheme can cope with high error rates when performing memory accesses, with no impact on CNN accuracy, allowing for very aggressive voltage scaling. Complementarily, we also show that CNN resiliency can be increased by algorithmic optimizations in addition to architectural ones, adopting a combined ensembling and pruning strategy that increases robustness without inflating workload requirements. Experiments on different quantized CNN models reveal that our combined hardware/software approach enables the supply voltage to be reduced to just 650 mV, decreasing the energy per inference by up to 51.3% without affecting the baseline CNN classification accuracy.
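To make the ensembling-and-pruning idea concrete, below is a minimal PyTorch sketch, not the paper's implementation: several heavily pruned copies of a base CNN average their logits, so that bit errors injected into a single member's weights (a crude stand-in for low-voltage memory faults) are diluted at the ensemble output. The model TinyCNN, the helpers make_pruned_ensemble and inject_bit_errors, and the chosen pruning ratio and bit-error rate are all illustrative assumptions; the bit-line computing hardware and its error detection/mitigation logic described in the abstract are not modeled here.

```python
# Hedged sketch of the combined ensembling-and-pruning strategy.
# All names and parameters below are illustrative, not from the paper.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class TinyCNN(nn.Module):
    """Illustrative stand-in for the quantized CNNs evaluated in the paper."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def make_pruned_ensemble(base: nn.Module, n_members: int = 4, amount: float = 0.75):
    """Clone the base model and apply unstructured magnitude pruning.
    In the paper's approach each member would be trained; plain clones are
    used here only to keep the sketch short."""
    members = []
    for _ in range(n_members):
        m = copy.deepcopy(base)
        for module in m.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                prune.l1_unstructured(module, name="weight", amount=amount)
                prune.remove(module, "weight")  # bake the pruning mask in
        members.append(m)
    return members

def inject_bit_errors(model: nn.Module, ber: float = 1e-3):
    """Emulate low-voltage memory faults: flip the sign of a random subset
    of weights with probability `ber` (a crude proxy for bit flips)."""
    with torch.no_grad():
        for p in model.parameters():
            flips = torch.rand_like(p) < ber
            p[flips] = -p[flips]

@torch.no_grad()
def ensemble_predict(members, x):
    """Average the members' logits; errors confined to one member are diluted."""
    return torch.stack([m(x) for m in members]).mean(dim=0).argmax(dim=1)

if __name__ == "__main__":
    base = TinyCNN().eval()
    ensemble = make_pruned_ensemble(base)
    inject_bit_errors(ensemble[0], ber=1e-3)  # fault a single member only
    x = torch.randn(8, 3, 32, 32)
    print(ensemble_predict(ensemble, x))
```

The pruning ratio is chosen so that the ensemble does not inflate the workload, matching the abstract's claim: with n_members = 4 and amount = 0.75, the four members together hold roughly as many nonzero weights as the single baseline model.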