Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays
Abinash Mohanty, Xiaocong Du, Pai-Yu Chen, Jae-sun Seo, Shimeng Yu, Yu Cao
{"title":"随机稀疏自适应对不精确的多级随机随机存储器阵列进行精确推断","authors":"Abinash Mohanty, Xiaocong Du, Pai-Yu Chen, Jae-sun Seo, Shimeng Yu, Yu Cao","doi":"10.1109/IEDM.2017.8268339","DOIUrl":null,"url":null,"abstract":"An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed to a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as variations, quantization error, and stuck-at faults. A conventional solution involves multiple read-verify-write (R-V-W) for each RRAM cell, costing a long time because of the slow Write speed and cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover the accuracy: in CNNs for MNIST and CIFAR-10, ∼5% of model parameters is sufficient for RSA even under excessive RRAM variations. As the back-propagation in training is only applied to RSA cells and there is no need of any Write operation on RRAM, the proposed RSA achieves 10–100X acceleration compared to R-V-W. Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.","PeriodicalId":412333,"journal":{"name":"2017 IEEE International Electron Devices Meeting (IEDM)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"Random sparse adaptation for accurate inference with inaccurate multi-level RRAM arrays\",\"authors\":\"Abinash Mohanty, Xiaocong Du, Pai-Yu Chen, Jae-sun Seo, Shimeng Yu, Yu Cao\",\"doi\":\"10.1109/IEDM.2017.8268339\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed to a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as variations, quantization error, and stuck-at faults. A conventional solution involves multiple read-verify-write (R-V-W) for each RRAM cell, costing a long time because of the slow Write speed and cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover the accuracy: in CNNs for MNIST and CIFAR-10, ∼5% of model parameters is sufficient for RSA even under excessive RRAM variations. As the back-propagation in training is only applied to RSA cells and there is no need of any Write operation on RRAM, the proposed RSA achieves 10–100X acceleration compared to R-V-W. 
Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.\",\"PeriodicalId\":412333,\"journal\":{\"name\":\"2017 IEEE International Electron Devices Meeting (IEDM)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Electron Devices Meeting (IEDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IEDM.2017.8268339\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Electron Devices Meeting (IEDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IEDM.2017.8268339","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An array of multi-level resistive memory devices (RRAMs) can speed up the computation of deep learning algorithms. However, when a pre-trained model is programmed onto a real RRAM array for inference, its accuracy degrades due to many non-idealities, such as device variations, quantization error, and stuck-at faults. A conventional solution applies multiple read-verify-write (R-V-W) cycles to each RRAM cell, which takes a long time because of the slow Write speed and the cell-by-cell compensation. In this work, we propose a fundamentally new approach to overcome this issue: random sparse adaptation (RSA) after the model is transferred to the RRAM array. By randomly selecting a small portion of the model parameters and mapping them to on-chip memory for further training, we demonstrate an efficient and fast method to recover accuracy: in CNNs for MNIST and CIFAR-10, ∼5% of the model parameters are sufficient for RSA even under excessive RRAM variations. As back-propagation in training is applied only to the RSA cells and no Write operation on the RRAM is needed, the proposed RSA achieves 10–100X acceleration compared to R-V-W. Therefore, this hybrid solution with a large, inaccurate RRAM array and a small, accurate on-chip memory array promises both area efficiency and inference accuracy.
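The abstract outlines RSA only at a high level. The following is a minimal NumPy sketch of the idea, not the authors' implementation: the full weight matrix stays frozen on the inaccurate RRAM array, while a randomly chosen ~5% of the positions are mirrored in accurate on-chip memory and are the only values updated by back-propagation. The layer shape, the Gaussian noise model for RRAM variations, the squared-error loss, and the learning rate are all assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Pre-trained (ideal) weights, and the inaccurate copy programmed on RRAM;
# device variations are modeled here as additive Gaussian noise (an assumption for illustration).
W_ideal = rng.standard_normal((64, 32))
W_rram = W_ideal + 0.1 * rng.standard_normal(W_ideal.shape)  # frozen; never rewritten

# Randomly select ~5% of the weight positions and mirror them in accurate
# on-chip memory; only these cells are trained during adaptation.
mask = rng.random(W_ideal.shape) < 0.05
W_onchip = np.where(mask, W_rram, 0.0)  # initialized from the transferred values

def forward(x):
    # Effective weight: RRAM values everywhere, except at the RSA positions,
    # where the accurate on-chip values override them.
    W_eff = np.where(mask, W_onchip, W_rram)
    return x @ W_eff.T

# One adaptation step on a calibration batch. The squared-error loss against the
# ideal model's outputs is a stand-in training signal, not the paper's setup.
x = rng.standard_normal((16, 32))
y_target = x @ W_ideal.T
y = forward(x)
grad_W = (y - y_target).T @ x / len(x)         # dLoss/dW for the squared-error loss
W_onchip -= 0.1 * np.where(mask, grad_W, 0.0)  # update only the RSA cells; RRAM untouched

Because the adaptation step writes only to W_onchip, the RRAM array is never reprogrammed; this absence of Write operations is what the abstract credits for the 10–100X speedup over R-V-W.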