An Empirical Fault Vulnerability Exploration of ReRAM-Based Process-in-Memory CNN Accelerators

IF 5 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | IEEE Transactions on Reliability | Pub Date: 2024-06-06 | DOI: 10.1109/TR.2024.3405825

Aniseh Dorostkar;Hamed Farbeh;Hamid R. Zarandi

IEEE Transactions on Reliability, vol. 74, no. 1, pp. 2290-2304. https://ieeexplore.ieee.org/document/10551492/

Citations: 0

Abstract

Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) accelerators are a promising platform for parallel processing of the massively memory-intensive matrix-vector multiplications of neural networks, owing to their analog computation capability, ultra-high density, near-zero leakage current, and nonvolatility. Despite these advantages, ReRAM-based accelerators are highly error-prone because fabrication limitations lead to process variations and defects. These limitations degrade the accuracy of deep convolutional neural networks (CNNs) running on PIM accelerators. Although these CNN accelerators are widely deployed in safety-critical systems, their vulnerability to faults is not well explored. In this article, we develop a fault-injection framework to investigate the vulnerability of large-scale CNNs at both the software and hardware levels of the inference phase. Faulty ReRAM devices pose a further reliability challenge, because they significantly degrade classification accuracy when CNN parameters are mapped to the accelerators. To investigate this challenge, we map the CNN learning parameters to the ReRAM crossbar and inject faults into the crossbar arrays. The proposed framework analyzes the impact of stuck-at high (SaH) and stuck-at low (SaL) fault models on different layers and locations of CNN learning parameters. Through extensive fault injections, we show that the vulnerability of ReRAM-based PIM accelerators for CNNs is highly sensitive to the type and depth of layers, the location of the learning parameter within each layer, and the value and type of faults. Our observations show that different models have different vulnerabilities to faults. Specifically, we show that SaL faults reduce classification accuracy more than SaH faults.
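To make the fault-injection idea in the abstract concrete, the following is a minimal sketch and not the authors' framework. It assumes quantized weights are stored as integer cell levels in a 2-D crossbar grid, that an SaH fault pins a cell to the highest level, and that an SaL fault pins it to level 0; this reading of "high" and "low" is an assumption, since the abstract does not define the convention. The function names, the 4-bit level range, and the example values are all hypothetical.

```python
import random


def inject_stuck_at(levels, fault_rate, fault_type, max_level, rng):
    """Return a copy of a 2-D grid of cell levels with stuck-at faults injected.

    Assumed convention (not from the paper): "SaH" pins a faulty cell to
    max_level, "SaL" pins it to 0. Each cell fails independently with
    probability fault_rate.
    """
    if fault_type not in ("SaH", "SaL"):
        raise ValueError("fault_type must be 'SaH' or 'SaL'")
    stuck_value = max_level if fault_type == "SaH" else 0
    faulty = [row[:] for row in levels]  # leave the clean grid untouched
    for row in faulty:
        for c in range(len(row)):
            if rng.random() < fault_rate:
                row[c] = stuck_value
    return faulty


def crossbar_matvec(levels, inputs):
    """Sum input * cell level down each column, one output per column.

    Rows play the role of word lines (inputs) and columns the role of
    bit lines (outputs) in this idealized model.
    """
    n_rows, n_cols = len(levels), len(levels[0])
    return [sum(inputs[r] * levels[r][c] for r in range(n_rows))
            for c in range(n_cols)]


# Hypothetical 4x3 crossbar of 4-bit cell levels (0..15) and an input vector.
clean = [[3, 12, 7], [0, 15, 9], [5, 1, 14], [8, 6, 2]]
inputs = [1, 0, 2, 1]

print("clean", crossbar_matvec(clean, inputs))
for fault_type in ("SaH", "SaL"):
    faulty = inject_stuck_at(clean, 0.25, fault_type, 15, random.Random(0))
    print(fault_type, crossbar_matvec(faulty, inputs))
```

Comparing the faulty column sums against the clean ones gives a per-output error that could be aggregated per layer or per parameter location. A real study would instead measure end-to-end classification accuracy, as the abstract describes.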


Source journal

IEEE Transactions on Reliability (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

12.20

Self-citation rate

8.50%

Articles published

153

Review time

7.5 months

Journal description: IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.
