Soft Error Resilient Deep Learning Systems Using Neuron Gradient Statistics

2022 IEEE 28th International Symposium on On-Line Testing and Robust System Design (IOLTS). Pub Date: 2022-09-12. DOI: 10.1109/IOLTS56730.2022.9897815

C. Amarnath, Mohamed Mejri, Kwondo Ma, A. Chatterjee


Citations: 6

Abstract

Deep learning techniques have been widely adopted in daily life, with applications ranging from face recognition to recommender systems. The substantial overhead of conventional error tolerance techniques precludes their widespread use, while approaches based on median filtering and invariant generation rely on alterations to DNN training that may be difficult to achieve for larger networks on larger datasets. To address this issue, this paper presents a novel approach that uses the statistics of neuron output gradients to identify and suppress erroneous neuron values. Using the statistics of neurons' gradients with respect to their neighbors yields tighter statistical thresholds than using neuron output values alone. The approach is modular and is combined with accurate, low-overhead error detection methods so that it is invoked only when needed, further reducing its cost. Deep learning models can be trained using standard methods, and the error correction module is then fit to the trained DNN. It achieves comparable or superior performance relative to baseline error correction methods at comparable hardware overhead, without modifying DNN training or requiring specialized hardware architectures.
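The abstract does not give the exact statistics or suppression rule, so the following is only a minimal sketch of the general idea under stated assumptions. The "gradient" is taken to be the difference between adjacent neuron outputs in one layer, thresholds are mean ± k standard deviations fit on fault-free data, and a flagged neuron is suppressed by zeroing it. The function names `fit_gradient_stats` and `suppress_outliers`, the parameter `k`, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_gradient_stats(clean_acts):
    """Per-edge mean/std of differences between adjacent neurons.

    clean_acts: (num_samples, num_neurons) fault-free outputs of one layer.
    """
    grads = np.diff(clean_acts, axis=1)  # shape (num_samples, num_neurons - 1)
    return grads.mean(axis=0), grads.std(axis=0)


def suppress_outliers(acts, mean, std, k=5.0):
    """Zero each interior neuron whose differences to BOTH neighbours are outliers."""
    acts = np.asarray(acts, dtype=float)
    bad_edge = np.abs(np.diff(acts) - mean) > k * std
    # Neuron i sits between edge i-1 (left) and edge i (right). Border neurons
    # have only one edge and are left unchecked in this sketch (an assumption).
    left = np.concatenate(([False], bad_edge))
    right = np.concatenate((bad_edge, [False]))
    suspect = left & right
    return np.where(suspect, 0.0, acts), suspect


rng = np.random.default_rng(0)
# Synthetic calibration data (assumed): a large offset shared by all neurons
# plus small per-neuron noise, so neighbouring outputs are strongly correlated.
clean = rng.normal(0.0, 5.0, size=(1000, 1)) + rng.normal(0.0, 0.1, size=(1000, 16))
mean, std = fit_gradient_stats(clean)

sample = 1.0 + rng.normal(0.0, 0.1, size=16)
faulty = sample.copy()
faulty[5] += 3.0  # hypothetical soft error injected into neuron 5

fixed, mask = suppress_outliers(faulty, mean, std)
```

In this constructed data the per-neuron output spread is dominated by the shared offset, so a value-only threshold at the same `k` would accept the corrupted neuron, while the much narrower spread of neighbor differences exposes it. Real layer outputs need not behave this way; the data here was chosen only to illustrate why gradient statistics can give tighter thresholds.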



