Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization

IF 3.6 | Tier 2, Computer Science | Q2, Computer Science, Hardware & Architecture | IEEE Transactions on Computers | Pub Date: 2024-03-09 | DOI: 10.1109/TC.2024.3398492
Jing Wang;Jinbin Zhu;Xin Fu;Di Zang;Keyao Li;Weigong Zhang
{"title":"提高神经网络可靠性:神经元漏洞量化的硬件/软件合作启示","authors":"Jing Wang;Jinbin Zhu;Xin Fu;Di Zang;Keyao Li;Weigong Zhang","doi":"10.1109/TC.2024.3398492","DOIUrl":null,"url":null,"abstract":"Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural networks, where individual neurons exhibit varying degrees of fault tolerance, by thoroughly exploring the structural attributes of DNNs. We thereby develop a hardware/software collaborative method that guarantees the reliability of DNNs while minimizing performance degradation. We introduce the neuron vulnerability factor (NVF) to quantify the susceptibility to soft errors. We propose two efficient methods that leverage the NVF to minimize the negative effects of soft errors on neurons. First, we present a novel computational scheduling scheme. By prioritizing error-prone neurons, the expedited completion of their computations is facilitated to mitigate the risk of neural computing errors that arise from soft errors without sacrificing efficiency. Second, we propose the NVF-guided heterogeneous memory system. We employ variable-strength error-correcting codes and tailor their error-correction mechanisms to the vulnerability profile of specific neurons to ensure a highly targeted approach for error mitigation. Our experimental results demonstrate that the proposed scheme enhances the neural network accuracy by 18% on average, while significantly reducing the fault-tolerance overhead.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"1953-1966"},"PeriodicalIF":3.6000,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization\",\"authors\":\"Jing Wang;Jinbin Zhu;Xin Fu;Di Zang;Keyao Li;Weigong Zhang\",\"doi\":\"10.1109/TC.2024.3398492\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural networks, where individual neurons exhibit varying degrees of fault tolerance, by thoroughly exploring the structural attributes of DNNs. We thereby develop a hardware/software collaborative method that guarantees the reliability of DNNs while minimizing performance degradation. We introduce the neuron vulnerability factor (NVF) to quantify the susceptibility to soft errors. We propose two efficient methods that leverage the NVF to minimize the negative effects of soft errors on neurons. First, we present a novel computational scheduling scheme. By prioritizing error-prone neurons, the expedited completion of their computations is facilitated to mitigate the risk of neural computing errors that arise from soft errors without sacrificing efficiency. Second, we propose the NVF-guided heterogeneous memory system. 
We employ variable-strength error-correcting codes and tailor their error-correction mechanisms to the vulnerability profile of specific neurons to ensure a highly targeted approach for error mitigation. Our experimental results demonstrate that the proposed scheme enhances the neural network accuracy by 18% on average, while significantly reducing the fault-tolerance overhead.\",\"PeriodicalId\":13087,\"journal\":{\"name\":\"IEEE Transactions on Computers\",\"volume\":\"73 8\",\"pages\":\"1953-1966\"},\"PeriodicalIF\":3.6000,\"publicationDate\":\"2024-03-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computers\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10527392/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10527392/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural networks, where individual neurons exhibit varying degrees of fault tolerance, by thoroughly exploring the structural attributes of DNNs. We thereby develop a hardware/software collaborative method that guarantees the reliability of DNNs while minimizing performance degradation. We introduce the neuron vulnerability factor (NVF) to quantify the susceptibility to soft errors. We propose two efficient methods that leverage the NVF to minimize the negative effects of soft errors on neurons. First, we present a novel computational scheduling scheme. By prioritizing error-prone neurons, the expedited completion of their computations is facilitated to mitigate the risk of neural computing errors that arise from soft errors without sacrificing efficiency. Second, we propose the NVF-guided heterogeneous memory system. We employ variable-strength error-correcting codes and tailor their error-correction mechanisms to the vulnerability profile of specific neurons to ensure a highly targeted approach for error mitigation. Our experimental results demonstrate that the proposed scheme enhances the neural network accuracy by 18% on average, while significantly reducing the fault-tolerance overhead.
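The abstract only outlines the two NVF-guided mechanisms; the paper's concrete NVF formula and policies are not reproduced on this page. The sketch below is a minimal, hypothetical illustration of how a per-neuron vulnerability score could drive both compute ordering and ECC-tier assignment. The score definition (mean absolute activation), the quantile thresholds, and the tier names ("SEC-DED", "parity", "none") are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the NVF definition and the scheduling/ECC policies
# below are hypothetical stand-ins, not the method described in the paper.
import numpy as np


def neuron_vulnerability_factor(activations: np.ndarray) -> np.ndarray:
    """Hypothetical per-neuron vulnerability score.

    Uses the mean absolute activation over a calibration batch, on the
    assumption that neurons with larger activations propagate bit flips
    more strongly. The paper's actual NVF definition may differ.
    """
    return np.abs(activations).mean(axis=0)


def schedule_by_nvf(nvf: np.ndarray) -> np.ndarray:
    """Order neuron computations so the most vulnerable ones finish first,
    shrinking the window in which a soft error can corrupt their results."""
    return np.argsort(-nvf)  # highest NVF first


def assign_ecc_strength(nvf: np.ndarray, quantiles=(0.8, 0.5)) -> list[str]:
    """Map each neuron to an ECC tier of a heterogeneous memory system:
    stronger codes for highly vulnerable neurons, weaker or no ECC otherwise."""
    hi, lo = np.quantile(nvf, quantiles[0]), np.quantile(nvf, quantiles[1])
    return ["SEC-DED" if v >= hi else "parity" if v >= lo else "none" for v in nvf]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(256, 16))  # calibration activations: batch x neurons
    nvf = neuron_vulnerability_factor(acts)
    print("compute order:", schedule_by_nvf(nvf)[:5])
    print("ECC tiers:", assign_ecc_strength(nvf)[:5])
```

In the paper, the first mechanism is realized in the computational scheduler and the second in an NVF-guided heterogeneous memory system with variable-strength error-correcting codes; the sketch only shows the general shape of an NVF-to-policy mapping.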
Source Journal
IEEE Transactions on Computers (Engineering & Technology, Engineering: Electronic & Electrical)
CiteScore: 6.60
Self-citation rate: 5.40%
Articles per year: 199
Review time: 6.0 months
Journal introduction: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.
Latest articles in this journal
CUSPX: Efficient GPU Implementations of Post-Quantum Signature SPHINCS+
Chiplet-Gym: Optimizing Chiplet-based AI Accelerator Design with Reinforcement Learning
FLALM: A Flexible Low Area-Latency Montgomery Modular Multiplication on FPGA
Novel Lagrange Multipliers-Driven Adaptive Offloading for Vehicular Edge Computing
Leveraging GPU in Homomorphic Encryption: Framework Design and Analysis of BFV Variants