Comparative study: AutoDPR-SEM for enhancing CNN reliability in SRAM-based FPGAs through autonomous reconfiguration

Microelectronics Reliability (IF 1.6; CAS Zone 4, Engineering & Technology; JCR Q3, Engineering, Electrical & Electronic) Pub Date: 2024-04-20 DOI: 10.1016/j.microrel.2024.115392
Haonan Tian, Younis Ibrahim, Rui Chen, Yixiu Wang, Chen Jin, George Belev, Li Chen
Article URL: https://www.sciencedirect.com/science/article/pii/S0026271424000726

Abstract

Convolutional neural networks (CNNs) are widely adopted in safety-critical systems, including space applications and autonomous vehicles. SRAM-based field-programmable gate arrays (FPGAs) are preferred for accelerating CNN computations due to their unique characteristics. However, the configuration memory of these FPGAs is susceptible to single event effects (SEEs), which can corrupt computations and lead to misclassified CNN outputs. In this study, we investigated the impact of SEEs on SRAM-based FPGAs using Two-Photon Absorption (TPA) laser fault injection, through a comparative analysis of two popular CNN acceleration architectures: streaming architecture (SA) and single compute engine (SCE). Experimental results show that SA-based CNNs require more hardware resources but exhibit superior resilience against single event upsets (SEUs). Without any Radiation Hardened by Design (RHBD) protection, SCE has an error rate approximately twice as high as SA. To mitigate errors, the Xilinx Soft Error Mitigation (SEM) IP core is used for error detection and correction, reducing the error rate by up to 50% in both architectures. Importantly, we propose the AutoDPR-SEM (Autonomous Dynamic Partial Reconfiguration for Soft Error Mitigation) approach, which automatically reconfigures the SEM IP core when it remains idle due to uncorrectable errors. AutoDPR-SEM significantly improves CNN error rates, reducing errors by a factor of approximately 17.8 in SCE and 14.8 in SA. We also applied software-level simulation to validate the TPA experiment; it showed similar trends in the test results across all models. In conclusion, the study confirms the feasibility of AutoDPR-SEM in both architectures, showcasing its potential to improve CNN error rates in safety-critical systems.
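The recovery idea described in the abstract (reconfigure the SEM core when it sits idle after an uncorrectable error) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract gives no interface details, so `SemModel`, `AutoDprWatchdog`, `run`, and the two-state model are all hypothetical names and simplifications, and the "partial reconfiguration" is reduced to resetting a state flag.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SemState(Enum):
    OBSERVING = auto()  # SEM is scanning and able to correct upsets
    IDLE = auto()       # SEM has halted after an uncorrectable error


@dataclass
class SemModel:
    """Toy stand-in for the SEM IP core (hypothetical, not the Xilinx interface)."""
    state: SemState = SemState.OBSERVING
    corrected: int = 0

    def handle_upset(self, correctable: bool) -> bool:
        """Return True if the upset was repaired."""
        if self.state is SemState.IDLE:
            return False  # a halted SEM no longer repairs anything
        if correctable:
            self.corrected += 1
            return True
        self.state = SemState.IDLE  # uncorrectable error: SEM stops working
        return False


class AutoDprWatchdog:
    """Hypothetical supervisor that reloads the SEM region when SEM goes idle."""

    def __init__(self, sem: SemModel) -> None:
        self.sem = sem
        self.reconfigurations = 0

    def poll(self) -> None:
        if self.sem.state is SemState.IDLE:
            # Assumption: partial reconfiguration returns SEM to a working state.
            self.sem.state = SemState.OBSERVING
            self.reconfigurations += 1


def run(upsets: list[bool], use_watchdog: bool) -> int:
    """Count upsets left unrepaired; each entry says whether the upset is correctable."""
    sem = SemModel()
    watchdog = AutoDprWatchdog(sem) if use_watchdog else None
    unrepaired = 0
    for correctable in upsets:
        if not sem.handle_upset(correctable):
            unrepaired += 1
        if watchdog is not None:
            watchdog.poll()
    return unrepaired


if __name__ == "__main__":
    upsets = [True, False, True, True, False, True]
    print("SEM only:     ", run(upsets, use_watchdog=False))
    print("SEM + watchdog:", run(upsets, use_watchdog=True))
```

In this toy model, a single uncorrectable upset leaves every later upset unrepaired when SEM runs alone, whereas the watchdog limits the damage to the uncorrectable upsets themselves. The numbers it prints are illustrative only and are unrelated to the error rates measured in the study.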
