{"title":"On the Dependability of Bidirectional Encoder Representations from Transformers (BERT) to Soft Errors","authors":"Zhen Gao;Ziye Yin;Jingyan Wang;Rui Su;Jie Deng;Qiang Liu;Pedro Reviriego;Shanshan Liu;Fabrizio Lombardi","doi":"10.1109/TNANO.2025.3531721","DOIUrl":null,"url":null,"abstract":"Transformers are widely used in natural language processing and computer vision, and Bidirectional Encoder Representations from Transformers (BERT) is one of the most popular pre-trained transformer models for many applications. This paper studies the dependability and impact of soft errors on BERT implemented with different floating-point formats using two case studies: sentence emotion classification and question answering. Simulation by error injection is conducted to assess the impact of errors on different parts of the BERT model and different bits of the parameters. The analysis of the results leads to the following findings: 1) in both single and half precision, there is a Critical Bit (CB) on which errors significantly affect the performance of the model; 2) in single precision, errors on the CB may cause overflow in many cases, which leads to a fixed result regardless of the input; 3) in half precision, the errors do not cause overflow but they may still introduce a large accuracy loss. In general, the impact of errors is significantly larger in single-precision than half-precision parameters. Error propagation analysis is also considered to further study the effects of errors on different types of parameters and reveal the mitigation effects of the activation function and the intrinsic redundancy of BERT.","PeriodicalId":449,"journal":{"name":"IEEE Transactions on Nanotechnology","volume":"24 ","pages":"73-87"},"PeriodicalIF":2.1000,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Nanotechnology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10845154/","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Transformers are widely used in natural language processing and computer vision, and Bidirectional Encoder Representations from Transformers (BERT) is one of the most popular pre-trained transformer models for many applications. This paper studies the dependability of BERT under soft errors when it is implemented with different floating-point formats, using two case studies: sentence emotion classification and question answering. Error injection simulations are conducted to assess the impact of errors on different parts of the BERT model and on different bits of the parameters. The analysis of the results leads to the following findings: 1) in both single and half precision, there is a Critical Bit (CB) on which errors significantly affect the performance of the model; 2) in single precision, errors on the CB may cause overflow in many cases, which leads to a fixed output regardless of the input; 3) in half precision, errors do not cause overflow, but they may still introduce a large accuracy loss. In general, the impact of errors is significantly larger for single-precision parameters than for half-precision ones. Error propagation analysis is also performed to further study the effects of errors on different types of parameters and to reveal the mitigating effects of the activation function and the intrinsic redundancy of BERT.
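To make the fault model in the abstract concrete, the following minimal sketch (not the authors' implementation; the `flip_bit` helper, the chosen weight value, and the injected bit positions are illustrative assumptions) flips a single bit in a float32 or float16 parameter through its integer view. Flipping the most significant exponent bit, a plausible candidate for the Critical Bit, turns a small weight into an overflow-scale value in single precision, while in half precision the corrupted magnitude stays bounded by float16's maximum (about 65504).

```python
# Minimal sketch of single-bit soft-error injection into floating-point
# parameters (illustrative; not the paper's actual injection framework).
import numpy as np

def flip_bit(value, bit):
    """Flip one bit of a float32 or float16 scalar via its integer view."""
    int_type = np.uint32 if value.dtype == np.float32 else np.uint16
    raw = value.view(int_type)          # reinterpret the float's raw bits
    flipped = raw ^ int_type(1 << bit)  # XOR toggles the target bit
    return flipped.view(value.dtype)    # reinterpret back as a float

# Single precision: flipping the exponent MSB (bit 30) of a small weight
# yields a huge magnitude (~3.4e+36 here) that can overflow downstream
# matrix multiplications and fix the model's output.
w32 = np.float32(0.01)
print(flip_bit(w32, 30))

# Half precision: the exponent MSB is bit 14 and the format's range is
# capped near 65504, so the corrupted value (~655 here) is large enough
# to degrade accuracy but cannot overflow to infinity.
w16 = np.float16(0.01)
print(flip_bit(w16, 14))
```

Sweeping `bit` over every position of each parameter, in the spirit of the paper's error-injection campaign, makes the outsized effect of the exponent MSB stand out against flips in the mantissa or sign.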
Journal Introduction:
The IEEE Transactions on Nanotechnology is devoted to the publication of manuscripts of archival value in the general area of nanotechnology, which is rapidly emerging as one of the fastest growing and most promising new technological developments for the next generation and beyond.