A cross-layer approach to online adaptive reliability prediction of transient faults
Authors: Bahareh J. Farahani, S. Safari
Published in: 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS), 2015-10-01
DOI: 10.1109/DFT.2015.7315165 (https://doi.org/10.1109/DFT.2015.7315165)
Citation count: 8
Abstract
As the semiconductor industry migrates into the nanometer regime, processors become increasingly susceptible to transient faults. Such faults usually stem either from soft errors caused by particle strikes or from timing violations caused by Process, Voltage, Temperature, and Aging (PVTA) variations. They can propagate from the circuit level to the application level and corrupt the application's output. For generations, designers have built high-level resiliency on the assumption that the details of the underlying circuit can be abstracted away and neglected. This paper argues that, in contrast to prior work, which considers only the Architectural Vulnerability Factor (AVF) as a measure to guide fault-tolerant techniques, the vulnerability of each abstraction layer of the design stack, from the circuit up to the instruction and application layers, should be considered. It presents a novel online cross-layer reliability prediction technique based on learning algorithms, which adaptively anticipates the susceptibility of the processor from both lower-level and higher-level details. According to the results, the proposed technique predicts future reliability with 6% error on average across the SPEC2000 benchmarks. By forecasting reliability emergencies, the technique can help proactive fault-tolerant techniques maintain reliability constraints more efficiently than reactive strategies.
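The abstract does not say which learning algorithm or which features the authors use, so the following is only a minimal sketch of what an online, adaptive predictor of this general kind could look like. It assumes a linear model trained with normalized least-mean-squares (NLMS) updates, with one feature vector per execution interval that mixes lower-level and higher-level indicators. The class name `OnlineReliabilityPredictor`, the helper `is_emergency`, the feature layout, and the threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class OnlineReliabilityPredictor:
    """Hypothetical online linear predictor trained with normalized LMS.

    Assumption: each feature vector describes one execution interval, e.g.
    [bias, circuit-level indicator, architecture-level indicator]. The paper
    does not specify this layout.
    """
    n_features: int
    learning_rate: float = 0.5  # NLMS step size; stable for values in (0, 2)
    weights: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start from an all-zero model if no weights were supplied.
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, features: Sequence[float]) -> float:
        """Predict the vulnerability of the next interval."""
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features: Sequence[float], observed: float) -> float:
        """Adapt the weights once the true vulnerability is observed.

        Returns the prediction error made before the update.
        """
        error = observed - self.predict(features)
        # Normalize by the input energy so the step size is scale-independent.
        norm = sum(x * x for x in features) + 1e-9
        step = self.learning_rate * error / norm
        self.weights = [w + step * x for w, x in zip(self.weights, features)]
        return error


def is_emergency(predicted: float, threshold: float) -> bool:
    """Flag a forecast that exceeds an assumed reliability constraint."""
    return predicted > threshold
```

In such a scheme, a proactive fault-tolerance mechanism would call `predict` at the start of each interval, enable protection when `is_emergency` returns true, and call `update` at the end of the interval with the measured vulnerability so that the model keeps tracking workload changes.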