Run-Time Coarse-Grained Hardware Mitigation for Multiple Faults on VLIW Processors
Rafail Psiakis, A. Kritikakou, O. Sentieys, E. Casseau
2019 Conference on Design and Architectures for Signal and Image Processing (DASIP), 2019-10-01
DOI: 10.1109/DASIP48288.2019.9049194
Abstract
As transistors scale down, processors become more vulnerable to radiation, which can cause multiple transient faults in function units. Rather than excluding the affected units from execution, the performance overhead on VLIW processors can be reduced by continuing to use their fault-free components. In the proposed approach, the function units are enhanced with coarse-grained fault detectors. Instructions are re-scheduled at run-time to use not only the healthy function units but also the fault-free components of the faulty function units. The scheduling window of the proposed mechanism spans two instruction bundles, so it can explore mitigation solutions in both the current and the next instruction execution. Experiments show that the proposed approach can mitigate a large number of faults with low performance and area overheads.
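To make the two-bundle scheduling window more concrete, the following is a minimal software sketch of the idea, not the hardware mechanism from the paper. Everything in it is an assumption made for illustration: the names `issue` and `reschedule`, the component labels such as `"mul"` and `"add"`, the `faulty` map standing in for the coarse-grained detectors, the greedy first-fit binding, and the simplification that all operations in the window are independent (data dependencies are ignored).

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical model: an operation is named by the function-unit (FU)
# component it needs; a bundle has one entry per issue slot (None = NOP).
Bundle = List[Optional[str]]
Faults = Dict[int, Set[str]]  # FU index -> components flagged as faulty


def issue(ops: List[str], n_units: int,
          faulty: Faults) -> Tuple[Dict[int, str], List[str]]:
    """Greedily bind each op to the first free FU whose needed component
    is fault-free. Returns (FU -> op binding, ops that did not fit)."""
    assignment: Dict[int, str] = {}
    leftover: List[str] = []
    for op in ops:
        fu = next((u for u in range(n_units)
                   if u not in assignment and op not in faulty.get(u, set())),
                  None)
        if fu is None:
            leftover.append(op)
        else:
            assignment[fu] = op
    return assignment, leftover


def reschedule(current: Bundle, nxt: Bundle, n_units: int,
               faulty: Faults) -> List[Dict[int, str]]:
    """Schedule a window of two bundles; returns one FU -> op map per cycle.
    Assumption: ops are independent, so deferring one is always legal."""
    cur = [op for op in current if op is not None]
    nx = [op for op in nxt if op is not None]
    first, deferred = issue(cur, n_units, faulty)
    # Deferred ops go first so they keep priority over the next bundle.
    second, spilled = issue(deferred + nx, n_units, faulty)
    cycles = [first, second]
    while spilled:  # each extra cycle here is performance overhead
        extra, rest = issue(spilled, n_units, faulty)
        if not extra:
            raise RuntimeError("no fault-free component left for: %s" % spilled)
        cycles.append(extra)
        spilled = rest
    return cycles


if __name__ == "__main__":
    # FU0 lost its multiplier, FU1 its adder, FU2 and FU3 lost both.
    faults = {0: {"mul"}, 1: {"add"}, 2: {"mul", "add"}, 3: {"mul", "add"}}
    for cycle in reschedule(["mul", "mul", "add", "add"], [None] * 4, 4, faults):
        print(cycle)
```

In the example at the bottom, no function unit is fully healthy, yet the four operations still complete within the two-cycle window because FU0 keeps serving additions and FU1 keeps serving multiplications. Excluding every faulty unit outright would leave nothing to execute on. The greedy binding is a deliberate simplification and can defer an operation that an exhaustive matching would have placed.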