N. Huang, Min-Syue Yang, Ya-Chu Chang, Kai-Chiang Wu
2023 24th International Symposium on Quality Electronic Design (ISQED), published 2023-04-05. DOI: 10.1109/ISQED57927.2023.10129283
Decomposable Architecture and Fault Mitigation Methodology for Deep Learning Accelerators
As the demand for data analysis grows rapidly, artificial intelligence (AI) models have been developed for various applications. Many deep neural networks involve millions or billions of parameters and operations. To handle this volume of computation, many AI accelerators adopt pipelined architectures built from simple but densely packed computational elements. However, manufacturing-induced faults in these accelerators threaten computational robustness or degrade yield. In this paper, we propose a fault mitigation methodology based on decomposable systolic arrays. By leveraging the inherent error resilience of AI applications, our data arrangement reduces the difference between accurate results and faulty results. Additionally, combining the proposed data arrangement with sign compensation further mitigates the influence of faults in AI accelerators. In our experiments, the proposed fault mitigation methodology maintains application accuracy and outperforms state-of-the-art methods. When 0.1% of the multiplier-accumulators in a systolic array are faulty, an array using the proposed methodology loses less than 0.5% accuracy while executing ResNet-18 for ImageNet classification.
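To make the fault scenario in the abstract concrete, the following is a minimal sketch of how faulty multiplier-accumulators might be modeled in software. It is not the authors' methodology: the abstract does not describe the data arrangement or sign compensation in detail, so neither is implemented here. The output-stationary mapping (one processing element per output element), the stuck-at-1 accumulator bit, and all function names are assumptions chosen for illustration only.

```python
import random


def matmul(a, b):
    """Fault-free reference: plain integer matrix product."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def pick_faulty_pes(rows, cols, fault_rate, rng):
    """Randomly mark processing elements (PEs) as faulty.

    Returns a set of (row, col) positions. The uniform random placement
    is an assumption, not something stated in the abstract.
    """
    count = round(rows * cols * fault_rate)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return set(rng.sample(cells, count))


def faulty_matmul(a, b, faulty, stuck_bit):
    """Assumed output-stationary model: PE (i, j) accumulates output (i, j).

    A faulty PE has one accumulator bit stuck at 1, applied after every
    multiply-accumulate step (a hypothetical fault model).
    """
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0
            for t in range(k):
                acc += a[i][t] * b[t][j]
                if (i, j) in faulty:
                    acc |= 1 << stuck_bit  # force the stuck bit to 1
            out[i][j] = acc
    return out


def mean_abs_error(x, y):
    """Average absolute difference between two equally sized matrices."""
    total = sum(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))
    return total / (len(x) * len(x[0]))


if __name__ == "__main__":
    rng = random.Random(0)
    size = 32
    a = [[rng.randint(-8, 7) for _ in range(size)] for _ in range(size)]
    b = [[rng.randint(-8, 7) for _ in range(size)] for _ in range(size)]
    faulty = pick_faulty_pes(size, size, 0.001, rng)  # 0.1% of the PEs
    reference = matmul(a, b)
    corrupted = faulty_matmul(a, b, faulty, stuck_bit=12)
    print("faulty PEs:", sorted(faulty))
    print("mean absolute error:", mean_abs_error(reference, corrupted))
```

A harness like this only measures how far faulty outputs drift from the accurate ones; a mitigation scheme such as the one proposed in the paper would be evaluated by how much it shrinks that gap and, ultimately, the loss in application accuracy.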