{"title":"Improving Dependability of Neuromorphic Computing With Non-Volatile Memory","authors":"Shihao Song, Anup Das, Nagarajan Kandasamy","doi":"10.1109/EDCC51268.2020.00013","DOIUrl":null,"url":null,"abstract":"As process technology continues to scale aggressively, circuit aging in a neuromorphic hardware due to negative bias temperature instability (NBTI) and time-dependent dielectric breakdown (TDDB) is becoming a critical reliability issue and is expected to proliferate when using non-volatile memory (NVM) for synaptic storage. This is because NVM devices require high voltages and currents to access their synaptic weights, which further accelerate the circuit aging in neuromorphic hardware. Current methods for qualifying reliability are overly conservative, since they estimate circuit aging considering worst-case operating conditions and unnecessarily constrain performance. This paper proposes RENEU, a reliability-oriented approach to map machine learning applications to neuromorphic hardware, with the aim of improving system-wide reliability, without compromising key performance metrics such as execution time of these applications on the hardware. Fundamental to RENEU is a novel formulation of the aging of CMOS-based circuits in a neuromorphic hardware considering different failure mechanisms. Using this formulation, RENEU develops a system- wide reliability model which can be used inside a design-space exploration framework involving the mapping of neurons and synapses to the hardware. To this end, RENEU uses an instance of Particle Swarm Optimization (PSO) to generate mappings that are Pareto-optimal in terms of performance and reliability. We evaluate RENEU using different machine learning applications on a state-of-the-art neuromorphic hardware with NVM synapses. Our results demonstrate an average 38% reduction in circuit aging, leading to an average 18% improvement in the lifetime of the hardware compared to current practices. RENEU only introduces a marginal performance overhead of 5% compared to a performance-oriented state-of-the-art.","PeriodicalId":212573,"journal":{"name":"2020 16th European Dependable Computing Conference (EDCC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 16th European Dependable Computing Conference (EDCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EDCC51268.2020.00013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25
Abstract
As process technology continues to scale aggressively, circuit aging in a neuromorphic hardware due to negative bias temperature instability (NBTI) and time-dependent dielectric breakdown (TDDB) is becoming a critical reliability issue and is expected to proliferate when using non-volatile memory (NVM) for synaptic storage. This is because NVM devices require high voltages and currents to access their synaptic weights, which further accelerate the circuit aging in neuromorphic hardware. Current methods for qualifying reliability are overly conservative, since they estimate circuit aging considering worst-case operating conditions and unnecessarily constrain performance. This paper proposes RENEU, a reliability-oriented approach to map machine learning applications to neuromorphic hardware, with the aim of improving system-wide reliability, without compromising key performance metrics such as execution time of these applications on the hardware. Fundamental to RENEU is a novel formulation of the aging of CMOS-based circuits in a neuromorphic hardware considering different failure mechanisms. Using this formulation, RENEU develops a system- wide reliability model which can be used inside a design-space exploration framework involving the mapping of neurons and synapses to the hardware. To this end, RENEU uses an instance of Particle Swarm Optimization (PSO) to generate mappings that are Pareto-optimal in terms of performance and reliability. We evaluate RENEU using different machine learning applications on a state-of-the-art neuromorphic hardware with NVM synapses. Our results demonstrate an average 38% reduction in circuit aging, leading to an average 18% improvement in the lifetime of the hardware compared to current practices. RENEU only introduces a marginal performance overhead of 5% compared to a performance-oriented state-of-the-art.