Pub Date : 2023-09-13DOI: 10.1109/JXCDC.2023.3315134
Matthew Spear;Joshua E. Kim;Christopher H. Bennett;Sapan Agarwal;Matthew J. Marinella;T. Patrick Xiao
The analog-to-digital converter (ADC) is not only a key component in analog in-memory computing (IMC) accelerators but also a bottleneck for the efficiency and accuracy of these systems. While the tradeoffs between power consumption, latency, and area in ADC design are well studied, it is relatively unknown which ADC implementations are optimal for algorithmic accuracy, particularly for neural network inference. We explore the design space of the ADC with a focus on accuracy, investigating the sensitivity of neural network outputs to component variability inside the ADC and how this sensitivity depends on the ADC architecture. The compact models of the pipeline, cyclic, successive-approximation-register (SAR) and ramp ADCs are developed, and these models are used in a system-level accuracy simulation of analog neural network inference. Our results show how the accuracy on a complex image recognition benchmark (ResNet50 on ImageNet) depends on the capacitance mismatch, comparator offset, and effective number of bits (ENOB) for each of the four ADC architectures. We find that robustness to component variations depends strongly on the ADC design and that inference accuracy is particularly sensitive to the value-dependent error characteristics of the ADC, which cannot be captured by the conventional ENOB precision metric.
{"title":"The Impact of Analog-to-Digital Converter Architecture and Variability on Analog Neural Network Accuracy","authors":"Matthew Spear;Joshua E. Kim;Christopher H. Bennett;Sapan Agarwal;Matthew J. Marinella;T. Patrick Xiao","doi":"10.1109/JXCDC.2023.3315134","DOIUrl":"10.1109/JXCDC.2023.3315134","url":null,"abstract":"The analog-to-digital converter (ADC) is not only a key component in analog in-memory computing (IMC) accelerators but also a bottleneck for the efficiency and accuracy of these systems. While the tradeoffs between power consumption, latency, and area in ADC design are well studied, it is relatively unknown which ADC implementations are optimal for algorithmic accuracy, particularly for neural network inference. We explore the design space of the ADC with a focus on accuracy, investigating the sensitivity of neural network outputs to component variability inside the ADC and how this sensitivity depends on the ADC architecture. The compact models of the pipeline, cyclic, successive-approximation-register (SAR) and ramp ADCs are developed, and these models are used in a system-level accuracy simulation of analog neural network inference. Our results show how the accuracy on a complex image recognition benchmark (ResNet50 on ImageNet) depends on the capacitance mismatch, comparator offset, and effective number of bits (ENOB) for each of the four ADC architectures. We find that robustness to component variations depends strongly on the ADC design and that inference accuracy is particularly sensitive to the value-dependent error characteristics of the ADC, which cannot be captured by the conventional ENOB precision metric.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 2","pages":"176-184"},"PeriodicalIF":2.4,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10250846","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135402296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-11DOI: 10.1109/JXCDC.2023.3313127
An Qi Zhang;Amr M. S. Tosson;Dylan Ma;Ryan Fang;Lan Wei
Devices using emerging technologies and materials with the potential to outperform their silicon counterpart are actively explored in search of ways to extend Moore’s law. Among these technologies, low dimensional channel materials (LDMs) devices, such as carbon nanotube field-effect transistors (CNFETs), are promising to eventually outperform silicon CMOS. As these technologies are in their early development stages, their devices still suffer from high levels of defects and variations, thus unsuitable for nowadays general-purpose applications. On the other hand, applications with inherent error resilience and high-performance demands would suppress the impact of process imperfection and benefit from the performance boost. These applications, including image processing and machine learning through neural networks, would be the ideal targets for adopting these new emerging technologies even in their early stage of technology and process development. In this article, the effects of stuck-at faults in CNFET static random access memory (SRAM)-based multilayer perceptron (MLP) neural network are investigated. The impacts of various fault patterns are analyzed. Several fault recovery techniques are introduced, and their effectiveness is analyzed under different scenarios. With the proposed recovery techniques, the system can recover and tolerate a high level of stuck-at faults up to 40%, paving the path to adopt the early-stage and faulty emerging devices technologies in such high-demand applications.
{"title":"Stuck-at Faults Tolerance and Recovery in MLP Neural Networks Using Imperfect Emerging CNFET Technology","authors":"An Qi Zhang;Amr M. S. Tosson;Dylan Ma;Ryan Fang;Lan Wei","doi":"10.1109/JXCDC.2023.3313127","DOIUrl":"10.1109/JXCDC.2023.3313127","url":null,"abstract":"Devices using emerging technologies and materials with the potential to outperform their silicon counterpart are actively explored in search of ways to extend Moore’s law. Among these technologies, low dimensional channel materials (LDMs) devices, such as carbon nanotube field-effect transistors (CNFETs), are promising to eventually outperform silicon CMOS. As these technologies are in their early development stages, their devices still suffer from high levels of defects and variations, thus unsuitable for nowadays general-purpose applications. On the other hand, applications with inherent error resilience and high-performance demands would suppress the impact of process imperfection and benefit from the performance boost. These applications, including image processing and machine learning through neural networks, would be the ideal targets for adopting these new emerging technologies even in their early stage of technology and process development. In this article, the effects of stuck-at faults in CNFET static random access memory (SRAM)-based multilayer perceptron (MLP) neural network are investigated. The impacts of various fault patterns are analyzed. Several fault recovery techniques are introduced, and their effectiveness is analyzed under different scenarios. With the proposed recovery techniques, the system can recover and tolerate a high level of stuck-at faults up to 40%, paving the path to adopt the early-stage and faulty emerging devices technologies in such high-demand applications.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 2","pages":"168-175"},"PeriodicalIF":2.4,"publicationDate":"2023-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10246789","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135361608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article presents a modified compact model of resistive random access memory (RRAM) with a tunneling barrier. The bilayer modulated RRAM can be integrated into a higher density array, reducing leakage current in standby mode. The model demonstrates current transition behavior from low- to high-bias regions by considering both bulk-limited and electrode-limited transport mechanisms. This model can evaluate RRAM array performance under various pulsing conditions and device parameter variations with calibrated model cards. The compute-in-memory application requires precise current sum results hindered by the wire resistance loading effect. This study also evaluates various sizes of arrays suitable for performance improvement.
{"title":"Modeling of Bilayer Modulated RRAM and Its Array Performance for Compute-in-Memory Applications","authors":"Jia-Wei Lee;Tzu-Chin Chou;Po-An Chen;Meng-Hsueh Chiang","doi":"10.1109/JXCDC.2023.3311899","DOIUrl":"10.1109/JXCDC.2023.3311899","url":null,"abstract":"This article presents a modified compact model of resistive random access memory (RRAM) with a tunneling barrier. The bilayer modulated RRAM can be integrated into a higher density array, reducing leakage current in standby mode. The model demonstrates current transition behavior from low- to high-bias regions by considering both bulk-limited and electrode-limited transport mechanisms. This model can evaluate RRAM array performance under various pulsing conditions and device parameter variations with calibrated model cards. The compute-in-memory application requires precise current sum results hindered by the wire resistance loading effect. This study also evaluates various sizes of arrays suitable for performance improvement.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 2","pages":"151-158"},"PeriodicalIF":2.4,"publicationDate":"2023-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10239165","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62236596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-08-29DOI: 10.1109/JXCDC.2023.3309713
Andrea Boni;Francesco Malena;Francesco Saccani;Michele Amoretti;Michele Caselli
This article presents the flipped (F)-2T2R resistive random access memory (RRAM) compute cell enhancing the performance of RRAM-based mixed-signal accelerators for deep neural networks (DNNs) in machine-learning (ML) applications. The F-2T2R cell is designed to exploit the features of the FD-SOI technology and it achieves a large increase in cell output impedance, compared to the standard 1-transistor 1-resistor (1T1R) cell. The article also describes the modeling of an F-2T2R-based accelerator and its transistor-level implementation in a 22-nm FD-SOI technology. The modeling results and the accelerator performance are validated by simulation. The proposed design can achieve an energy efficiency of up to 1260 1 bit-TOPS/W, with a memory array of 256 rows and columns. From the results of our analytical framework, a ResNet18, mapped on the accelerator, can obtain an accuracy reduction below 2%, with respect to the floating-point baseline, on the CIFAR-10 dataset.
{"title":"Boosting RRAM-Based Mixed-Signal Accelerators in FD-SOI Technology for ML Applications","authors":"Andrea Boni;Francesco Malena;Francesco Saccani;Michele Amoretti;Michele Caselli","doi":"10.1109/JXCDC.2023.3309713","DOIUrl":"10.1109/JXCDC.2023.3309713","url":null,"abstract":"This article presents the flipped (F)-2T2R resistive random access memory (RRAM) compute cell enhancing the performance of RRAM-based mixed-signal accelerators for deep neural networks (DNNs) in machine-learning (ML) applications. The F-2T2R cell is designed to exploit the features of the FD-SOI technology and it achieves a large increase in cell output impedance, compared to the standard 1-transistor 1-resistor (1T1R) cell. The article also describes the modeling of an F-2T2R-based accelerator and its transistor-level implementation in a 22-nm FD-SOI technology. The modeling results and the accelerator performance are validated by simulation. The proposed design can achieve an energy efficiency of up to 1260 1 bit-TOPS/W, with a memory array of 256 rows and columns. From the results of our analytical framework, a ResNet18, mapped on the accelerator, can obtain an accuracy reduction below 2%, with respect to the floating-point baseline, on the CIFAR-10 dataset.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 2","pages":"159-167"},"PeriodicalIF":2.4,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10233848","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62236587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This work presents new insights into 3-D logic circuit design with vertical junctionless nanowire FETs (VNWFET) accounting for underlying electrothermal phenomena. Aided by the understanding of the nanoscale heat transport in VNWFETs through multiphysics simulations, the SPICE-compatible compact model captures temperature and trapping effects principally through a shift of the device threshold voltage. Circuit-level simulations indicate a strong impact of temperature variation on functionality and figures of merits, such as energy-delay products. Subsequent guidelines for design considerations are discussed that are intended to provide feedback for technology improvements.
{"title":"3-D Logic Circuit Design-Oriented Electrothermal Modeling of Vertical Junctionless Nanowire FETs","authors":"Sara Mannaa;Arnaud Poittevin;Cédric Marchand;Damien Deleruyelle;Bastien Deveautour;Alberto Bosio;Ian O’Connor;Chhandak Mukherjee;Yifan Wang;Houssem Rezgui;Marina Deng;Cristell Maneux;Jonas Müller;Sylvain Pelloquin;Konstantinos Moustakas;Guilhem Larrieu","doi":"10.1109/JXCDC.2023.3309502","DOIUrl":"10.1109/JXCDC.2023.3309502","url":null,"abstract":"This work presents new insights into 3-D logic circuit design with vertical junctionless nanowire FETs (VNWFET) accounting for underlying electrothermal phenomena. Aided by the understanding of the nanoscale heat transport in VNWFETs through multiphysics simulations, the SPICE-compatible compact model captures temperature and trapping effects principally through a shift of the device threshold voltage. Circuit-level simulations indicate a strong impact of temperature variation on functionality and figures of merits, such as energy-delay products. Subsequent guidelines for design considerations are discussed that are intended to provide feedback for technology improvements.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 2","pages":"116-123"},"PeriodicalIF":2.4,"publicationDate":"2023-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10232986","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62236510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-06-29DOI: 10.1109/JXCDC.2023.3277781
{"title":"INFORMATION FOR AUTHORS","authors":"","doi":"10.1109/JXCDC.2023.3277781","DOIUrl":"https://doi.org/10.1109/JXCDC.2023.3277781","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 1","pages":"C3-C3"},"PeriodicalIF":2.4,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/10138050/10168535.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49946610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, novel applications in the space of artificial intelligence (AI) such as solving constraint optimization problems, probabilistic inferencing, contextual adaptation, and continual learning from noisy data are gaining momentum to address relevant real-world problems. A majority of these tasks are compute and/or memory intensive. While traditional deep learning has been fueled by the utilization of graphic processing units (GPUs) to accelerate algorithms primarily in the cloud, today we see a surge in the development of application/domain-specific integrated circuits and systems that aim at providing an order of magnitude improvement over traditional GPU-based approaches in terms of energy efficiency and latency. This growing branch of research taps into the realms of neuronal dynamics, collective computing using dynamical systems, harnessing stochasticity to enable probabilistic computing, and even draws inspiration from quantum computing. We envision such specialized application/domain-specific systems to perform complex tasks such as solving NP-hard optimization problems, performing reasoning and cognition in the presence of uncertainty with superior energy-efficiency (and/or area and latency improvements) compared to conventional GPU-based approaches and von Neumann computing using traditional silicon-based devices, circuits, and architectures. Of special interest is to utilize such nontraditional computing approaches to reduce the time to obtain solutions for computationally challenging problems that otherwise tend to grow exponentially with problem size. To support this vision, there needs to be fundamental advances in both nontraditional devices and circuits/architectures. Recent works have shown that novel circuit topologies and architectures involving non-Boolean, oscillatory, spiking, probabilistic, or quantum-inspired computing are more suited toward tackling applications such as solving constraint optimization problems, performing energy-based learning, performing Bayesian learning and inference, lifelong continual learning, and solving quantum-inspired applications such as Quantum Monte Carlo. A flurry of current research highlights that compared to traditional silicon-based devices, emerging nanodevices utilizing novel quantum materials such as complex oxides, ferroelectric materials, and spintronic materials can allow the realization of these novel circuits and architectures with lower foot-print area, higher energy efficiency, and lower latency.
{"title":"Special Topic on Nontraditional Devices, Circuits, and Architectures for Energy-Efficient Computing","authors":"Sourav Dutta;Punyashloka Debashis;Amir Khosrowshahi","doi":"10.1109/JXCDC.2023.3280846","DOIUrl":"10.1109/JXCDC.2023.3280846","url":null,"abstract":"Recently, novel applications in the space of artificial intelligence (AI) such as solving constraint optimization problems, probabilistic inferencing, contextual adaptation, and continual learning from noisy data are gaining momentum to address relevant real-world problems. A majority of these tasks are compute and/or memory intensive. While traditional deep learning has been fueled by the utilization of graphic processing units (GPUs) to accelerate algorithms primarily in the cloud, today we see a surge in the development of application/domain-specific integrated circuits and systems that aim at providing an order of magnitude improvement over traditional GPU-based approaches in terms of energy efficiency and latency. This growing branch of research taps into the realms of neuronal dynamics, collective computing using dynamical systems, harnessing stochasticity to enable probabilistic computing, and even draws inspiration from quantum computing. We envision such specialized application/domain-specific systems to perform complex tasks such as solving NP-hard optimization problems, performing reasoning and cognition in the presence of uncertainty with superior energy-efficiency (and/or area and latency improvements) compared to conventional GPU-based approaches and von Neumann computing using traditional silicon-based devices, circuits, and architectures. Of special interest is to utilize such nontraditional computing approaches to reduce the time to obtain solutions for computationally challenging problems that otherwise tend to grow exponentially with problem size. To support this vision, there needs to be fundamental advances in both nontraditional devices and circuits/architectures. Recent works have shown that novel circuit topologies and architectures involving non-Boolean, oscillatory, spiking, probabilistic, or quantum-inspired computing are more suited toward tackling applications such as solving constraint optimization problems, performing energy-based learning, performing Bayesian learning and inference, lifelong continual learning, and solving quantum-inspired applications such as Quantum Monte Carlo. A flurry of current research highlights that compared to traditional silicon-based devices, emerging nanodevices utilizing novel quantum materials such as complex oxides, ferroelectric materials, and spintronic materials can allow the realization of these novel circuits and architectures with lower foot-print area, higher energy efficiency, and lower latency.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 1","pages":"iii-v"},"PeriodicalIF":2.4,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/10138050/10163725.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48736919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this article, we propose a nontraditional design of dynamic logic circuits using fully-depleted silicon-on-insulator (FDSOI) FETs. FDSOI FET allows the threshold voltage ( $V_{text {t}}$