
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits: Latest Articles

XNOR-VSH: A Valley-Spin Hall Effect-Based Compact and Energy-Efficient Synaptic Crossbar Array for Binary Neural Networks
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-29 DOI: 10.1109/JXCDC.2023.3320677
Karam Cho;Akul Malhotra;Sumeet Kumar Gupta
Binary neural networks (BNNs) have shown immense promise for resource-constrained edge artificial intelligence (AI) platforms. However, prior designs typically either require two bit-cells to encode signed weights, leading to an area overhead, or require complex peripheral circuitry. In this article, we address this issue by proposing compact and low-power in-memory computing (IMC) of XNOR-based dot products featuring signed weight encoding in a single bit-cell. Our approach utilizes the valley-spin Hall (VSH) effect in monolayer tungsten diselenide to design an XNOR bit-cell (named “XNOR-VSH”) with differential storage and an access-transistor-less topology. We co-optimize the proposed VSH device and a memory array to enable robust in-memory dot product computations between signed binary inputs and signed binary weights with sense margin (SM) $1~\mu\text{A}$. Our results show that the proposed XNOR-VSH array achieves 4.8%–9.0% and 37%–63% lower IMC latency and energy, respectively, with 49%–64% smaller area compared to spin-transfer-torque (STT)-magnetic random access memory (MRAM) and spin-orbit-torque (SOT)-MRAM based XNOR-arrays. We also present the impact of hardware non-idealities and process variations in XNOR-VSH on system-level accuracy for the trained ResNet-18 BNNs using the CIFAR-10 dataset.
Volume 9, Issue 2, Pages 99-107 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10268108
Citations: 0
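The XNOR-VSH abstract above rests on the fact that a dot product between signed binary inputs and signed binary weights reduces to an XNOR followed by a count. The following is a minimal software sketch of that arithmetic only, assuming the common encoding of +1 as bit 1 and -1 as bit 0. The function name `xnor_dot` is hypothetical, and nothing here models the VSH device or the array's current sensing.

```python
def xnor_dot(inputs, weights):
    """Dot product of two {-1, +1} vectors computed via XNOR and a count."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Assumed encoding: +1 -> bit 1, -1 -> bit 0.
    a_bits = [1 if x > 0 else 0 for x in inputs]
    w_bits = [1 if w > 0 else 0 for w in weights]
    # XNOR is 1 where the bits agree, i.e. where the signed product is +1.
    matches = sum(1 - (a ^ w) for a, w in zip(a_bits, w_bits))
    n = len(a_bits)
    # 'matches' products equal +1 and (n - matches) products equal -1.
    return 2 * matches - n


inputs = [+1, -1, +1, +1]
weights = [+1, +1, -1, +1]
# The XNOR form agrees with the ordinary signed dot product.
assert xnor_dot(inputs, weights) == sum(i * w for i, w in zip(inputs, weights))
```

In a crossbar, the count would be obtained from an accumulated analog quantity rather than a Python `sum`; the identity `2 * matches - n` is what lets a single signed-weight cell per bit suffice.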
Many-Body Effects-Based Invertible Logic With a Simple Energy Landscape and High Accuracy
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-28 DOI: 10.1109/JXCDC.2023.3320230
Yihan He;Chao Fang;Sheng Luo;Gengchiau Liang
Inspired by many-body effects, we propose a novel design for Boltzmann machine (BM)-based invertible logic (IL) using probabilistic bits (p-bits). A CMOS-based XNOR gate is derived to serve as the hardware implementation of many-body interactions, and an IL family is built based on this design. Compared to the conventional two-body-based design framework, the many-body-based design enables compact configuration and provides the simplest binarized energy landscape for fundamental IL gates; furthermore, we demonstrate the composability of the many-body-based IL circuit by merging modular building blocks into large-scale integer factorizers (IFs). To optimize the energy landscape of large-scale combinatorial IL circuits, we introduce degeneracy in energy levels, which enlarges the probabilities for the lowest states. Circuit simulations of our IFs reveal a significant boost in factorization accuracy. For example, a $2\times2$-bit IF shows an increase in factorization accuracy from 64.99% to 91.44%, with a reduction in the number of energy levels from 32 to 9. Similarly, our $6\times6$-bit IF increases the accuracy from 4.430% to 83.65% with the many-body design. Overall, the many-body-based design scheme provides promising results for future IL circuit designs.
Volume 9, Issue 2, Pages 83-91 PDF: https://ieeexplore.ieee.org/iel7/6570653/10288180/10266315.pdf
Citations: 0
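One way to see why a many-body term can give a two-level energy landscape for an XNOR gate, as the abstract above describes: in bipolar (±1) encoding, XNOR(a, b) equals the product a·b, so a single three-body energy E = -a·b·c is -1 for every valid truth-table row and +1 otherwise. The sketch below enumerates the eight states and their Boltzmann probabilities. The energy function, the inverse temperature `beta`, and all names are assumptions made for illustration, not the paper's circuit or its exact formulation.

```python
import itertools
import math


def energy(a, b, c):
    """Assumed three-body energy: -1 if c == XNOR(a, b) in +/-1 encoding, else +1."""
    return -a * b * c


def boltzmann(beta=2.0):
    """Return {state: probability} over all (a, b, c) in {-1, +1}^3."""
    states = list(itertools.product((-1, +1), repeat=3))
    weights = [math.exp(-beta * energy(*s)) for s in states]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(states, weights)}


def invert(c):
    """Clamp the output c and return the input pairs (a, b) of lowest energy."""
    pairs = list(itertools.product((-1, +1), repeat=2))
    best = min(energy(a, b, c) for a, b in pairs)
    return [(a, b) for a, b in pairs if energy(a, b, c) == best]


probs = boltzmann()
# States that satisfy c = XNOR(a, b), i.e. c == a * b in bipolar encoding.
valid = [s for s in probs if s[2] == s[0] * s[1]]
```

Running the gate "backwards" is then just `invert(+1)`, which returns the two input pairs whose XNOR is +1. A sampler built from p-bits would visit those pairs most often rather than enumerate them.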
Modeling and Evaluation of Echo-State Networks Using Spin Torque Nano-Oscillators
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-19 DOI: 10.1109/JXCDC.2023.3317240
Siyuan Qian;Shaloo Rakheja
An echo state network (ESN), capable of processing time-series data with high accuracy, is designed and benchmarked using spin torque nano-oscillators (STNOs) with easy-plane anisotropy. An ESN belongs to the category of reservoir computers, where the reservoir comprises a randomly initialized, recurrently connected, and untrained pool of neurons and acts as a high-dimensional expansion of the input signal. The readout function is used to glean a meaningful output representation. Here, we use STNOs as the basic building block of the ESN and apply the ESN to predict the Mackey–Glass (MG) time-series data. The design parameters of the STNO and the input data representation are selected to yield prediction errors as low as $4\times 10^{-3}$. We also quantify the short-term memory (STM) and the parity-check (PC) capacity of the ESN and obtain metrics that are comparable to or better than existing spintronics-based ESNs, as well as ESNs employing “tanh” neurons. The peak STM is found to be approximately 8.8, while the peak PC capacity is found to be approximately 3.9. The impacts of thermal fluctuations and process variability on ESN performance are systematically quantified. Although the ESN’s prediction and memory capability remain robust with temperature variations, a 10% variation in the dimensions of the STNO free layer can lead to around 40% increase in its prediction error for the MG time-series data.
Volume 9, Issue 2, Pages 134-142 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10255553
Citations: 0
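To make the reservoir idea in the ESN abstract above concrete, here is a minimal sketch of the conventional "tanh"-neuron reservoir that the abstract uses as a point of comparison. It is not an STNO model: the weight ranges, reservoir size, and the names `make_reservoir`, `step`, and `run` are all assumptions for illustration, and the trained readout is omitted.

```python
import math
import random


def make_reservoir(n, seed=0, scale=0.1):
    """Random, fixed (untrained) recurrent and input weights."""
    rng = random.Random(seed)
    # Recurrent weights are kept small so past inputs tend to fade over time
    # (an assumption of this sketch, not a tuned value).
    w = [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(n)]
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return w, w_in


def step(state, u, w, w_in):
    """One update: x[t+1] = tanh(W x[t] + W_in u[t])."""
    return [
        math.tanh(sum(wij * xj for wij, xj in zip(row, state)) + wi * u)
        for row, wi in zip(w, w_in)
    ]


def run(inputs, n=20, seed=0):
    """Drive the reservoir with a scalar input sequence; return all states."""
    w, w_in = make_reservoir(n, seed)
    state = [0.0] * n
    history = []
    for u in inputs:
        state = step(state, u, w, w_in)
        history.append(state)
    return history


states = run([0.1, 0.5, -0.3])
```

Each scalar input is expanded into an `n`-dimensional state vector; in an ESN only a linear readout on these states is trained, which is what keeps training cheap.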
The Impact of Analog-to-Digital Converter Architecture and Variability on Analog Neural Network Accuracy
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-13 DOI: 10.1109/JXCDC.2023.3315134
Matthew Spear;Joshua E. Kim;Christopher H. Bennett;Sapan Agarwal;Matthew J. Marinella;T. Patrick Xiao
The analog-to-digital converter (ADC) is not only a key component in analog in-memory computing (IMC) accelerators but also a bottleneck for the efficiency and accuracy of these systems. While the tradeoffs between power consumption, latency, and area in ADC design are well studied, it is relatively unknown which ADC implementations are optimal for algorithmic accuracy, particularly for neural network inference. We explore the design space of the ADC with a focus on accuracy, investigating the sensitivity of neural network outputs to component variability inside the ADC and how this sensitivity depends on the ADC architecture. Compact models of the pipeline, cyclic, successive-approximation-register (SAR), and ramp ADCs are developed, and these models are used in a system-level accuracy simulation of analog neural network inference. Our results show how the accuracy on a complex image recognition benchmark (ResNet50 on ImageNet) depends on the capacitance mismatch, comparator offset, and effective number of bits (ENOB) for each of the four ADC architectures. We find that robustness to component variations depends strongly on the ADC design and that inference accuracy is particularly sensitive to the value-dependent error characteristics of the ADC, which cannot be captured by the conventional ENOB precision metric.
Volume 9, Issue 2, Pages 176-184 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10250846
Citations: 0
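The ADC abstract above notes that inference accuracy is sensitive to value-dependent error that ENOB alone does not capture. As a rough illustration under simplifying assumptions (an ideal binary-weighted DAC, one fixed comparator offset, no capacitor mismatch or noise), the sketch below models a SAR conversion as a binary search. `sar_adc` and its parameters are hypothetical names, not the paper's compact model.

```python
def sar_adc(vin, bits=4, vref=1.0, offset=0.0):
    """Successive-approximation conversion of vin (volts) to an integer code.

    offset: ASSUMED fixed comparator offset in volts, added to the input
    side of every comparison.
    """
    code = 0
    for i in reversed(range(bits)):
        trial = code | (1 << i)            # tentatively set the next bit
        dac = vref * trial / (1 << bits)   # ideal DAC level for the trial code
        if vin + offset >= dac:            # comparator decision
            code = trial                   # keep the bit
    return code


ideal_code = sar_adc(0.3)
shifted_code = sar_adc(0.3, offset=0.02)
```

With these settings an input of 0.3 converts to a different code once the offset is applied, while an input of 0.28 does not, because only inputs close to a decision threshold are affected. That dependence on the input value is the kind of behavior a single precision figure cannot express.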
Stuck-at Faults Tolerance and Recovery in MLP Neural Networks Using Imperfect Emerging CNFET Technology
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-11 DOI: 10.1109/JXCDC.2023.3313127
An Qi Zhang;Amr M. S. Tosson;Dylan Ma;Ryan Fang;Lan Wei
Devices using emerging technologies and materials with the potential to outperform their silicon counterparts are actively explored in search of ways to extend Moore’s law. Among these technologies, low-dimensional channel material (LDM) devices, such as carbon nanotube field-effect transistors (CNFETs), are promising to eventually outperform silicon CMOS. As these technologies are in their early development stages, their devices still suffer from high levels of defects and variations and are thus unsuitable for today’s general-purpose applications. On the other hand, applications with inherent error resilience and high-performance demands would suppress the impact of process imperfection and benefit from the performance boost. These applications, including image processing and machine learning through neural networks, would be the ideal targets for adopting these new emerging technologies even in their early stage of technology and process development. In this article, the effects of stuck-at faults in a CNFET static random access memory (SRAM)-based multilayer perceptron (MLP) neural network are investigated. The impacts of various fault patterns are analyzed. Several fault recovery techniques are introduced, and their effectiveness is analyzed under different scenarios. With the proposed recovery techniques, the system can recover and tolerate a high level of stuck-at faults up to 40%, paving the path to adopt the early-stage and faulty emerging device technologies in such high-demand applications.
Volume 9, Issue 2, Pages 168-175 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10246789
Citations: 0
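For readers unfamiliar with the stuck-at fault model mentioned in the abstract above, the sketch below injects stuck-at-0 and stuck-at-1 faults into weight words stored as unsigned integers. It covers only fault injection, not the paper's recovery techniques. The 8-bit word width, the equal split between the two fault types, and the name `inject_stuck_at` are assumptions for illustration.

```python
import random


def inject_stuck_at(words, fault_rate, bits=8, seed=0):
    """Return a copy of 'words' with each bit independently stuck with
    probability 'fault_rate'; a faulty bit is forced to 1 or 0 (ASSUMED
    equally likely) regardless of the stored value."""
    rng = random.Random(seed)
    faulty = []
    for word in words:
        for b in range(bits):
            if rng.random() < fault_rate:
                if rng.random() < 0.5:
                    word |= 1 << b      # stuck-at-1: bit reads as 1
                else:
                    word &= ~(1 << b)   # stuck-at-0: bit reads as 0
        faulty.append(word)
    return faulty


stored = [0, 255, 170]
observed = inject_stuck_at(stored, fault_rate=0.4)
```

Note that a stuck bit changes the read value only when the stored bit differs from the stuck level, so the number of corrupted weights is generally lower than the raw fault rate suggests.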
Modeling of Bilayer Modulated RRAM and Its Array Performance for Compute-in-Memory Applications
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-09-04 DOI: 10.1109/JXCDC.2023.3311899
Jia-Wei Lee;Tzu-Chin Chou;Po-An Chen;Meng-Hsueh Chiang
This article presents a modified compact model of resistive random access memory (RRAM) with a tunneling barrier. The bilayer modulated RRAM can be integrated into a higher density array, reducing leakage current in standby mode. The model demonstrates current transition behavior from low- to high-bias regions by considering both bulk-limited and electrode-limited transport mechanisms. This model can evaluate RRAM array performance under various pulsing conditions and device parameter variations with calibrated model cards. The compute-in-memory application requires precise current sum results hindered by the wire resistance loading effect. This study also evaluates various sizes of arrays suitable for performance improvement.
Volume 9, Issue 2, Pages 151-158 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10239165
Citations: 0
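The RRAM abstract above says that the current sum needed for compute-in-memory is degraded by wire resistance. The sketch below contrasts the ideal column sum, I = Σ V_i·G_i, with a deliberately crude correction in which row k sees k+1 wire segments in series with its cell. This lumped model ignores current sharing along the wire and is an assumption for illustration only, not the paper's compact model; `column_current` and `r_wire` are hypothetical names.

```python
def column_current(voltages, conductances, r_wire=0.0):
    """Sum the cell currents of one crossbar column, in amperes.

    voltages: row input voltages in volts.
    conductances: cell conductances in siemens (each must be > 0).
    r_wire: resistance of one wire segment in ohms (ASSUMED lumped model).
    """
    total = 0.0
    for row, (v, g) in enumerate(zip(voltages, conductances)):
        r_cell = 1.0 / g
        # Crude assumption: row k is reached through (k + 1) wire segments.
        r_path = r_cell + r_wire * (row + 1)
        total += v / r_path
    return total


ideal = column_current([0.2] * 4, [1e-5] * 4)
loaded = column_current([0.2] * 4, [1e-5] * 4, r_wire=1000.0)
```

Even in this simplified form, the shortfall relative to the ideal sum grows with the number of rows, which is consistent with the abstract's interest in evaluating different array sizes.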
Boosting RRAM-Based Mixed-Signal Accelerators in FD-SOI Technology for ML Applications
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-08-29 DOI: 10.1109/JXCDC.2023.3309713
Andrea Boni;Francesco Malena;Francesco Saccani;Michele Amoretti;Michele Caselli
This article presents the flipped (F)-2T2R resistive random access memory (RRAM) compute cell enhancing the performance of RRAM-based mixed-signal accelerators for deep neural networks (DNNs) in machine-learning (ML) applications. The F-2T2R cell is designed to exploit the features of the FD-SOI technology and it achieves a large increase in cell output impedance, compared to the standard 1-transistor 1-resistor (1T1R) cell. The article also describes the modeling of an F-2T2R-based accelerator and its transistor-level implementation in a 22-nm FD-SOI technology. The modeling results and the accelerator performance are validated by simulation. The proposed design can achieve an energy efficiency of up to 1260 1-bit TOPS/W, with a memory array of 256 rows and columns. From the results of our analytical framework, a ResNet18, mapped on the accelerator, can obtain an accuracy reduction below 2%, with respect to the floating-point baseline, on the CIFAR-10 dataset.
Volume 9, Issue 2, Pages 159-167 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10233848
Citations: 0
3-D Logic Circuit Design-Oriented Electrothermal Modeling of Vertical Junctionless Nanowire FETs
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-08-28 DOI: 10.1109/JXCDC.2023.3309502
Sara Mannaa;Arnaud Poittevin;Cédric Marchand;Damien Deleruyelle;Bastien Deveautour;Alberto Bosio;Ian O’Connor;Chhandak Mukherjee;Yifan Wang;Houssem Rezgui;Marina Deng;Cristell Maneux;Jonas Müller;Sylvain Pelloquin;Konstantinos Moustakas;Guilhem Larrieu
This work presents new insights into 3-D logic circuit design with vertical junctionless nanowire FETs (VNWFETs) accounting for underlying electrothermal phenomena. Aided by the understanding of the nanoscale heat transport in VNWFETs through multiphysics simulations, the SPICE-compatible compact model captures temperature and trapping effects principally through a shift of the device threshold voltage. Circuit-level simulations indicate a strong impact of temperature variation on functionality and figures of merit, such as energy-delay products. Subsequent guidelines for design considerations are discussed that are intended to provide feedback for technology improvements.
Volume 9, Issue 2, Pages 116-123 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10232986
Citations: 0
INFORMATION FOR AUTHORS
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-06-29 DOI: 10.1109/JXCDC.2023.3277781
Vol. 9, No. 1, p. C3. PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10168535.pdf
Citations: 0
Special Topic on Nontraditional Devices, Circuits, and Architectures for Energy-Efficient Computing
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-06-26 DOI: 10.1109/JXCDC.2023.3280846
Sourav Dutta;Punyashloka Debashis;Amir Khosrowshahi
Recently, novel applications in the space of artificial intelligence (AI) such as solving constraint optimization problems, probabilistic inferencing, contextual adaptation, and continual learning from noisy data are gaining momentum to address relevant real-world problems. A majority of these tasks are compute and/or memory intensive. While traditional deep learning has been fueled by the utilization of graphics processing units (GPUs) to accelerate algorithms primarily in the cloud, today we see a surge in the development of application/domain-specific integrated circuits and systems that aim at providing an order-of-magnitude improvement over traditional GPU-based approaches in terms of energy efficiency and latency. This growing branch of research taps into the realms of neuronal dynamics, collective computing using dynamical systems, harnessing stochasticity to enable probabilistic computing, and even draws inspiration from quantum computing. We envision such specialized application/domain-specific systems performing complex tasks, such as solving NP-hard optimization problems and performing reasoning and cognition in the presence of uncertainty, with superior energy efficiency (and/or area and latency improvements) compared to conventional GPU-based approaches and von Neumann computing using traditional silicon-based devices, circuits, and architectures. Of special interest is the use of such nontraditional computing approaches to reduce the time to obtain solutions for computationally challenging problems, a time that otherwise tends to grow exponentially with problem size. To support this vision, there need to be fundamental advances in both nontraditional devices and circuits/architectures.
Recent works have shown that novel circuit topologies and architectures involving non-Boolean, oscillatory, spiking, probabilistic, or quantum-inspired computing are better suited to tackling applications such as solving constraint optimization problems, performing energy-based learning, performing Bayesian learning and inference, lifelong continual learning, and solving quantum-inspired applications such as Quantum Monte Carlo. A flurry of current research highlights that, compared to traditional silicon-based devices, emerging nanodevices utilizing novel quantum materials such as complex oxides, ferroelectric materials, and spintronic materials can allow the realization of these novel circuits and architectures with smaller footprint area, higher energy efficiency, and lower latency.
Vol. 9, No. 1, pp. iii-v. PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10163725.pdf
Citations: 0