
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits: Latest Articles

IEEE Journal on Exploratory Solid-State Computational Devices and Circuits publication information
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-06-01 DOI: 10.1109/JXCDC.2023.3277777
Volume 9, Issue 1, Pages C2-C2 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10171850.pdf
Citations: 0
Nontraditional Design of Dynamic Logics Using FDSOI for Ultra-Efficient Computing
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-04-21 DOI: 10.1109/JXCDC.2023.3269141
Shubham Kumar;Swetaki Chatterjee;Chetan Kumar Dabhi;Yogesh Singh Chauhan;Hussam Amrouch
In this article, we propose a nontraditional design of dynamic logic circuits using fully-depleted silicon-on-insulator (FDSOI) FETs. An FDSOI FET allows the threshold voltage ($V_{\text{t}}$) to be adjusted (i.e., switched between low-$V_{\text{t}}$ and high-$V_{\text{t}}$ states) through the back gate (BG) bias. Our design uses the front gate (FG) and BG of an FDSOI FET as the input terminals to build dynamic logic gates (such as NAND, NOR, AND, OR, XOR, and XNOR) and circuits (such as a half-adder and a full-adder). It requires fewer transistors to build dynamic logic gates and achieves high performance with low power dissipation compared to conventional dynamic logic designs. The industrial compact model of the FDSOI FET (BSIM-IMG) is used to simulate the dynamic logic gates and is fully calibrated to reproduce 14 nm FDSOI FET technology node data. Calibration is performed for both electrical characteristics and process variations. The simulation results show average improvements in transistor count, propagation delay, power, and power-delay product (PDP) of 23.43%, 57.16%, 47.05%, and 77.29%, respectively, compared to the conventional designs. Further, our design reduces the charge-sharing effect, which affects the drivability of dynamic logic gates. In addition, we analyze in detail the impact of process, supply voltage, and load capacitance variations on the propagation delay of the dynamic logic family. The results show that these variations have a minor impact on the propagation delay of the proposed FDSOI-based dynamic logic gates compared to the conventional dynamic logic gates.
Volume 9, Issue 1, Pages 74-82 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10106142.pdf
Citations: 0
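A quick consistency check on the figures quoted in the FDSOI abstract above: the power-delay product is power multiplied by delay, so fractional reductions in the two compose multiplicatively. The Python sketch below applies that to the quoted averages; the helper name `combined_reduction` is hypothetical and not from the paper. The result is about 77.3%, close to the reported 77.29%. The two need not agree exactly, since the abstract presumably averages per-gate results rather than composing the averages.

```python
def combined_reduction(delay_reduction: float, power_reduction: float) -> float:
    """Fractional PDP reduction implied by fractional delay and power reductions.

    PDP = power * delay, so the remaining fraction is the product of the
    remaining fractions of each factor.
    """
    return 1.0 - (1.0 - delay_reduction) * (1.0 - power_reduction)


# Average improvements quoted in the abstract, expressed as fractions.
delay_gain = 0.5716
power_gain = 0.4705

implied_pdp_gain = combined_reduction(delay_gain, power_gain)
print(f"implied PDP reduction: {implied_pdp_gain:.2%}")
```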
Parallel Matrix Multiplication Using Voltage-Controlled Magnetic Anisotropy Domain Wall Logic
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-04-20 DOI: 10.1109/JXCDC.2023.3266441
Nicholas Zogbi;Samuel Liu;Christopher H. Bennett;Sapan Agarwal;Matthew J. Marinella;Jean Anne C. Incorvia;T. Patrick Xiao
The domain wall-magnetic tunnel junction (DW-MTJ) is a versatile device that can simultaneously store data and perform computations. These three-terminal devices are promising for digital logic due to their nonvolatility, low-energy operation, and radiation hardness. Here, we augment the DW-MTJ logic gate with voltage-controlled magnetic anisotropy (VCMA) to improve the reliability of logical concatenation in the presence of realistic process variations. VCMA creates potential wells that allow for reliable and repeatable localization of domain walls (DWs). The DW-MTJ logic gate supports different fanouts, allowing for multiple inputs and outputs for a single device without affecting the area. We simulate a systolic array of DW-MTJ multiply-accumulate (MAC) units with 4-bit and 8-bit precision, which uses the nonvolatility of DW-MTJ logic gates to enable fine-grained pipelining and high parallelism. The DW-MTJ systolic array provides comparable throughput and efficiency to state-of-the-art CMOS systolic arrays while being radiation-hard. These results improve the feasibility of using DW-based processors, especially for extreme-environment applications such as space.
Volume 9, Issue 1, Pages 65-73 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10106129.pdf
Citations: 1
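The DW-MTJ abstract above refers to a systolic array of multiply-accumulate (MAC) units. As a purely behavioral illustration of that dataflow, and not a model of the DW-MTJ devices or of the authors' pipeline, the Python sketch below simulates a generic output-stationary systolic array: operands enter skewed by row and column index, and each processing element (PE) accumulates one output entry. The function name `systolic_matmul` is an assumption made for this example.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on a simulated output-stationary systolic array.

    Row i of `a` enters from the left delayed by i cycles and column j of `b`
    enters from the top delayed by j cycles, so PE (i, j) sees the operand pair
    a[i][s], b[s][j] at cycle t = s + i + j and adds their product to its
    local accumulator.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]
    for t in range(n + m + k - 2):          # last useful cycle is n + m + k - 3
        for i in range(n):
            for j in range(m):
                s = t - i - j               # index of the operand pair arriving now
                if 0 <= s < k:
                    acc[i][j] += a[i][s] * b[s][j]
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)
```

All PEs work in the same cycle on different operand pairs, which is the parallelism a systolic array exploits; the nested loops here only emulate that in software.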
A Stochastic Computing Scheme of Embedding Random Bit Generation and Processing in Computational Random Access Memory (SC-CRAM)
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-04-11 DOI: 10.1109/JXCDC.2023.3266136
Brandon R. Zink;Yang Lv;Masoud Zabihi;Husrev Cilasun;Sachin S. Sapatnekar;Ulya R. Karpuzcu;Marc D. Riedel;Jian-Ping Wang
Stochastic computing (SC) has emerged as a promising solution for performing complex functions on large amounts of data to meet future computing demands. However, the hardware needed to generate random bit-streams using conventional CMOS-based technologies drastically increases the area and delay cost. Area costs can be reduced using spintronics-based random number generators (RNGs); however, this does not alleviate the delay costs, since stochastic bit generation is still performed separately from the computation. In this article, we present an SC method of embedding stochastic bit generation and processing in a computational random access memory (CRAM) array, which we refer to as SC-CRAM. We demonstrate that SC-CRAM is a resilient and low-cost method for image processing, Bayesian inference systems, and Bayesian belief networks.
Volume 9, Issue 1, Pages 29-37 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10099030.pdf
Citations: 2
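For readers unfamiliar with the stochastic computing idea behind the SC-CRAM abstract above, the textbook unipolar example may help: a value p in [0, 1] is encoded as a bit-stream whose bits are 1 with probability p, and a bitwise AND of two independent streams then encodes the product. The Python sketch below uses a software pseudo-random generator in place of any hardware RNG; the function names and the stream length are assumptions for illustration and are not taken from the paper.

```python
import random


def bitstream(p, length, rng):
    """Unipolar stochastic bit-stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def sc_multiply(p_a, p_b, length=10_000, seed=0):
    """Estimate p_a * p_b by ANDing two independent stochastic bit-streams."""
    rng = random.Random(seed)
    a = bitstream(p_a, length, rng)
    b = bitstream(p_b, length, rng)
    product = [x & y for x, y in zip(a, b)]
    return sum(product) / length


estimate = sc_multiply(0.5, 0.4)
print(estimate)  # an estimate of 0.5 * 0.4; accuracy improves with stream length
```

The estimate is noisy by construction: its standard deviation shrinks only as the inverse square root of the stream length, which is why cheap, fast bit generation matters so much for this style of computing.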
A Generalized Block-Matrix Circuit for Closed-Loop Analog In-Memory Computing
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-04-10 DOI: 10.1109/JXCDC.2023.3265803
Piergiulio Mannocci;Daniele Ielmini
Matrix-based computing is ubiquitous in an increasing number of present-day machine learning applications such as neural networks, regression, and 5G communications. Conventional systems based on the von Neumann architecture are limited by the energy and latency bottleneck induced by the physical separation of the processing and memory units. In-memory computing (IMC) is a novel paradigm where computation is performed directly within the memory, thus eliminating the need for constant data transfer. IMC has shown exceptional throughput and energy efficiency when coupled with crosspoint arrays of resistive memory devices in open-loop matrix-vector multiplication and closed-loop inverse-matrix-vector multiplication (IMVM) accelerators. However, each application results in a different circuit topology, thus complicating the development of reconfigurable, general-purpose IMC systems. In this article, we present a generalized closed-loop IMVM circuit capable of performing any linear matrix operation by proper memory remapping. We derive closed-form equations for the ideal input-output transfer functions, static error, and dynamic behavior, introducing a novel continuous-time analytical model allowing for orders-of-magnitude simulation speedup with respect to SPICE-based solvers. The proposed circuit represents an ideal candidate for general-purpose accelerators of machine learning.
Volume 9, Issue 1, Pages 47-55 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10097860.pdf
Citations: 0
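The block-matrix abstract above says that different linear operations can be served by one inverse-matrix-vector (IMVM) primitive through remapping. The circuit's actual mapping is not given in the abstract, so the Python sketch below only illustrates the general idea with a textbook identity, assumed here for illustration: least-squares regression min ||y - Xw|| can be written as a single block linear system [[I, X], [X^T, 0]] [r; w] = [y; 0], where r is the residual. All function names are hypothetical, and the software solver merely stands in for the analog closed-loop solve.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def block_system(x, y):
    """Build [[I, X], [X^T, 0]] and the right-hand side [y; 0]."""
    n, p = len(x), len(x[0])
    top = [[1.0 if i == j else 0.0 for j in range(n)] + [float(v) for v in x[i]]
           for i in range(n)]
    bottom = [[float(x[i][j]) for i in range(n)] + [0.0] * p for j in range(p)]
    return top + bottom, [float(v) for v in y] + [0.0] * p


def least_squares(x, y):
    """Regression weights obtained from one inverse-matrix-vector product."""
    a, b = block_system(x, y)
    return solve(a, b)[len(x):]          # drop the residual part, keep w


# Points on the line y = 1 + 2 t, with an intercept column in X.
weights = least_squares([[1, 0], [1, 1], [1, 2]], [1, 3, 5])
print(weights)
```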
A Nonvolatile Compute-in-Memory Macro Using Voltage-Controlled MRAM and In Situ Magnetic-to-Digital Converter
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-03-17 DOI: 10.1109/JXCDC.2023.3258431
Vinod Kurian Jacob;Jiyue Yang;Haoran He;Puneet Gupta;Kang L Wang;Sudhakar Pamarti
Compute-in-memory (CIM) accelerators have become a popular solution for achieving high energy efficiency in deep learning applications on edge devices. Recent works have demonstrated CIM macros using nonvolatile memories [spin transfer torque (STT)-MRAM and resistive random access memory (RRAM)] to take advantage of their nonvolatility and high density. However, their effective computation dynamic range is far lower than that of their static random access memory (SRAM)-CIM counterparts due to the low device ON/OFF ratio. In this work, we combine a nonvolatile memory based on a voltage-controlled magnetic tunneling junction (VC-MTJ) device, called voltage-controlled MRAM or VC-MRAM, and accurate switched-capacitor-based CIM using a novel in situ magnetic-to-digital converter (MDC). The VC-MTJ device has demonstrated $10\times$ lower write energy and switching time compared to an STT-MRAM device and has comparable density, read energy, and read latency. The in situ MDCs embedded inside each VC-MRAM row convert magnetically stored weight information to CMOS logic levels and enable switched-capacitor-based multiply-accumulate (MAC) operation with accuracy comparable to the state-of-the-art SRAM-CIM. This article describes the schematic and layout level design of a VC-MRAM CIM macro in 28 nm. This is the first nonvolatile CIM design to enable analog MAC computation with 256 parallel rows turned ON simultaneously without degradation in dynamic range (< 1 LSB). Detailed circuit simulations including experimentally validated VC-MTJ compact models show $1.5\times$ higher energy efficiency and $2\times$ higher density compared to the state-of-the-art SRAM-based CIM.
Volume 9, Issue 1, Pages 56-64 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10075423.pdf
Citations: 1
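The VC-MRAM abstract above relies on switched-capacitor MAC operation across 256 rows. The idealized principle can be sketched as follows, under assumptions that are not taken from the paper: binary inputs and weights, one equal-valued capacitor per row, and no parasitics or noise. Each capacitor is charged to the supply voltage if its input AND weight is 1, and shorting all capacitors together leaves a shared voltage proportional to the count of 1 products. The name `charge_sharing_mac` and the 1.0 V default supply are hypothetical.

```python
def charge_sharing_mac(inputs, weights, vdd=1.0):
    """Ideal shared-node voltage of a switched-capacitor binary MAC.

    Each row's capacitor holds vdd * (x AND w). Charge conservation across N
    equal capacitors gives V = vdd * sum(x AND w) / N after they are shorted.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    products = [x & w for x, w in zip(inputs, weights)]
    return vdd * sum(products) / len(products)


rows = 256
v_out = charge_sharing_mac([1] * rows, [1, 0] * (rows // 2))
lsb = 1.0 / rows  # voltage step caused by a single row changing its product
print(v_out, lsb)
```

With 256 rows the output can take 257 distinct levels, so resolving every level requires the total analog error to stay below one such step.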
A High-Parallelism RRAM-Based Compute-In-Memory Macro With Intrinsic Impedance Boosting and In-ADC Computing
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-03-14 DOI: 10.1109/JXCDC.2023.3255788
Tian Xie;Shimeng Yu;Shaolan Li
Resistive random access memory (RRAM) is considered to be a promising compute-in-memory (CIM) platform; however, RRAM-based macros tend to lose energy efficiency quickly in high-throughput and high-resolution cases. Instead of using access transistors as switches, this work exploits their analog characteristics as common-gate current buffers, so that the cell current can be minimized and the output impedance boosted. The idea of In-ADC Computing (IAC) is also proposed to further decrease the complexity of the peripheral circuits. Benefiting from the proposed ideas, a pretrained VGG-8 network based on the CIFAR-10 dataset can be implemented, and an accuracy of 87.2% is achieved with 8.9 TOPS/W energy efficiency (for 8-bit multiply-and-accumulate (MAC) operation), demonstrating that the proposed techniques enable low-distortion partial sum results while still being able to operate in a power-efficient way.
Volume 9, Issue 1, Pages 38-46 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10070378.pdf
Citations: 1
A Full-Stack View of Probabilistic Computing With p-Bits: Devices, Architectures, and Algorithms
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-03-14 DOI: 10.1109/JXCDC.2023.3256981
Shuvro Chowdhury;Andrea Grimaldi;Navid Anjum Aadit;Shaila Niazi;Masoud Mohseni;Shun Kanai;Hideo Ohno;Shunsuke Fukami;Luke Theogarajan;Giovanni Finocchio;Supriyo Datta;Kerem Y. Camsari
The transistor celebrated its 75th birthday in 2022. The scaling of the transistor described by Moore’s law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning (ML) and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.
Volume 9, Issue 1, Pages 1-11 PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10068500.pdf
Citations: 16
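The p-bit abstract above does not spell out how a p-bit behaves. In the probabilistic-computing literature it is commonly modeled as a binary stochastic unit whose output is +1 or -1, with an average that follows tanh of its input. The Python sketch below implements that commonly used update rule for a single p-bit in software; it is an illustration under that assumption, not code from the review, and the names `pbit` and `sample_mean` are hypothetical.

```python
import math
import random


def pbit(inp, rng):
    """Return +1 or -1; the probability of +1 is (1 + tanh(inp)) / 2."""
    return 1 if math.tanh(inp) > rng.uniform(-1.0, 1.0) else -1


def sample_mean(inp, n=20_000, seed=1):
    """Average output of one p-bit held at a fixed input."""
    rng = random.Random(seed)
    return sum(pbit(inp, rng) for _ in range(n)) / n


for inp in (-2.0, 0.0, 0.5, 2.0):
    # The empirical mean should track tanh(inp) up to sampling noise.
    print(inp, round(sample_mean(inp), 3), round(math.tanh(inp), 3))
```

In a network, each p-bit's input would be a weighted sum of the other p-bits' outputs plus a bias, so that repeated updates draw samples from a Boltzmann-like distribution.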
Oscillator-Inspired Dynamical Systems to Solve Boolean Satisfiability
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-01-31 DOI: 10.1109/JXCDC.2023.3241045
Mohammad Khairul Bashar;Zongli Lin;Nikhil Shukla
Dynamical systems can offer a novel non-Boolean approach to computing. Specifically, the natural minimization of energy in the system is a valuable property for minimizing the objective functions of combinatorial optimization problems, many of which are still challenging to solve using conventional digital solvers. In this work, we design two oscillator-inspired dynamical systems to solve quintessential computationally intractable problems in Boolean satisfiability (SAT). The system dynamics are engineered such that they facilitate solutions to two different flavors of the SAT problem. We formulate the first dynamical system to compute the solution to the 3-SAT problem, while for the second system, we show that its dynamics map to the solution of the Max-not-all-equal (NAE)-3-SAT problem. Our work advances our understanding of how this physics-inspired approach can be used to address challenging problems in computing.
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, vol. 9, no. 1, pp. 12-20. PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10032530.pdf
Citations: 3
Dynamical System-Based Computational Models for Solving Combinatorial Optimization on Hypergraphs
IF 2.4 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2023-01-09 DOI: 10.1109/JXCDC.2023.3235113
Mohammad Khairul Bashar;Antik Mallick;Avik W. Ghosh;Nikhil Shukla
The intrinsic energy minimization in dynamical systems offers a valuable tool for minimizing the objective functions of computationally challenging problems in combinatorial optimization. However, most prior works have focused on mapping such dynamics to combinatorial optimization problems whose objective functions have quadratic degree [e.g., maximum cut (MaxCut)]; such problems can be represented and analyzed using graphs. However, the work on developing such models for problems that need objective functions with degree greater than two, and subsequently, entail the use of hypergraph data structures, is relatively sparse. In this work, we develop dynamical system-inspired computational models for several such problems. Specifically, we define the “energy function” for hypergraph-based combinatorial problems ranging from Boolean Satisfiability (SAT) and its variants to integer factorization, and subsequently, define the resulting system dynamics. We also show that the design approach is applicable to optimization problems with quadratic degree, and use it to develop a new dynamical system formulation for minimizing the Ising Hamiltonian. Our work not only expands on the scope of problems that can be directly mapped to, and solved using physics-inspired models, but also creates new opportunities to design high-performance accelerators for solving combinatorial optimization.
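To make the "degree greater than two" point concrete, the sketch below writes a standard 3-SAT clause penalty as a product of three spin-dependent factors, which is cubic in the spins and therefore couples three variables at once (a hyperedge). This is a minimal sketch under stated assumptions: the function names, the `(variable_index, sign)` clause encoding, and the example instance are hypothetical, and the discrete greedy single-flip descent is a simple stand-in for illustration, not the continuous system dynamics defined in the article.

```python
def clause_energy(spins, clause):
    """Return 1.0 if every literal in the clause is false, else 0.0.

    spins use +1 for True and -1 for False; each literal is (variable_index, sign).
    The product of three factors makes the penalty cubic in the spins.
    """
    energy = 1.0
    for var, sign in clause:
        energy *= (1 - sign * spins[var]) / 2
    return energy

def sat_energy(spins, clauses):
    """Total energy = number of unsatisfied clauses."""
    return sum(clause_energy(spins, cl) for cl in clauses)

def greedy_descent(spins, clauses, max_sweeps=100):
    """Repeatedly flip the single spin that lowers the energy the most."""
    spins = list(spins)
    for _ in range(max_sweeps):
        best_var, best_energy = None, sat_energy(spins, clauses)
        for v in range(len(spins)):
            spins[v] = -spins[v]          # trial flip
            trial = sat_energy(spins, clauses)
            if trial < best_energy:
                best_var, best_energy = v, trial
            spins[v] = -spins[v]          # undo trial flip
        if best_var is None:              # no flip improves: local minimum
            break
        spins[best_var] = -spins[best_var]
    return spins, sat_energy(spins, clauses)

# Hypothetical instance:
# (x0 or x1 or x2) and (not x0 or x1 or not x2) and (x0 or not x1 or x2)
CLAUSES = [((0, 1), (1, 1), (2, 1)),
           ((0, -1), (1, 1), (2, -1)),
           ((0, 1), (1, -1), (2, 1))]

if __name__ == "__main__":
    print(greedy_descent([-1, -1, -1], CLAUSES))
```

Greedy descent can stall in a local minimum on harder instances, so zero energy is not guaranteed in general.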
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits, vol. 9, no. 1, pp. 21-28. PDF: https://ieeexplore.ieee.org/iel7/6570653/10138050/10011425.pdf
Citations: 5