Pub Date : 2023-04-11DOI: 10.1109/JXCDC.2023.3266136
Brandon R. Zink;Yang Lv;Masoud Zabihi;Husrev Cilasun;Sachin S. Sapatnekar;Ulya R. Karpuzcu;Marc D. Riedel;Jian-Ping Wang
Stochastic computing (SC) has emerged as a promising solution for performing complex functions on large amounts of data to meet future computing demands. However, the hardware needed to generate random bit-streams using conventional CMOS-based technologies drastically increases the area and delay cost. Area costs can be reduced using spintronics-based random number generators (RNGs), and however, this will not alleviate the delay costs since stochastic bit generation is still performed separately from the computation. In this article, we present an SC method of embedding stochastic bit generation and processing in a computational random access memory (CRAM) array, which we refer to as SC-CRAM. We demonstrate that SC-CRAM is a resilient and low-cost method for image processing, Bayesian inference systems, and Bayesian belief networks.
{"title":"A Stochastic Computing Scheme of Embedding Random Bit Generation and Processing in Computational Random Access Memory (SC-CRAM)","authors":"Brandon R. Zink;Yang Lv;Masoud Zabihi;Husrev Cilasun;Sachin S. Sapatnekar;Ulya R. Karpuzcu;Marc D. Riedel;Jian-Ping Wang","doi":"10.1109/JXCDC.2023.3266136","DOIUrl":"10.1109/JXCDC.2023.3266136","url":null,"abstract":"Stochastic computing (SC) has emerged as a promising solution for performing complex functions on large amounts of data to meet future computing demands. However, the hardware needed to generate random bit-streams using conventional CMOS-based technologies drastically increases the area and delay cost. Area costs can be reduced using spintronics-based random number generators (RNGs), and however, this will not alleviate the delay costs since stochastic bit generation is still performed separately from the computation. In this article, we present an SC method of embedding stochastic bit generation and processing in a computational random access memory (CRAM) array, which we refer to as SC-CRAM. We demonstrate that SC-CRAM is a resilient and low-cost method for image processing, Bayesian inference systems, and Bayesian belief networks.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 1","pages":"29-37"},"PeriodicalIF":2.4,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/10138050/10099030.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43855515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-04-10DOI: 10.1109/JXCDC.2023.3265803
Piergiulio Mannocci;Daniele Ielmini
Matrix-based computing is ubiquitous in an increasing number of present-day machine learning applications such as neural networks, regression, and 5G communications. Conventional systems based on von-Neumann architecture are limited by the energy and latency bottleneck induced by the physical separation of the processing and memory units. In-memory computing (IMC) is a novel paradigm where computation is performed directly within the memory, thus eliminating the need for constant data transfer. IMC has shown exceptional throughput and energy efficiency when coupled with crosspoint arrays of resistive memory devices in open-loop matrix-vector-multiplication and closed-loop inverse-matrix-vector multiplication (IMVM) accelerators. However, each application results in a different circuit topology, thus complicating the development of reconfigurable, general-purpose IMC systems. In this article, we present a generalized closed-loop IMVM circuit capable of performing any linear matrix operation by proper memory remapping. We derive closed-form equations for the ideal input-output transfer functions, static error, and dynamic behavior, introducing a novel continuous-time analytical model allowing for orders-of-magnitude simulation speedup with respect to SPICE-based solvers. The proposed circuit represents an ideal candidate for general-purpose accelerators of machine learning.
{"title":"A Generalized Block-Matrix Circuit for Closed-Loop Analog In-Memory Computing","authors":"Piergiulio Mannocci;Daniele Ielmini","doi":"10.1109/JXCDC.2023.3265803","DOIUrl":"10.1109/JXCDC.2023.3265803","url":null,"abstract":"Matrix-based computing is ubiquitous in an increasing number of present-day machine learning applications such as neural networks, regression, and 5G communications. Conventional systems based on von-Neumann architecture are limited by the energy and latency bottleneck induced by the physical separation of the processing and memory units. In-memory computing (IMC) is a novel paradigm where computation is performed directly within the memory, thus eliminating the need for constant data transfer. IMC has shown exceptional throughput and energy efficiency when coupled with crosspoint arrays of resistive memory devices in open-loop matrix-vector-multiplication and closed-loop inverse-matrix-vector multiplication (IMVM) accelerators. However, each application results in a different circuit topology, thus complicating the development of reconfigurable, general-purpose IMC systems. In this article, we present a generalized closed-loop IMVM circuit capable of performing any linear matrix operation by proper memory remapping. We derive closed-form equations for the ideal input-output transfer functions, static error, and dynamic behavior, introducing a novel continuous-time analytical model allowing for orders-of-magnitude simulation speedup with respect to SPICE-based solvers. The proposed circuit represents an ideal candidate for general-purpose accelerators of machine learning.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"9 1","pages":"47-55"},"PeriodicalIF":2.4,"publicationDate":"2023-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/10138050/10097860.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45573365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-17DOI: 10.1109/JXCDC.2023.3258431
Vinod Kurian Jacob;Jiyue Yang;Haoran He;Puneet Gupta;Kang L Wang;Sudhakar Pamarti
Compute-in-memory (CIM) accelerator has become a popular solution to achieve high energy efficiency for deep learning applications in edge devices. Recent works have demonstrated CIM macros using nonvolatile memories [spin transfer torque (STT)-MRAM and resistive random access memory (RRAM)] to take advantages of their nonvolatility and high density. However, effective computation dynamic range is far lower than their static random access memory (SRAM)-CIM counterparts due to low device ON/ OFF ratio. In this work, we combine a nonvolatile memory based on a voltage-controlled magnetic tunneling junction (VC-MTJ) device, called voltage-controlled MRAM or VC-MRAM, and accurate switched-capacitor-based CIM using a novel in situ magnetic-to-digital converter (MDC). The VC-MTJ device has demonstrated $10times $