An 8-bit 20.7 TOPS/W Multilevel Cell ReRAM Macro With ADC-Assisted Bit-Serial Processing
Justin M. Correll; Lu Jie; Seungheun Song; Seungjong Lee; Junkang Zhu; Wei Tang; Luke Wormald; Jack Erhardt; Nicolas Breil; Roger Quon; Deepak Kamalanathan; Siddarth Krishnan; Michael Chudzik; Wei D. Lu; Zhengya Zhang; Michael P. Flynn
IEEE Journal of Solid-State Circuits, vol. 60, no. 8, pp. 2995-3008. Published 2025-02-21.
DOI: 10.1109/JSSC.2025.3540114
https://ieeexplore.ieee.org/document/10899870/
Citations: 0
Abstract
Analog compute-in-memory (CIM) with multilevel cell (MLC) resistive random access memory (ReRAM) promises highly dense and efficient compute support for machine learning and scientific computing. This article introduces analog-to-digital converter (ADC)-assisted bit-serial processing for efficient, high-throughput compute. Bit-serial digital-to-analog converters (DACs) and 8-bit binary-weighted multicycle sampling (BWMCS) ADCs perform analog vector-matrix multiplication (VMM) on MLC-based crossbar arrays. A direct-drive $g_m$-boosted transimpedance amplifier (TIA) enables high-speed crossbar readout. We present a system-on-chip (SoC) prototype consisting of four self-contained ReRAM-based CIM macros and a reduced instruction set computer-five (RISC-V) host. The test chip is fabricated in 65 nm CMOS with foundry-integrated MLC ReRAM. We trained LeNet1 for handwritten digit classification and mapped the CNN weights differentially to 3-bit MLC ReRAM across multiple CIM macros. The classification accuracy loss is 1.6% compared to the quantization-aware trained model. The measured raw and normalized peak efficiencies are 20.7 and 662 TOPS/W, respectively. The compute density is 8.4 TOPS/mm$^2$.
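The bit-serial scheme and the differential weight mapping mentioned in the abstract can be illustrated with a small integer model. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`to_differential`, `bit_serial_vmm`, `direct_vmm`) are hypothetical, each 3-bit cell is assumed to store an ideal level 0 to 7 proportional to its conductance, a signed weight is assumed to be split across one positive and one negative cell, and the binary weighting of partial sums is done here with plain integer arithmetic. In the macro, the abstract attributes that weighting to the BWMCS ADC, whose circuit behavior (along with noise, nonlinearity, and ADC quantization) is not modeled.

```python
def to_differential(w, levels=8):
    """Split a signed integer weight into (positive, negative) cell levels.

    Assumption: a cell with `levels` states stores an ideal level 0..levels-1,
    so one differential pair covers weights in [-(levels-1), levels-1].
    """
    if abs(w) >= levels:
        raise ValueError("weight does not fit in one differential cell pair")
    return (w, 0) if w >= 0 else (0, -w)


def bit_serial_vmm(inputs, weights, n_bits=8):
    """Multiply unsigned n_bits inputs by a signed weight matrix,
    processing one input bit-plane per cycle."""
    pairs = [[to_differential(w) for w in row] for row in weights]
    n_cols = len(weights[0])
    acc = [0] * n_cols
    for b in range(n_bits):
        # Bit-plane b: every row is driven with a 0/1 value this cycle.
        plane = [(x >> b) & 1 for x in inputs]
        for j in range(n_cols):
            # Idealized column sums for the positive and negative cells.
            pos = sum(p * row[j][0] for p, row in zip(plane, pairs))
            neg = sum(p * row[j][1] for p, row in zip(plane, pairs))
            # Weight this cycle's partial result by 2**b and accumulate.
            acc[j] += (pos - neg) * (1 << b)
    return acc


def direct_vmm(inputs, weights):
    """Reference result computed with ordinary multiply-accumulate."""
    return [sum(x * row[j] for x, row in zip(inputs, weights))
            for j in range(len(weights[0]))]


if __name__ == "__main__":
    x = [3, 255, 0, 17]                        # 8-bit unsigned inputs
    w = [[1, -2], [7, 0], [-7, 5], [4, -4]]    # signed weights in [-7, 7]
    print(bit_serial_vmm(x, w), direct_vmm(x, w))
```

In this idealized model the bit-serial result equals the direct product exactly, because each input is just the sum of its bit-planes scaled by powers of two. Any accuracy loss in hardware comes from effects the sketch leaves out.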
About the journal:
The IEEE Journal of Solid-State Circuits publishes papers each month in the broad area of solid-state circuits with particular emphasis on transistor-level design of integrated circuits. It also provides coverage of topics such as circuits modeling, technology, systems design, layout, and testing that relate directly to IC design. Integrated circuits and VLSI are of principal interest; material related to discrete circuit design is seldom published. Experimental verification is strongly encouraged.