Pub Date : 2024-11-11DOI: 10.1109/JXCDC.2024.3495612
Nellie Laleni;Franz Müller;Gonzalo Cuñarro;Thomas Kämpfe;Taekwang Jang
This article introduces a 1FeFET-1Capacitance (1F1C) macro based on a 2-bit ferroelectric field-effect transistor (FeFET) cell operating in the charge domain, marking a significant advancement in nonvolatile memory (NVM) and compute-in-memory (CIM). Traditionally, NVMs, such as FeFETs or resistive RAMs (RRAMs), have operated in a single-bit fashion, limiting their computational density and throughput. In contrast, the proposed 2-bit FeFET cell enables higher storage density and improves the computational efficiency in CIM architectures. The macro achieves 111.6 TOPS/W, highlighting its energy efficiency, and demonstrates robust performance on the CIFAR-10 dataset, achieving 89% accuracy with a VGG-8 neural network. These findings underscore the potential of charge-domain, multilevel NVM cells in pushing the boundaries of artificial intelligence (AI) acceleration and energy-efficient computing.
{"title":"A High-Efficiency Charge-Domain Compute-in-Memory 1F1C Macro Using 2-bit FeFET Cells for DNN Processing","authors":"Nellie Laleni;Franz Müller;Gonzalo Cuñarro;Thomas Kämpfe;Taekwang Jang","doi":"10.1109/JXCDC.2024.3495612","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3495612","url":null,"abstract":"This article introduces a 1FeFET-1Capacitance (1F1C) macro based on a 2-bit ferroelectric field-effect transistor (FeFET) cell operating in the charge domain, marking a significant advancement in nonvolatile memory (NVM) and compute-in-memory (CIM). Traditionally, NVMs, such as FeFETs or resistive RAMs (RRAMs), have operated in a single-bit fashion, limiting their computational density and throughput. In contrast, the proposed 2-bit FeFET cell enables higher storage density and improves the computational efficiency in CIM architectures. The macro achieves 111.6 TOPS/W, highlighting its energy efficiency, and demonstrates robust performance on the CIFAR-10 dataset, achieving 89% accuracy with a VGG-8 neural network. These findings underscore the potential of charge-domain, multilevel NVM cells in pushing the boundaries of artificial intelligence (AI) acceleration and energy-efficient computing.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"153-160"},"PeriodicalIF":2.0,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10750057","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142825857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-performance edge artificial intelligence (Edge-AI) inference applications aim for high energy efficiency, memory density, and small form factor, requiring a design-space exploration across the whole stack—workloads, architecture, mapping, and co-optimization with emerging technology. In this article, we present a system-technology co-optimization (STCO) framework that interfaces with workload-driven system scaling challenges and physical design-enabled technology offerings. The framework is built on three engines that provide the physical design characterization, dataflow mapping optimizer, and system efficiency predictor. The framework builds on a systolic array accelerator to provide the design-technology characterization points using advanced imec A10 nanosheet CMOS node along with emerging, high-density voltage-gated spin-orbit torque (VGSOT) magnetic memories (MRAM), combined with memory-on-logic fine-pitch 3-D wafer-to-wafer hybrid bonding. We observe that the 3-D system integration of static random-access memory (SRAM)-based design leads to 9% power savings with 53% footprint reduction at iso-frequency with respect to 2-D implementation for the same memory capacity. Three-dimensional nonvolatile memory (NVM)-VGSOT allows $4times $