A 818–4094 TOPS/W Capacitor-Reconfigured Analog CIM for Unified Acceleration of CNNs and Transformers

IEEE Journal of Solid-State Circuits · Published: 2024-09-24 · DOI: 10.1109/JSSC.2024.3457898 · IF 5.6 · JCR Q1 (Engineering, Electrical & Electronic)
Kentaro Yoshioka

Abstract

The rapid evolution of machine learning has led to the emergence of diverse neural network architectures, such as CNNs, Transformers, and their hybrid models, each with unique computational precision requirements. Transformers, in particular, demand higher precision compared to CNNs. Existing analog compute-in-memory (ACIM) solutions primarily cater to CNNs and struggle to achieve the high precision necessary for Transformers, despite their promise in addressing the memory bottleneck. To bridge this gap, we propose a capacitor-reconfigured CIM (CR-CIM) macro that introduces dual-mode operation, dynamically switching between high-precision and high-efficiency modes based on the active DNN layer. In the CNN mode, the CR-CIM employs bit-parallel computation and an 8-bit ADC to maximize power efficiency, exploiting the inherent error tolerance of CNNs. In contrast, for the Transformer mode, the CR-CIM switches to bit-serial computation and a 10-bit ADC to boost the compute signal-to-noise ratio (CSNR), ensuring the higher precision required by Transformers. This dual-mode functionality of the proposed CR-CIM is enabled by three key technologies: 1) a novel CR-CIM architecture and cell structure; 2) a resource-efficient multi-bit driver for bit-parallel computation; and 3) a software-analog co-design (SAC) strategy for enhanced Transformer computation. Our CR-CIM prototype is the first ACIM design to enable optimized operation for both Transformers and CNNs. CR-CIM achieves 45-dB signal-to-quantization-noise ratio (SQNR) and 31-dB CSNR (8-bit input and 8-bit weight bit-serial MAC) in the Transformer mode and a peak power efficiency of 4094 TOPS/W (normalized to 1-bit × 1-bit MAC) in the CNN mode.
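The bit-serial Transformer mode described above — applying an 8-bit activation one bit-plane per cycle, digitizing each 1-bit analog partial sum with the column ADC, then shifting and accumulating digitally — can be illustrated with a toy numerical sketch. This is not the paper's circuit model: the column length `N`, the uniform mid-tread ADC model, and the data distributions are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256  # assumed MAC column length (illustrative, not from the paper)

def adc(v, bits, full_scale):
    """Uniform mid-tread quantizer standing in for the column ADC."""
    step = 2.0 * full_scale / (1 << bits)
    return np.round(v / step) * step

def bit_serial_mac(w, x, adc_bits):
    """8b x 8b MAC computed as 8 shifted 1-bit 'analog' MACs."""
    full_scale = max(np.sum(np.abs(w)), 1)  # max |1-bit partial sum|
    acc = 0.0
    for b in range(8):
        bit_plane = (x >> b) & 1                 # one input bit-plane
        partial = float(np.dot(w, bit_plane))    # analog 1-bit MAC
        acc += (1 << b) * adc(partial, adc_bits, full_scale)
    return acc

def sqnr_db(adc_bits, trials=200):
    """SQNR of the quantized MAC vs. the ideal digital result, in dB."""
    sig = noise = 0.0
    for _ in range(trials):
        w = rng.integers(-128, 128, N)   # 8-bit signed weights
        x = rng.integers(0, 256, N)      # 8-bit unsigned activations
        ideal = float(np.dot(w, x))
        est = bit_serial_mac(w, x, adc_bits)
        sig += ideal ** 2
        noise += (est - ideal) ** 2
    return 10.0 * np.log10(sig / max(noise, 1e-12))

sqnr_8b = sqnr_db(8)    # CNN-mode ADC resolution
sqnr_10b = sqnr_db(10)  # Transformer-mode ADC resolution
```

In this simplified model, moving from an 8-bit to a 10-bit ADC shrinks the quantization step of every bit-plane readout by 4×, cutting the quantization-noise power by roughly 16× — the same lever the Transformer mode pulls to raise CSNR at the cost of conversion energy.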
Source journal: IEEE Journal of Solid-State Circuits (Engineering: Electrical & Electronic)
CiteScore: 11.00 · Self-citation rate: 20.40% · Articles per year: 351 · Review time: 3–6 weeks
Journal description: The IEEE Journal of Solid-State Circuits publishes papers each month in the broad area of solid-state circuits with particular emphasis on transistor-level design of integrated circuits. It also provides coverage of topics such as circuit modeling, technology, systems design, layout, and testing that relate directly to IC design. Integrated circuits and VLSI are of principal interest; material related to discrete circuit design is seldom published. Experimental verification is strongly encouraged.