An Integer-Floating-Point Dual-Mode Gain-Cell Computing-in-Memory Macro for Advanced AI Edge Chips

IEEE Journal of Solid-state Circuits, vol. 60, no. 1, pp. 158-170. Pub Date: 2024-10-15. DOI: 10.1109/JSSC.2024.3470215
IF 5.6. Zone 1 (Engineering Technology). JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Ping-Chun Wu; Win-San Khwa; Jui-Jen Wu; Jian-Wei Su; Chuan-Jia Jhang; Ho-Yu Chen; Zhao-En Ke; Ting-Chien Chiu; Jun-Ming Hsu; Chiao-Yen Cheng; Yu-Chen Chen; Chung-Chuan Lo; Ren-Shuo Liu; Chih-Cheng Hsieh; Kea-Tiong Tang; Meng-Fan Chang
Source: https://ieeexplore.ieee.org/document/10716755/
Citations: 0

Abstract

This article presents a novel integer-floating-point (INT-FP) gain-cell (GC)-computing-in-memory (CIM) structure for high-precision multiply-and-accumulate (MAC) operations with high computational flexibility, energy efficiency, and inference accuracy. The proposed device employs: 1) a dual-mode zone-based input processing scheme (ZB-IPS) aimed at eliminating exponent subtraction in order to enhance energy and area efficiency (AEF); 2) a dual-mode local computing cell (DM-LCC) to reuse exponent addition as an adder tree stage for INT-MAC to enhance AEF in both INT and floating-point (FP) modes; and 3) a stationary-based two-port GC array (SB-TP-GCA) to enable concurrent data updates and computation while reducing system-to-CIM and internal data accesses to improve energy efficiency. A 16-nm FinFET 108-kb GC-CIM macro fabricated using 4T gain cells (GCs) achieved energy efficiency of 99.5 TOPS/W in INT-MAC operations involving 128 accumulations of 8b-input, 8b-weight, and 23b-output; and 46.4 TFLOPS/W in FP-MAC operations involving 64 accumulations of BF16-input, BF16-weight, and FP32-output.
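
The 23b-output width quoted for the INT-MAC configuration is consistent with plain worst-case bit-width arithmetic: an 8b x 8b product needs 16 bits, and summing 128 such products adds log2(128) = 7 more. The sketch below checks this by brute-force bounds. It is only an illustration and not taken from the paper: the function name `accumulator_bits` is hypothetical, and because the abstract does not say whether operands are signed, both cases are evaluated.

```python
def accumulator_bits(input_bits: int, weight_bits: int,
                     n_acc: int, signed: bool = True) -> int:
    """Smallest integer width that holds any sum of n_acc products.

    Hypothetical helper for illustration; signedness is an assumption,
    since the abstract does not specify it.
    """
    if signed:
        # Two's-complement operand ranges.
        lo_in, hi_in = -(1 << (input_bits - 1)), (1 << (input_bits - 1)) - 1
        lo_w, hi_w = -(1 << (weight_bits - 1)), (1 << (weight_bits - 1)) - 1
    else:
        lo_in, hi_in = 0, (1 << input_bits) - 1
        lo_w, hi_w = 0, (1 << weight_bits) - 1

    # The extreme products occur at the corners of the operand ranges.
    corners = [a * b for a in (lo_in, hi_in) for b in (lo_w, hi_w)]
    hi = max(corners) * n_acc  # worst-case positive sum
    lo = min(corners) * n_acc  # worst-case negative sum (0 if unsigned)

    if not signed:
        return hi.bit_length()

    # Grow the two's-complement width until both extremes fit.
    width = 1
    while lo < -(1 << (width - 1)) or hi > (1 << (width - 1)) - 1:
        width += 1
    return width


if __name__ == "__main__":
    for signed in (True, False):
        bits = accumulator_bits(8, 8, 128, signed=signed)
        print(f"signed={signed}: {bits} bits")
```

Under either signedness assumption, the 128-accumulation case with 8b inputs and 8b weights comes out at 23 bits, which matches the output width stated in the abstract.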
Source journal
IEEE Journal of Solid-state Circuits (Engineering Technology: Electrical and Electronic Engineering)
CiteScore: 11.00
Self-citation rate: 20.40%
Publication volume: 351
Review time: 3-6 weeks
Journal description: The IEEE Journal of Solid-State Circuits publishes papers each month in the broad area of solid-state circuits with particular emphasis on transistor-level design of integrated circuits. It also provides coverage of topics such as circuits modeling, technology, systems design, layout, and testing that relate directly to IC design. Integrated circuits and VLSI are of principal interest; material related to discrete circuit design is seldom published. Experimental verification is strongly encouraged.