用于边缘应用的28nm无静电全并行rram TD CIM微型机,具有1982 TOPS/W/Bit

IF 2.2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE IEEE Solid-State Circuits Letters Pub Date : 2024-12-19 DOI:10.1109/LSSC.2024.3520593
Songtao Wei;Peng Yao;Xinying Guo;Dong Wu;Lu Jie;Qi Qin;Bin Gao;Jianshi Tang;He Qian;Sining Pan;Huaqiang Wu
{"title":"用于边缘应用的28nm无静电全并行rram TD CIM微型机,具有1982 TOPS/W/Bit","authors":"Songtao Wei;Peng Yao;Xinying Guo;Dong Wu;Lu Jie;Qi Qin;Bin Gao;Jianshi Tang;He Qian;Sining Pan;Huaqiang Wu","doi":"10.1109/LSSC.2024.3520593","DOIUrl":null,"url":null,"abstract":"Analog computing in memory (CIM) based on resistive nonvolatile memory (NVM) has encountered several issues, such as low parallelism, low computing accuracy, and considerable power consumption. In this letter, a temporal unit based on design technology co-optimization (DTCO) for resistive random access memory is proposed for the first time, with the advantage of eliminating dc current and reducing the deviation of mapped weight. A time-domain (TD) array based on the proposed temporal unit features performing fully parallel matrix-vector multiplication (MVM) in a static-power-free manner, without the consideration of IR drop and limited sensing margin (SM). Besides, a low-power time-digital converter (TDC) with local offset elimination further boosts energy efficiency (EF) and computing accuracy. The fabricated 28-nm TD CIM macro achieves a state-of-the-art normalized EF of 1982 and 1387 TOPS/W/bit under 1b-input, ternary-weight and 4b-input, signed 4b-weight, respectively.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"21-24"},"PeriodicalIF":2.2000,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A 28-nm Static-Power-Free Fully Parallel RRAM-Based TD CIM Macro With 1982 TOPS/W/Bit for Edge Applications\",\"authors\":\"Songtao Wei;Peng Yao;Xinying Guo;Dong Wu;Lu Jie;Qi Qin;Bin Gao;Jianshi Tang;He Qian;Sining Pan;Huaqiang Wu\",\"doi\":\"10.1109/LSSC.2024.3520593\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Analog computing in memory (CIM) based on resistive nonvolatile memory (NVM) has encountered several issues, such as low parallelism, low computing accuracy, and considerable power consumption. In this letter, a temporal unit based on design technology co-optimization (DTCO) for resistive random access memory is proposed for the first time, with the advantage of eliminating dc current and reducing the deviation of mapped weight. A time-domain (TD) array based on the proposed temporal unit features performing fully parallel matrix-vector multiplication (MVM) in a static-power-free manner, without the consideration of IR drop and limited sensing margin (SM). Besides, a low-power time-digital converter (TDC) with local offset elimination further boosts energy efficiency (EF) and computing accuracy. The fabricated 28-nm TD CIM macro achieves a state-of-the-art normalized EF of 1982 and 1387 TOPS/W/bit under 1b-input, ternary-weight and 4b-input, signed 4b-weight, respectively.\",\"PeriodicalId\":13032,\"journal\":{\"name\":\"IEEE Solid-State Circuits Letters\",\"volume\":\"8 \",\"pages\":\"21-24\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2024-12-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Solid-State Circuits Letters\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10808165/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Solid-State Circuits Letters","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10808165/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

摘要

基于电阻性非易失性存储器(NVM)的内存模拟计算(CIM)遇到了几个问题,例如低并行性、低计算精度和相当大的功耗。本文首次提出了一种基于设计技术协同优化(DTCO)的时间单元电阻随机存取存储器,具有消除直流电流和减小映射权值偏差的优点。基于时间单元的时域(TD)阵列以静态无功率方式执行完全并行矩阵向量乘法(MVM),而不考虑红外下降和有限感知裕度(SM)。此外,采用局部偏移消除的低功耗时数转换器进一步提高了能效和计算精度。制备的28纳米TD CIM宏分别在1b输入、三元输入和4b输入下实现了1982和1387 TOPS/W/bit的归一化EF。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A 28-nm Static-Power-Free Fully Parallel RRAM-Based TD CIM Macro With 1982 TOPS/W/Bit for Edge Applications
Analog computing in memory (CIM) based on resistive nonvolatile memory (NVM) has encountered several issues, such as low parallelism, low computing accuracy, and considerable power consumption. In this letter, a temporal unit based on design technology co-optimization (DTCO) for resistive random access memory is proposed for the first time, with the advantage of eliminating dc current and reducing the deviation of mapped weight. A time-domain (TD) array based on the proposed temporal unit features performing fully parallel matrix-vector multiplication (MVM) in a static-power-free manner, without the consideration of IR drop and limited sensing margin (SM). Besides, a low-power time-digital converter (TDC) with local offset elimination further boosts energy efficiency (EF) and computing accuracy. The fabricated 28-nm TD CIM macro achieves a state-of-the-art normalized EF of 1982 and 1387 TOPS/W/bit under 1b-input, ternary-weight and 4b-input, signed 4b-weight, respectively.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Solid-State Circuits Letters
IEEE Solid-State Circuits Letters Engineering-Electrical and Electronic Engineering
CiteScore
4.30
自引率
3.70%
发文量
52
期刊最新文献
A 1.54 pJ/b 80 Gb/s D-Band 2-D Scalable Transceiver Array With On-Chip Antennas in 28-nm Bulk CMOS A 0.86 mW 17 fA/√Hz, 129-dB DR Current-Sensing Front-End for Under-Display Ambient Light Sensor With Zero-Compensated Logarithmic TIA 20–26-GHz CMOS PA With High Pout and OP1 dB Using a 1:2 Capacitance-Ratio-Equivalent Power Combiner A Low-Jitter Fractional-N LC-PLL With a 1/4 DTC-Range-Reduction Technique A 23–28-GHz Doherty Power Amplifier With a PVT Insensitive Power Detection for Adaptive Biasing
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1