IEEE Solid-State Circuits Letters: Latest Articles

An Approximate Digital CIM Macro With Low-Power Multiply-Add Units and Dynamic Sparse-Adaptive Configuring for Edge AI Inference
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-01-12 DOI: 10.1109/LSSC.2026.3652570
Xiaofeng Li;Yi Zhan;Purui Zhu;Rui Zhou;Jiayin Song;Heng You;Yumei Zhou;Shushan Qiao
This letter presents an approximate digital compute-in-memory (CIM) macro for low-power edge AI inference. It introduces three hierarchical innovations: 1) novel fused approximate multiply-add units (FAMUs) that reduce power and area consumption; 2) a bit-critical weight allocation architecture that optimally balances accuracy and hardware cost; and 3) a dynamic sparsity-adaptive configuration method that minimizes accuracy loss in real time. The macro achieves an energy efficiency of 60.35 TOPS/W and an area efficiency of 1105 GOPS/mm² for INT8 MACs, outperforming prior works. It attains negligible accuracy degradation on multiple mainstream datasets and is well suited to edge AI inference.
IEEE Solid-State Circuits Letters, vol. 9, pp. 45-48
Citations: 0
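As a rough aid to reading the headline figures in the abstract above, the sketch below converts the quoted 60.35 TOPS/W into energy per operation and divides the quoted area efficiency by the energy efficiency to get an implied power density. The function names are hypothetical, and the power-density figure assumes both numbers were measured at the same operating point, which the abstract does not state.

```python
def energy_per_op_fj(tops_per_watt: float) -> float:
    """Energy per operation in femtojoules for an efficiency given in TOPS/W."""
    ops_per_joule = tops_per_watt * 1e12  # 1 TOPS/W equals 1e12 operations per joule
    return 1e15 / ops_per_joule           # joules per operation, expressed in fJ


def implied_power_density_mw_per_mm2(tops_per_watt: float, gops_per_mm2: float) -> float:
    """Power density implied by dividing area efficiency by energy efficiency.

    Assumption (not from the abstract): both figures share one operating point.
    Units: (GOPS/mm^2) / (TOPS/W) = 1e-3 W/mm^2 = mW/mm^2.
    """
    return gops_per_mm2 / tops_per_watt


if __name__ == "__main__":
    print(f"{energy_per_op_fj(60.35):.2f} fJ per operation")
    print(f"{implied_power_density_mw_per_mm2(60.35, 1105):.2f} mW/mm^2 (implied)")
```

How a MAC is counted in operations is not given in the abstract, so the per-operation energy should be read with that convention left open.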
A Standalone-in-Memory Voltage Crossover-Based Assist Switching Circuit for Reliable and Efficient Process Tracking Memory Vmin Improvement in Intel 18A-RibbonFET Technology
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-01-12 DOI: 10.1109/LSSC.2026.3652110
Saroj Satapathy;Amlan Ghosh;John Riley;Jalal Quadri;Jaydeep Kulkarni;Feroze Merchant
Advanced CMOS memory requires voltage-biasing assist techniques to achieve a low minimum operating voltage (Vmin); these assists must be deactivated at higher voltages to preserve reliability under high electric fields. Centralized power management unit (PMU) control signals face timing synchronization and process tracking challenges when distributed across cores to activate assist circuits in various static random access memory arrays, limiting their effectiveness. This restricts the power and area benefits that could be gained from an in-situ, memory circuit-assist enable/disable mechanism. To address this, we propose a novel voltage crossover-based memory assist switching circuit implemented in Intel 18A-RibbonFET technology featuring backside power delivery. Its $150\times$ area efficiency enables independent placement within memory blocks, offering 19% array-level power and 17% performance improvements. Silicon measurements show tight variation control and strong simulation correlation.
IEEE Solid-State Circuits Letters, vol. 9, pp. 37-40
Citations: 0
A Cryo-CMOS Smart Temperature Sensor for the Ultrawide Temperature Range From 5 K to 296 K
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-01-02 DOI: 10.1109/LSSC.2025.3650657
D. Cerviño Fungueiriño;L. A. Enthoven;J. van Staveren;M. Babaie;F. Sebastiano
This work presents a cryo-CMOS smart temperature sensor operating from room temperature down to 5 K. By adopting sensing elements (CMOS bulk diodes, pMOS/DTMOS in weak inversion) that circumvent the poor cryogenic performance of Si BJTs, a robust switched-capacitor second-order sigma–delta readout, and cryogenic-aware design techniques, the sensor achieves a maximum error of ±0.73 K (four samples and two-point trim), a resolution below 0.05 K for a 102.4-ms readout duration, and a power consumption of $15.5~\mu\text{W}$ ($93.5~\mu\text{W}$) at 5 K (296 K).
IEEE Solid-State Circuits Letters, vol. 9, pp. 29-32
Citations: 0
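The power and readout-time figures in the abstract above can be combined into an energy per readout, and from there into the resolution figure of merit commonly used for temperature sensors (energy per conversion times resolution squared). The sketch below is only a back-of-the-envelope reading: the function names are hypothetical, and it assumes the 102.4-ms readout and the 0.05 K resolution bound apply at both 5 K and 296 K, which the abstract does not confirm.

```python
def energy_per_conversion_uj(power_uw: float, conversion_time_ms: float) -> float:
    """Energy per readout in microjoules (uW * ms gives nJ, then nJ -> uJ)."""
    return power_uw * conversion_time_ms * 1e-3


def resolution_fom_nj_k2(energy_uj: float, resolution_k: float) -> float:
    """Energy per conversion times resolution squared, in nJ*K^2."""
    return energy_uj * 1e3 * resolution_k ** 2


if __name__ == "__main__":
    for label, power_uw in (("5 K", 15.5), ("296 K", 93.5)):
        energy = energy_per_conversion_uj(power_uw, 102.4)
        # The abstract quotes resolution as "below 0.05 K", so this is an upper bound.
        fom = resolution_fom_nj_k2(energy, 0.05)
        print(f"{label}: {energy:.3f} uJ per readout, FoM <= {fom:.2f} nJ*K^2")
```

Because the resolution is quoted only as an upper bound, the resulting figure of merit is an upper bound as well.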
A 0.015-mm² 0.5-V Synthesizable Hybrid PLL With Multiphase Linear Proportional-Gain Paths
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-29 DOI: 10.1109/LSSC.2025.3649215
Qibang Sun;Liqun Feng;Woogeun Rhee;Hanjun Jiang
This brief presents a 0.015-mm² 0.5-V synthesizable hybrid phase-locked loop (PLL). All blocks, including an analog proportional-gain path, can be logically or physically synthesized with digital cells and hardware description languages. To mitigate the mismatch and common-mode fluctuation problems of a voltage-mode phase detector, multiphase proportional-gain paths are designed for low spur. A 1.2-GHz prototype PLL implemented in 28-nm CMOS achieves a reference spur below −65 dBc under a 0.5-V supply. The proposed PLL occupies a core area of only 0.015 mm² by leveraging the synthesized design.
IEEE Solid-State Circuits Letters, vol. 9, pp. 77-80
Citations: 0
A Broadband and Compact GaN Millimeter-Wave MMIC SPDT Switch Using Modified π-Networks
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-22 DOI: 10.1109/LSSC.2025.3646815
Chaorong Wang;Quan Pan;Xiaohu Fang
This letter presents a design methodology for broadband and compact millimeter-wave (mm-wave) single-pole double-throw (SPDT) switches targeting the Ku–Ka band. Conventional SPDT switches based on quarter-wavelength transmission lines typically occupy significant chip area, while alternative designs utilizing standard $\pi$-type equivalent circuits often suffer from bandwidth degradation due to the parasitic inductance of the isolation path. To overcome these limitations, a novel SPDT architecture based on modified $\pi$-networks is proposed. This approach incorporates the parasitic inductance of the isolation path into the $\pi$-network design, effectively enhancing the bandwidth performance without increasing circuit complexity. For validation, an SPDT switch was implemented using a 0.15-$\mu$m GaN MMIC process, covering the 10–28 GHz band. Measurement results confirm that the switch achieves an insertion loss below 2.1 dB, a return loss better than 12 dB, and isolation greater than 42 dB, with a compact core chip area of only 0.62 mm².
IEEE Solid-State Circuits Letters, vol. 9, pp. 21-24
Citations: 0
220 GHz, 8.5-dBm Saturated Output Power Wideband Power Amplifier in SiGe BiCMOS
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-19 DOI: 10.1109/LSSC.2025.3646267
Xun Chen;Jonas Winkelhake;Muh-Dey Wei;Renato Negra
This letter presents a broadband $G$-band power amplifier (PA) designed in a 130-nm silicon-germanium (SiGe) bipolar complementary metal-oxide-semiconductor technology. Rather than relying on dual-band matching or staggered tuning to obtain a large operating bandwidth (BW), this work proposes a common broadband amplification stage, chosen for its flexibility. In each stage, inductive gain peaking is employed at the base terminal of the common-base (CB) transistor for BW extension. Meanwhile, the correlation between BW, inband gain, and gain flatness is studied to determine practical values for the inductance and AC decoupling capacitance. A current mirror with thermal runaway protection is placed close to the amplifier in the implementation of each stage. In measurements, the proposed PA achieves a peak small-signal gain of 27.6 dB over an operating BW from 140 to 238.7 GHz. The saturated output power, $P_{\text{sat}}$, is 8.5 dBm at 220 GHz with a peak power-added efficiency (PAE) of 2.1%. The measured 3-dB $P_{\text{sat}}$ BW ranges from 150 to 220 GHz, a fractional BW of 31.8%. To the best of the authors’ knowledge, the PA demonstrates the largest BW in $G$-band compared to other reported silicon-based PAs to date. Furthermore, the inband gain and output power over BW are easily adjustable based on system requirements due to the flexibility of the common broadband amplification stage.
IEEE Solid-State Circuits Letters, vol. 9, pp. 17-20
Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11304730
Citations: 0
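The abstract above does not say which reference frequency its 31.8% fractional bandwidth is normalised to. The short sketch below works through two common conventions for the quoted 150 to 220 GHz range and also converts the 8.5-dBm saturated power to milliwatts. The function names are hypothetical; the arithmetic suggests the quoted figure matches normalisation to the 220-GHz upper edge, but that reading is an inference, not something the abstract states.

```python
def fractional_bandwidth(f_low_ghz: float, f_high_ghz: float, f_ref_ghz: float) -> float:
    """Bandwidth divided by a chosen reference frequency (dimensionless)."""
    return (f_high_ghz - f_low_ghz) / f_ref_ghz


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


if __name__ == "__main__":
    f_low, f_high = 150.0, 220.0
    centre = (f_low + f_high) / 2  # arithmetic centre frequency, 185 GHz
    print(f"normalised to 220 GHz:  {fractional_bandwidth(f_low, f_high, f_high):.1%}")
    print(f"normalised to centre:   {fractional_bandwidth(f_low, f_high, centre):.1%}")
    print(f"8.5 dBm = {dbm_to_mw(8.5):.2f} mW")
```

When comparing against other published amplifiers, it is worth checking which normalisation each one uses before lining up fractional-bandwidth numbers.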
Actiniaria: Distributed Dynamic-IR-Drop-Aware Timing Monitor for AVFS With Lightweight Tentacles
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-17 DOI: 10.1109/LSSC.2025.3645701
Junyi Qian;Lishuo Deng;Cai Li;Yizhi Ding;Weiwei Shan
Advances in integrated circuit (IC) technology have amplified the effects of process, voltage, and temperature (PVT) variations, particularly dynamic IR drop, which severely affects timing. Post-silicon IR drop monitoring circuits are lacking, forcing designers to reserve substantial static guard bands for worst-case scenarios, compromising energy efficiency. Inspired by biomimetics, this letter proposes an Actiniaria-shaped structure for distributed monitoring of multiple IR drop hotspots without invading critical paths (CPs). It comprises lightweight tentacles and a central monitor. The tentacles use voltage-sensitive cells to sense the impact of IR drop on timing, while also supporting adaptive tuning for global variations. By using on-chip PVT sensors, Actiniaria tracks the timing characteristics of the longest CP in real time under PVT variations. A two-stage prewarning timing monitor captures timing margins and employs adaptive voltage/frequency scaling (AVFS) strategies to compress redundant guard bands. When applied to an open-source multicore RISC-V processor fabricated in 22-nm CMOS, Actiniaria achieves a 40.6% reduction in power consumption through noninvasive dynamic IR drop monitoring with an area overhead of only 0.08%.
IEEE Solid-State Circuits Letters, vol. 9, pp. 25-28
Citations: 0
A 28-nm FeFET Compute-in-Memory Macro With 64×64 Array Size and On-Chip 4-Bit Flash ADC
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-03 DOI: 10.1109/LSSC.2025.3640522
Vaidehi Garg;Jianwei Jia;Omkar Phadke;Shimeng Yu
Compute-in-memory (CIM) using emerging nonvolatile memory devices is a promising candidate for energy-efficient deep neural network (DNN) inference at the edge. Ferroelectric field-effect transistors (FeFETs) have recently gained attention as nonvolatile, CMOS-compatible devices with a higher on/off ratio and lower read and write energy compared to resistive random-access memory (RRAM). This work demonstrates a 4-kb FeFET-CIM macro fabricated in the GlobalFoundries 28-nm high-k metal gate (HKMG) process. The macro consists of a $64\times 64$ FeFET array with peripheral circuits for program, erase, and current-mode CIM operations and eight 4-bit Flash ADCs to quantize the analog partial sums. The proposed design achieves an energy efficiency of 346.6 TOPS/W for $1\times 1$b MAC, an inference accuracy of 85.2% for 16-row parallel compute with 4-bit ADC resolution, and 89.1% for 8-row parallel compute with 3-bit resolution, compared to a software baseline of 89.7% on the VGG-8 model for CIFAR-10.
IEEE Solid-State Circuits Letters, vol. 9, pp. 13-16
Citations: 0
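A few of the numbers in the abstract above can be cross-checked with simple arithmetic: the array capacity, the gap to the software baseline for each configuration, and how many distinct partial-sum values each ADC setting has to resolve. The sketch below assumes one bit per cell and binary (0 or 1) products per row, which is consistent with the quoted 4-kb capacity and 1×1b MAC but is not spelled out in the abstract; all function names are hypothetical.

```python
def array_capacity_bits(rows: int, cols: int, bits_per_cell: int = 1) -> int:
    """Storage capacity of a rows x cols array (assumes one bit per cell by default)."""
    return rows * cols * bits_per_cell


def partial_sum_levels(active_rows: int) -> int:
    """Distinct values of a sum of `active_rows` binary products: 0..active_rows."""
    return active_rows + 1


def adc_codes(bits: int) -> int:
    """Number of output codes of an ideal ADC with the given resolution."""
    return 2 ** bits


def accuracy_gap(baseline_pct: float, measured_pct: float) -> float:
    """Accuracy loss relative to the software baseline, in percentage points."""
    return round(baseline_pct - measured_pct, 1)


if __name__ == "__main__":
    print(array_capacity_bits(64, 64) // 1024, "kb")
    for rows, bits, acc in ((16, 4, 85.2), (8, 3, 89.1)):
        print(f"{rows} rows, {bits}-bit ADC: {partial_sum_levels(rows)} levels "
              f"into {adc_codes(bits)} codes, gap {accuracy_gap(89.7, acc)} points")
```

The sketch only counts levels and codes; it does not model device variation or ADC nonideality, so it should not be read as an explanation of the measured accuracy difference.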
A 500 MS/s Robust 2b/cycle Pipelined-SAR ADC Achieving 64.6-dB SNDR and 82.6-dB SFDR With Linearity Enhancement Techniques
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-02 DOI: 10.1109/LSSC.2025.3639322
Qiang Yu;Zheng Zhu;Lulu Zhang;Qin Huang;Yao Feng;Chao Liang;Biao Hu;Ling Du;Rongbin Yang;Shuangyi Wu;Qiang Li
This letter presents a 14-bit 500-MS/s 3-stage pipelined successive approximation register (SAR) analog-to-digital converter (ADC). By exploiting robust 2b/cycle SAR ADCs, this ADC incorporates significant voltage and time redundancy. High SFDR is achieved through several linearity enhancement techniques. First, a DAC splitting technique addresses the common-mode voltage matching problem between the input buffer and the sampling circuit. Second, a reference charge neutralization minimizes reference ripple. Finally, a digital harmonic correction is realized with a low-cost and low-latency LUT. Fabricated in a 28-nm CMOS process, the prototype ADC achieves 64.6-dB SNDR and 82.6-dB SFDR at Nyquist.
IEEE Solid-State Circuits Letters, vol. 9, pp. 9-12
Citations: 0
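To relate the quoted SNDR to the converter's nominal 14-bit resolution, the standard ideal-quantizer relation SNDR = 6.02 × ENOB + 1.76 dB can be inverted to give an effective number of bits. The sketch below applies it to the 64.6-dB figure from the abstract above; the function names are hypothetical, and the Nyquist input frequency is simply half the quoted 500-MS/s sample rate.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR, using SNDR = 6.02 * ENOB + 1.76 dB."""
    return (sndr_db - 1.76) / 6.02


def nyquist_frequency_mhz(sample_rate_msps: float) -> float:
    """Nyquist frequency (half the sample rate) in MHz."""
    return sample_rate_msps / 2


if __name__ == "__main__":
    print(f"ENOB at Nyquist: {enob_from_sndr(64.6):.2f} bits")
    print(f"Nyquist input frequency: {nyquist_frequency_mhz(500):.0f} MHz")
```

The gap between the nominal resolution and the effective number of bits is typical for high-speed converters, where noise and distortion rather than quantization set the SNDR.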
A 28-nm Digital Compute-in-Memory Ising Annealer With Asynchronous Random Number Generator for Traveling Salesman Problem
IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-12-01 DOI: 10.1109/LSSC.2025.3639178
Yuyao Kong;Haomei Liu;Vaidehi Garg;Shimeng Yu
This work presents a compact digital compute-in-memory (DCIM) Ising annealer targeting large-scale combinatorial optimization. A centroid-based weight mapping method combined with hierarchical clustering reduces the memory capacity required for traveling salesman problem (TSP) weights, enabling efficient mapping with limited on-chip storage. An asynchronous random number generator (ARNG) based on a dual ring oscillator provides high-quality randomness with tunable probability bias while incurring much smaller hardware overhead than conventional linear feedback shift registers (LFSRs). The proposed architecture was fabricated in 28-nm CMOS, integrating a DCIM array and the on-chip asynchronous-clock-based ARNG. Measurement results demonstrate annealing on TSP instances with up to 3038 cities. Compared to LFSR-based randomness, the ARNG achieves solution quality closer to the software baseline while maintaining a compact area. This design highlights a scalable and energy-efficient hardware framework for Ising-based optimization, showing clear advantages in both memory efficiency and random source quality over prior approaches.
IEEE Solid-State Circuits Letters, vol. 9, pp. 1-4. DOI: https://doi.org/10.1109/LSSC.2025.3639178
Citations: 0