Pub Date : 2025-12-03 DOI: 10.1109/LSSC.2025.3640522
Vaidehi Garg;Jianwei Jia;Omkar Phadke;Shimeng Yu
Compute-in-memory (CIM) using emerging nonvolatile memory devices is a promising candidate for energy-efficient deep neural network (DNN) inference at the edge. Ferroelectric field-effect transistors (FeFETs) have recently gained attention as nonvolatile, CMOS-compatible devices with a higher on/off ratio and lower read and write energy compared to resistive random-access memory (RRAM). This work demonstrates a 4-kb FeFET-CIM macro fabricated in the GlobalFoundries 28-nm high-k metal gate (HKMG) process. The macro consists of a $64\times 64$ FeFET array with peripheral circuits for program, erase, and current-mode CIM operations and eight 4-bit Flash ADCs to quantize the analog partial sums. The proposed design achieves an energy efficiency of 346.6 TOPS/W for $1\times 1$ b MAC, an inference accuracy of 85.2% for 16-row parallel compute with 4-bit ADC resolution, and 89.1% with 8-row parallel compute with 3-bit resolution, compared to a software baseline of 89.7% on the VGG-8 model for CIFAR-10.
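The abstract above pairs 16-row parallelism with 4-bit ADC resolution and 8-row parallelism with 3-bit resolution. The following is a minimal sketch of one idealized column read, not the authors' circuit: it assumes binary inputs and weights, one unit of bitline current per conducting cell, and a simple clipping quantizer. The function names are illustrative. Under these assumptions, N active rows can produce N + 1 distinct sums, one more than the 2^bits codes available at either reported operating point, so only the all-ones case saturates.

```python
# Idealized model of one current-mode CIM column read (illustrative only).
# Assumptions: binary inputs and weights (1x1-b MAC), each conducting cell adds
# one unit of bitline current, and the ADC is a clipping uniform quantizer.
# The real macro's ADC reference placement is not given in the abstract.


def column_partial_sum(inputs, weights):
    """Count the rows where both the input bit and the stored weight bit are 1."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x & w for x, w in zip(inputs, weights))


def adc_quantize(partial_sum, adc_bits):
    """Clip the partial sum to the largest code an adc_bits-wide ADC can emit."""
    max_code = (1 << adc_bits) - 1
    return min(partial_sum, max_code)


def cim_column_read(inputs, weights, adc_bits):
    """Analog accumulation followed by quantization, for one column."""
    return adc_quantize(column_partial_sum(inputs, weights), adc_bits)


if __name__ == "__main__":
    ones16 = [1] * 16
    ones8 = [1] * 8
    # Worst case for each operating point named in the abstract.
    print("16 rows, 4-bit ADC:", cim_column_read(ones16, ones16, adc_bits=4))
    print(" 8 rows, 3-bit ADC:", cim_column_read(ones8, ones8, adc_bits=3))
```

In this toy model the quantization loss is confined to a single input pattern per operating point; the accuracy gap reported in the abstract presumably also reflects analog non-idealities that the sketch does not capture.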
{"title":"A 28-nm FeFET Compute-in-Memory Macro With 64×64 Array Size and On-Chip 4-Bit Flash ADC","authors":"Vaidehi Garg;Jianwei Jia;Omkar Phadke;Shimeng Yu","doi":"10.1109/LSSC.2025.3640522","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3640522","url":null,"abstract":"Compute-in-memory (CIM) using emerging nonvolatile memory devices is a promising candidate for energy-efficient deep neural network (DNN) inference at the edge. Ferroelectric field-effect transistors (FeFETs) have recently gained attention as nonvolatile, CMOS-compatible devices with a higher on/off ratio and lower read and write energy compared to resistive random-access memory (RRAM). This work demonstrates a 4-kb FeFET-CIM macro fabricated in the GlobalFoundries 28-nm high-k metal gate (HKMG) process. The macro consists of a <inline-formula> <tex-math>$64times 64$ </tex-math></inline-formula> FeFET array with peripheral circuits for program, erase, and current-mode CIM operations and eight 4-bit Flash ADCs to quantize the analog partial sums. The proposed design achieves an energy efficiency of 346.6 TOPS/W for <inline-formula> <tex-math>$1times 1$ </tex-math></inline-formula>b MAC, an inference accuracy of 85.2% for 16 row parallel compute with 4-bit ADC resolution, and 89.1% with 8 row parallel compute with 3-bit resolution, compared to a software baseline of 89.7% on the VGG-8 model for CIFAR-10.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"9 ","pages":"13-16"},"PeriodicalIF":2.0,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This letter presents a 14-bit 500-MS/s 3-stage pipelined successive approximation register (SAR) analog-to-digital converter (ADC). By exploiting robust 2b/cycle SAR ADCs, this ADC incorporates significant voltage and time redundancy. High SFDR is achieved through several linearity enhancement techniques. First, a DAC splitting technique addresses the common-mode voltage matching problem between the input buffer and the sampling circuit. Second, a reference charge neutralization minimizes reference ripple. Finally, a digital harmonic correction is realized with a low-cost and low-latency LUT. Fabricated in a 28-nm CMOS process, the prototype ADC achieves 64.6-dB SNDR and 82.6-dB SFDR at Nyquist.
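To put the reported SNDR in terms of resolution, the standard relation ENOB = (SNDR - 1.76 dB) / 6.02 dB can be applied. The sketch below does that arithmetic for the figures in the abstract above; the function names are illustrative, and the sample-rate helper simply restates that the Nyquist input frequency is half the sampling rate.

```python
# Convert a reported SNDR into effective number of bits (ENOB) using the
# standard full-scale sine-wave relation SNDR = 6.02 * ENOB + 1.76 dB.


def enob_from_sndr(sndr_db):
    """Effective number of bits for a given SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def nyquist_frequency_hz(sample_rate_hz):
    """Highest input frequency representable without aliasing."""
    return sample_rate_hz / 2.0


if __name__ == "__main__":
    enob = enob_from_sndr(64.6)
    print(f"ENOB at Nyquist: {enob:.2f} bits")
    print(f"Nyquist input frequency: {nyquist_frequency_hz(500e6) / 1e6:.0f} MHz")
```

With 64.6-dB SNDR this gives roughly 10.4 effective bits out of the nominal 14, evaluated at an input near 250 MHz.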
{"title":"A 500 MS/s Robust 2b/cycle Pipelined-SAR ADC Achieving 64.6-dB SNDR and 82.6-dB SFDR With Linearity Enhancement Techniques","authors":"Qiang Yu;Zheng Zhu;Lulu Zhang;Qin Huang;Yao Feng;Chao Liang;Biao Hu;Ling Du;Rongbin Yang;Shuangyi Wu;Qiang Li","doi":"10.1109/LSSC.2025.3639322","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3639322","url":null,"abstract":"This letter presents a 14-bit 500-MS/s 3-stage pipelined successive approximation register (SAR) analog-to-digital converter (ADC). By exploiting robust 2b/cycle SAR ADCs, this ADC incorporates significant voltage and time redundancy. High SFDR is achieved through several linearity enhancement techniques. First, a DAC splitting technique addresses the common-mode voltage matching problem between the input buffer and the sampling circuit. Second, a reference charge neutralization minimizes reference ripple. Finally, a digital harmonic correction is realized with a low-cost and low-latency LUT. Fabricated in a 28-nm CMOS process, the prototype ADC achieves 64.6-dB SNDR and 82.6-dB SFDR at Nyquist.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"9 ","pages":"9-12"},"PeriodicalIF":2.0,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-01 DOI: 10.1109/LSSC.2025.3639178
Yuyao Kong;Haomei Liu;Vaidehi Garg;Shimeng Yu
This work presents a compact digital compute-in-memory (DCIM) Ising annealer targeting large-scale combinatorial optimization. A centroid-based weight mapping method combined with hierarchical clustering reduces the memory capacity required for traveling salesman problem (TSP) weights, enabling efficient mapping with limited on-chip storage. An asynchronous random number generator (ARNG) based on a dual ring oscillator provides high-quality randomness with tunable probability bias while incurring much smaller hardware overhead than conventional linear feedback shift registers (LFSRs). The proposed architecture was fabricated in 28-nm CMOS, integrating a DCIM array and the on-chip ARNG. Measurement results demonstrate annealing on TSP instances of up to 3038 cities. Compared to LFSR-based randomness, the ARNG achieves solution quality closer to the software baseline while maintaining a compact area. This design highlights a scalable and energy-efficient hardware framework for Ising-based optimization, showing clear advantages in both memory efficiency and random source quality over prior approaches.
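For readers unfamiliar with the LFSR baseline that the abstract above compares against, here is a minimal sketch of a textbook 16-bit Fibonacci LFSR. The width, tap positions (16, 14, 13, 11), and seed are assumptions chosen because they form a well-known maximal-length configuration; the abstract does not say which LFSR the authors used, and the sketch does not model the ARNG itself.

```python
# Textbook 16-bit Fibonacci LFSR, shown only to illustrate the conventional
# pseudo-random source named in the abstract. Taps 16, 14, 13, 11 and the
# seed 0xACE1 are assumptions, not details taken from the paper.


def lfsr16_step(state):
    """Advance the register by one bit and return the new 16-bit state."""
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (feedback << 15)


def lfsr16_period(seed=0xACE1):
    """Count the steps until the register returns to its seed state."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


if __name__ == "__main__":
    print("sequence length before repeating:", lfsr16_period())
```

The sequence is fully deterministic and repeats after a fixed number of steps, which is one reason a physical entropy source such as a ring-oscillator-based generator can be attractive for annealing.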
{"title":"A 28-nm Digital Compute-in-Memory Ising Annealer With Asynchronous Random Number Generator for Traveling Salesman Problem","authors":"Yuyao Kong;Haomei Liu;Vaidehi Garg;Shimeng Yu","doi":"10.1109/LSSC.2025.3639178","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3639178","url":null,"abstract":"This work presents a compact digital compute-in-memory (DCIM) Ising annealer targeting large-scale combinatorial optimization. A centroid-based weight mapping method combined with hierarchical clustering reduces the memory capacity required for traveling salesman problem (TSP) weights, enabling efficient mapping with limited on-chip storage. An asynchronous random number generator (ARNG) based on dual ring oscillator provides high-quality randomness with tunable probability bias while incurring much smaller hardware overhead than conventional linear feedback shift registers (LFSRs). The proposed architecture was fabricated in 28-nm CMOS, integrating a DCIM array and an on-chip asynchronous-clock-based random number generator (ARNG). Measurement results demonstrate annealing on TSP problems up to 3038 cities. Compared to LFSR-based randomness, the ARNG achieves solution quality closer to the software baseline while maintaining compact area. 
This design highlights a scalable and energy-efficient hardware framework for Ising-based optimization, showing clear advantages in both memory efficiency and random source quality over prior approaches.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"9 ","pages":"1-4"},"PeriodicalIF":2.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145705914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-28 DOI: 10.1109/LSSC.2025.3637833
Huanan Guo;Yufeng Yao;Xiang Gao
To reduce the bit-error-rate (BER), equalizers are implemented in high-speed SerDes receivers (RX) to compensate for channel insertion loss and mitigate intersymbol interference (ISI). Conventional analog front-end (AFE) designs primarily focus on amplitude gain while neglecting the influence of phase shift. This brief presents a phase equalization (PEQ) AFE design in a 7-nm FinFET 112 Gb/s DSP-based RX, which reduces the nonlinear phase shift within the Nyquist frequency $(f_{\mathrm{nyq}})$. The proposed PEQ compensates for the phase distortion and helps to achieve a pre-DSP eye opening of 0.17 UI and 47 mV over a 20.7 dB loss channel. The total RX demonstrates a BER less than 3e-7 with 1-tap DFE and 18-tap FFE over a 42.4 dB loss channel, achieving a power efficiency of 2.18 pJ/bit excluding the DSP power.
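The link figures in the abstract above imply a few derived quantities that are easy to check. The sketch below assumes ideal PAM4 signaling with 2 bits per symbol (the paper title names PAM4) and takes the Nyquist frequency as half the symbol rate; the function and key names are illustrative.

```python
# Derived link quantities for a PAM4 receiver, from the abstract's figures.
# Assumption: 2 bits per PAM4 symbol and f_nyq = baud rate / 2.

BITS_PER_PAM4_SYMBOL = 2


def pam4_link_numbers(bit_rate_bps, energy_per_bit_j, eye_width_ui):
    """Return symbol rate, Nyquist frequency, UI, eye width, and power."""
    baud = bit_rate_bps / BITS_PER_PAM4_SYMBOL
    unit_interval_s = 1.0 / baud
    return {
        "baud_rate_hz": baud,
        "nyquist_hz": baud / 2.0,
        "unit_interval_s": unit_interval_s,
        "eye_width_s": eye_width_ui * unit_interval_s,
        "power_w": energy_per_bit_j * bit_rate_bps,
    }


if __name__ == "__main__":
    n = pam4_link_numbers(112e9, 2.18e-12, 0.17)
    print(f"symbol rate : {n['baud_rate_hz'] / 1e9:.0f} GBaud")
    print(f"f_nyq       : {n['nyquist_hz'] / 1e9:.0f} GHz")
    print(f"UI          : {n['unit_interval_s'] * 1e12:.2f} ps")
    print(f"eye width   : {n['eye_width_s'] * 1e12:.2f} ps")
    print(f"RX power    : {n['power_w'] * 1e3:.1f} mW (excluding DSP)")
```

Under these assumptions the 0.17-UI eye corresponds to roughly 3 ps of horizontal opening, and 2.18 pJ/bit at 112 Gb/s corresponds to about 244 mW.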
{"title":"A 112-Gb/s PAM4 Receiver With a Phase Equalization AFE in 7-nm FinFET","authors":"Huanan Guo;Yufeng Yao;Xiang Gao","doi":"10.1109/LSSC.2025.3637833","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3637833","url":null,"abstract":"To reduce the bit-error-rate (BER), equalizers are implemented in high-speed SerDes receivers (RX) to compensate for channel insertion loss and mitigate intersymbol interference (ISI). Conventional analog front-end (AFE) designs primarily focus on amplitude gain while neglecting the influence of phase shift. This brief presents a phase equalization (PEQ) AFE design in a 7-nm FinFET 112 Gb/s DSP-based RX, which reduces the nonlinear phase shift within the Nyquist frequency <inline-formula> <tex-math>$(f_{mathrm { nyq}})$ </tex-math></inline-formula>. The proposed PEQ compensates for the phase distortion and helps to achieve a pre-DSP eye opening of 0.17 UI and 47 mV over a 20.7 dB loss channel. The total RX demonstrates a BER less than 3e-7 with 1-tap DFE and 18-tap FFE over a 42.4 dB loss channel, achieving a power efficiency of 2.18 pJ/bit excluding the DSP power.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"9 ","pages":"5-8"},"PeriodicalIF":2.0,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145705916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-21 DOI: 10.1109/LSSC.2025.3636192
Suyash Shrivastava;Pydi Ganga Bahubalindruni
This manuscript presents an experimental characterization of a novel high-speed D flip-flop (D-FF). The circuit was fabricated on a $27~\mu$m-thick flexible polyimide substrate using an nMOS-only, single-gate amorphous indium-gallium-zinc-oxide (a-IGZO) thin-film transistor (TFT) technology. Measurements showed a reliable D-FF response up to clock and input signal frequencies of 20 MHz and 2 MHz, respectively. Further, with an output voltage swing of 60%, D-FF functionality was observed up to clock and input signal frequencies of 25 MHz and 3 MHz, respectively. The circuit dissipates $140~\mu$W including buffers and achieves a figure-of-merit (FOM) of 142 MHz/mW, almost a 52% improvement over the state of the art. In addition, this D-FF is employed in the implementation of an 11-bit up/down (U/D) counter. The U/D counter operates reliably up to 8 MHz with a power consumption of 4.8 mW. Both circuits were characterized at a low supply voltage of 3 V, occupying active areas of 0.144 mm$^2$ and 4.32 mm$^2$, respectively. These circuits could find application in biomedical wearable devices.
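Two details of the abstract above lend themselves to a quick sketch. First, the reported FOM is reproduced if it is taken as clock frequency divided by power, which is an assumption here since the abstract does not define it. Second, the 11-bit up/down counter can be modeled behaviorally; the wraparound at the ends of the range is also an assumption, and the class and function names are illustrative.

```python
# Behavioral sketch only: it models counting, not the TFT circuit.
# Assumptions: FOM = clock frequency / power, and the counter wraps
# around modulo 2**bits at both ends of its range.


def dff_fom_mhz_per_mw(clock_mhz, power_mw):
    """Speed-per-power figure of merit in MHz/mW (assumed definition)."""
    return clock_mhz / power_mw


class UpDownCounter:
    """N-bit synchronous up/down counter with modulo wraparound."""

    def __init__(self, bits=11):
        self.bits = bits
        self.modulus = 1 << bits
        self.value = 0

    def clock(self, up):
        """Apply one clock edge; count up if `up` is true, else down."""
        step = 1 if up else -1
        self.value = (self.value + step) % self.modulus
        return self.value


if __name__ == "__main__":
    # 20 MHz clock and 140 uW (0.140 mW) from the abstract.
    print(f"FOM: {dff_fom_mhz_per_mw(20.0, 0.140):.1f} MHz/mW")
    counter = UpDownCounter(bits=11)
    print("down from 0 :", counter.clock(up=False))
    print("up again    :", counter.clock(up=True))
```

The computed ratio is about 142.9 MHz/mW, consistent with the 142 MHz/mW quoted in the abstract.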
{"title":"A High-Speed D-FF and a 11-Bit Up-Down Counter Using Unipolar Oxide TFTs on a Flexible Foil","authors":"Suyash Shrivastava;Pydi Ganga Bahubalindruni","doi":"10.1109/LSSC.2025.3636192","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3636192","url":null,"abstract":"This manuscript presents an experimental characterization of a novel high speed D flip-flop (D-FF). The circuit was fabricated on a <inline-formula> <tex-math>$27mu $ </tex-math></inline-formula>m thick flexible polyimide substrate using a nMOS only, single gate amorphous Indium-Gallium-Zinc-Oxide (a-IGZO) thin-film transistor (TFT) technology. Reliable response of the D-FF was noticed from measurements up to a clock and an input signal frequency of <inline-formula> <tex-math>$20{mathrm {,}}$ </tex-math></inline-formula>MHz and <inline-formula> <tex-math>$2{mathrm {,}}$ </tex-math></inline-formula>MHz, respectively. Further, with an output voltage swing of 60%, D-FF functionality was observed up to clock and input signal frequencies of <inline-formula> <tex-math>$25{mathrm {,}}$ </tex-math></inline-formula>MHz and <inline-formula> <tex-math>$3{mathrm {,}}$ </tex-math></inline-formula>MHz, respectively. This circuit has shown a power dissipation of <inline-formula> <tex-math>$140{mathrm {,}} mu $ </tex-math></inline-formula>W including buffers and the figure-of-merit (FOM) of <inline-formula> <tex-math>$142{mathrm {,}}$ </tex-math></inline-formula>MHz/mW, which is almost a 52% improvement compared to the state-of-the-art. In addition, this D-FF is employed in the implementation of an 11-bit up/down (U/D) counter. The U/D counter has shown a reliable operation up to an operating frequency of <inline-formula> <tex-math>$8{mathrm {,}}$ </tex-math></inline-formula>MHz with a power consumption of <inline-formula> <tex-math>$4.8{mathrm {,}}$ </tex-math></inline-formula>mW. 
Both circuits were characterized at a low supply voltage of <inline-formula> <tex-math>$3{mathrm {,}}$ </tex-math></inline-formula>V, occupying an active area of <inline-formula> <tex-math>$0.144{mathrm {,}}$ </tex-math></inline-formula>mm2 and <inline-formula> <tex-math>$4.32{mathrm {,}}$ </tex-math></inline-formula>mm2, respectively. These circuits would find potential application in biomedical wearable devices.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"373-376"},"PeriodicalIF":2.0,"publicationDate":"2025-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-18 DOI: 10.1109/LSSC.2025.3634340
Simon Foster;Scott Block;Daniel McMitchell;Edgar Colin-Beltran;Robert Dailey
A cell monitoring system for performance and safety enhancement is presented. It is the first commercially available single-chip-on-cell near-field contactless solution for automotive battery management, simplifying pack interconnect and reducing points of failure. This letter is a companion paper to the earlier ISSCC paper. It provides further details on the benefits of this architecture, including earlier thermal runaway detection capabilities. It also presents further data on the robustness of the near-field communication link in the presence of common and differential mode interference.
{"title":"Advancing On-Cell Near-Field Monitoring for Thermal Runaway Detection in EV Batteries","authors":"Simon Foster;Scott Block;Daniel McMitchell;Edgar Colin-Beltran;Robert Dailey","doi":"10.1109/LSSC.2025.3634340","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3634340","url":null,"abstract":"A cell monitoring system for performance and safety enhancement is presented. It is the first commercially available single-chip-on-cell near-field contactless solution for automotive battery management, simplifying pack interconnect and reducing points of failure. This letter is a companion paper to the earlier ISSCC paper. It provides further details on the benefits of this architecture, including earlier thermal runaway detection capabilities. It also presents further data on the robustness of the near-field communication link in the presence of common and differential mode interference.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"369-372"},"PeriodicalIF":2.0,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11257864","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-03 DOI: 10.1109/LSSC.2025.3628176
Harold Pilo;Manish Arora;Chien-An Lai;Mike Lee;Zack Lo;Mayur Randeria
A high-density (HD), SRAM-based register file (RF) has been demonstrated in Intel 18A Technology (Wang et al., 2025 and Pilo et al., 2025) featuring RibbonFET GAA transistors and a back side power delivery network (BSDPN). The RF is optimized for HD and array efficiency and achieves a density of 37.8 Mb/mm$^2$, the highest density reported to date for an RF in the most advanced technology nodes (Wang et al., 2025, Chang et al., 2025, and Pilo et al., 2025). It is implemented with a conventional bitline (BL), two-bank memory architecture, and it can be used as the SRAM workhorse for most SoC applications with a maximum bit count of 262 Kb.
{"title":"A 37.8 Mb/mm² SRAM in Intel 18A Technology Featuring a Resistive Supply-Line Write Scheme and Write-Assist With Parallel Boost Injection","authors":"Harold Pilo;Manish Arora;Chien-An Lai;Mike Lee;Zack Lo;Mayur Randeria","doi":"10.1109/LSSC.2025.3628176","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3628176","url":null,"abstract":"A high-density (HD), SRAM-based register file (RF) has been demonstrated in Intel 18A Technology (Wang et al., 2025 and Pilo et al., 2025) featuring RibbonFET GAA transistors and a back side power delivery network (BSDPN). The RF is optimized for HD and array efficiency and achieves a density of 37.8 Mb/mm2, the highest density reported to date for an RF in the most advanced technology nodes (Wang et al., 2025, Chang et al., 2025, and Pilo et al., 2025). It is implemented with a conventional bitline (BL), two-bank memory architecture and it can be used as the SRAM workhorse for most SoC applications with maximum bit-count of 262Kb.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"357-360"},"PeriodicalIF":2.0,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This letter presents a CMOS image sensor (CIS) that integrates two operation modes: 1) a high-resolution viewing mode with $0.8~\mu$m 32 Mpixels and 2) a low-power always-on object recognition mode consuming 2.67 mW at 10 frames/s. The CIS features a unique windmill-pattern analog edge extraction circuit that is resilient to illumination variations. An on-chip deep neural network processor was implemented alongside a compact algorithm with only 12 kB for coefficients and 48 kB for working memory. The design incorporates separate circuit areas for high-speed viewing and low-power sensing modes, thereby ensuring optimal performance and energy efficiency.
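The always-on figures in the abstract above translate directly into an energy budget per frame and a total memory footprint for the recognition algorithm. The sketch below is plain arithmetic on those figures; treating 1 kB as 1024 bytes is an assumption, and the function names are illustrative.

```python
# Arithmetic on the always-on mode figures quoted in the abstract.
# Assumption: 1 kB = 1024 bytes (the abstract does not say which convention).


def energy_per_frame_j(power_w, frames_per_s):
    """Average energy spent per frame at a given power and frame rate."""
    return power_w / frames_per_s


def dnn_memory_bytes(coeff_kb, working_kb, bytes_per_kb=1024):
    """Total on-chip memory for coefficients plus working storage."""
    return (coeff_kb + working_kb) * bytes_per_kb


if __name__ == "__main__":
    e = energy_per_frame_j(2.67e-3, 10)
    print(f"energy per frame: {e * 1e6:.0f} uJ")
    print(f"DNN memory      : {dnn_memory_bytes(12, 48)} bytes")
```

At 2.67 mW and 10 frames/s the sensor spends about 267 µJ per frame in the always-on mode.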
{"title":"A 0.8-μm 32-Mpixel Always-On CMOS Image Sensor With Windmill-Pattern Edge Extraction and On-Chip DNN","authors":"Mamoru Sato;Sachio Akebono;Kazuyoshi Yasuoka;Eriko Kato;Masahiro Tsuruta;Chiaki Takano;Kensuke Ota;Kazuki Haraguchi;Masahiro Watanabe;Genki Fujii;Koichiro Yamanaka;Kazunori Yasuda;Satoshi Minami;Katsuhiko Hanzawa;Kohei Matsuda;Akihiko Kato;Yosuke Ueno","doi":"10.1109/LSSC.2025.3628314","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3628314","url":null,"abstract":"This letter presents a CMOS image sensor (CIS) that integrates two operation modes: 1) a high-resolution viewing mode with <inline-formula> <tex-math>$0.8~mu $ </tex-math></inline-formula>m 32 Mpixels and 2) a low-power always-on object recognition mode consuming 2.67 mW at 10 frames/s. The CIS features a unique windmill-pattern analog edge extraction circuit that is resilient to illumination variations. An on-chip deep neural network processor was implemented alongside a compact algorithm with only 12 kB for coefficients and 48 kB for working memory. The design incorporates separate circuit areas for high-speed viewing and low-power sensing modes, thereby ensuring optimal performance and energy efficiency.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"353-356"},"PeriodicalIF":2.0,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This letter presents an ultralow-noise, power-efficient, and pulse-width modulation (PWM)-dimming-tolerant photocurrent readout circuit for under-display ambient light sensors (ALS). A transimpedance amplifier (TIA) with a feedback diode achieves G$\Omega$-level resistance and 6 fA/$\surd$Hz input current noise, enabling sub-pA resolution. Instability and noise folding are mitigated at low power through a signal-dependent auto-tracking zero for frequency compensation and a low-pass filter for high-frequency noise suppression. A 2-point calibration algorithm improves the linearity of the current-to-frequency (I2F) quantizer without incurring additional power. To extract ambient light in the presence of PWM dimming interference, the readout supports under-sampling, allowing digital cancellation of interference. Fabricated in 180 nm CMOS, the prototype achieves $0.36~\mathrm{pA_{pp}}$ resolution, 146.3 dB dynamic range (DR), and a 206.3 dB $\mathrm{FoM_{DR}}$ in 0.84 ms readout time, representing best-in-class performance. Optical ALS measurements under PWM dimming interference are experimentally validated.
{"title":"A 560 μ W, 6 fA/√Hz, 146 dB-DR Ultrasensitive Current Readout Circuit for PWM-Dimming-Tolerant Under-Display Ambient Light Sensors","authors":"Xiuzhi Zhao;Liheng Liu;Shi Chen;Tianxiang Qu;Qinjing Pan;Dan Li;Gan Guo;Zhiliang Hong;Jiawei Xu","doi":"10.1109/LSSC.2025.3626378","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3626378","url":null,"abstract":"This letter presents an ultralow-noise, power-efficient, and pulse-width modulation (PWM)-dimming-tolerant photocurrent readout circuit for under-display ambient light sensor (ALS). A transimpedance amplifier (TIA) with a feedback diode achieves G<inline-formula> <tex-math>$Omega $ </tex-math></inline-formula>-level resistance and 6 fA/<inline-formula> <tex-math>$surd $ </tex-math></inline-formula>Hz input current noise, enabling sub-pA resolution. Instability and noise folding are mitigated at low power through a signal-dependent auto-tracking zero for frequency compensation and a low-pass filter for high-frequency noise suppression. A 2-point calibration algorithm improves the linearity of the current-to-frequency (I2F) quantizer without incurring additional power. To extract ambient light in the presence of PWM dimming interference, the readout supports under-sampling, allowing digital cancellation of interference. Fabricated in 180 nm CMOS, the prototype achieves <inline-formula> <tex-math>$0.36~rm pA_{pp}$ </tex-math></inline-formula> resolution, 146.3 dB dynamic range (DR), and a 206.3 dB <inline-formula> <tex-math>$rm FoM_{DR}$ </tex-math></inline-formula> in 0.84 ms readout time, representing best-in-class performance. 
Optical ALS measurements under PWM dimming interference are experimentally validated.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"361-364"},"PeriodicalIF":2.0,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-10-30 DOI: 10.1109/LSSC.2025.3627274
Jakob Finkbeiner;Raphael Nägele;Manuel Wittlinger;Markus Grözing;Manfred Berroth;Georg Rademacher
Analog computing offers intrinsic energy and latency benefits that make it attractive for real-time and edge applications. Conventional analog accelerators suffer from repeated conversions between the analog and digital domains, which degrade efficiency and throughput. We propose an all-analog pipelined neural network accelerator architecture in 22 nm fully-depleted silicon-on-insulator (FD-SOI) complementary metal-oxide-semiconductor (CMOS). Measurements of a demonstrator ASIC with analog I/Os and 6 bit weights are presented. The system energy efficiency is 290 TOPS/W or 558 TOPS/W if the energy for bias generation is neglected. The pipelined architecture achieves a throughput of 500M inferences/s and a latency of 1 ns/layer.
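The efficiency and timing figures in the abstract above can be unpacked with a little arithmetic. The sketch below converts TOPS/W to energy per operation, estimates the share of energy attributable to bias generation from the two reported efficiencies, and separates pipeline throughput from end-to-end latency. The layer count used in the demo is a hypothetical value for illustration, since the abstract does not state it, and the function names are illustrative.

```python
# Arithmetic on the efficiency and timing figures quoted in the abstract.
# 1 TOPS/W equals 1e12 operations per joule.
# NUM_LAYERS_EXAMPLE is hypothetical; the abstract does not give a layer count.

NUM_LAYERS_EXAMPLE = 4


def energy_per_op_j(tops_per_w):
    """Energy per operation in joules for a given TOPS/W figure."""
    return 1.0 / (tops_per_w * 1e12)


def bias_energy_fraction(system_tops_per_w, core_tops_per_w):
    """Share of per-operation energy that disappears when bias is excluded."""
    return 1.0 - system_tops_per_w / core_tops_per_w


def pipeline_timing(throughput_per_s, latency_per_layer_s, num_layers):
    """Interval between results and total latency through the pipeline."""
    return {
        "interval_s": 1.0 / throughput_per_s,
        "total_latency_s": latency_per_layer_s * num_layers,
    }


if __name__ == "__main__":
    print(f"energy/op with bias   : {energy_per_op_j(290) * 1e15:.2f} fJ")
    print(f"energy/op without bias: {energy_per_op_j(558) * 1e15:.2f} fJ")
    print(f"bias share            : {bias_energy_fraction(290, 558):.1%}")
    t = pipeline_timing(500e6, 1e-9, NUM_LAYERS_EXAMPLE)
    print(f"result interval       : {t['interval_s'] * 1e9:.0f} ns")
    print(f"latency ({NUM_LAYERS_EXAMPLE} layers)    : {t['total_latency_s'] * 1e9:.0f} ns")
```

Read this way, the two efficiency figures suggest that bias generation accounts for close to half of the system energy per operation, and that a new result emerges every 2 ns regardless of how many pipeline layers an inference passes through.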
{"title":"PANNA: A 558 TOPS/W Pipelined All-Analog Neural Network Accelerator in 22 nm FD-SOI","authors":"Jakob Finkbeiner;Raphael Nägele;Manuel Wittlinger;Markus Grözing;Manfred Berroth;Georg Rademacher","doi":"10.1109/LSSC.2025.3627274","DOIUrl":"https://doi.org/10.1109/LSSC.2025.3627274","url":null,"abstract":"Analog computing offers intrinsic energy and latency benefits that makes it attractive for real-time and edge applications. Conventional analog accelerators suffer from repeated conversions between analog and digital domain, which degrades efficiency and throughput. We propose an all-analog pipelined neural network accelerator architecture in 22 nm fully-depleted silicon-on-insulator (FD-SOI) complementary metal-oxide-semiconductor (CMOS). Measurements of a demonstrator ASIC with analog I/Os and 6 bit weights are presented. The system energy efficiency is 290 TOPS/W or 558 TOPS/W if the energy for bias generation is neglected. The pipelined architecture achieves a throughput of 500M inferences/s and a latency of 1 ns/layer.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"8 ","pages":"365-368"},"PeriodicalIF":2.0,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}