Pub Date : 2025-12-22 DOI: 10.1109/TVLSI.2025.3643939
Song Wang;Kazuteru Namba
Technology scaling and supply voltage reduction make sequential circuits increasingly vulnerable to single-event transients (SETs) that arrive around clock edges. This work analyzes the sensitive regions of a dual interlocked storage cell (DICE)-based flip-flop (DICEFF) in a 15 nm FinFET process and reveals the correlation between critical charge distribution and SET pulse characteristics. A lightweight fault-tolerant scheme is proposed that integrates delay elements (DEs) with the self-recovery capability of DICE to temporally desynchronize SET pulse arrivals and facilitate self-correction through temporal misalignment. Furthermore, a visualization method based on critical charge distribution is presented to delineate SET tolerance boundaries. HSPICE simulations demonstrate that the proposed method is robust against PVT variations, improving the average critical charge by up to $1.7\times$ over the baseline and reducing the risk window by 47%, while maintaining comparable delay and power efficiency.
{"title":"Analysis of a Delay-Element-Based Technique for Enhancing Soft Error Tolerance at Input Nodes Around Clock Edges","authors":"Song Wang;Kazuteru Namba","doi":"10.1109/TVLSI.2025.3643939","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3643939","url":null,"abstract":"Technology scaling and supply voltage reduction make sequential circuits around clock edges increasingly vulnerable to single-event transients (SETs). This work analyzes the sensitive regions of a dual interlocked storage cell (DICE)-based flip-flop (DICEFF) in a 15 nm FinFET process and reveals the correlation between critical charge distribution and SET pulse characteristics. A lightweight fault-tolerant scheme is proposed that integrates delay elements (DEs) with the self-recovery capability of DICE to temporally desynchronize SET pulse arrivals and facilitate self-correction through temporal misalignment. Furthermore, a visualization method based on critical charge distribution is presented to delineate SET tolerance boundaries. HSPICE simulations demonstrate that the proposed method is robust against PVT variations, improving the average critical charge by up to <inline-formula> <tex-math>$1.7times $ </tex-math></inline-formula> over the baseline and reducing the risk window by 47%, while maintaining comparable delay and power efficiency.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 3","pages":"1017-1028"},"PeriodicalIF":3.1,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147280899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural radiance fields (NeRFs) have transformed 3-D reconstruction and rendering, facilitating photorealistic image synthesis from sparse viewpoints. This work introduces an explicit data reuse neural rendering (EDR-NR) architecture, which reduces frequent external memory accesses (EMAs) and cache misses by exploiting spatial locality at three levels: rays, ray packets (RPs), and samples. The EDR-NR architecture features a four-stage scheduler that clusters rays on the basis of $Z$-order, prioritizes lagging rays when ray divergence happens, reorders RPs based on spatial proximity, and issues samples out of order (OoO) according to the availability of on-chip feature data. In addition, a four-tier hierarchical RP marching (HRM) technique is integrated with an axis-aligned bounding box (AABB) to facilitate spatial skipping (SS), reducing redundant computations and improving throughput. Moreover, a balanced allocation strategy for feature storage is proposed to mitigate SRAM bank conflicts. Fabricated using a 40-nm process with a die area of 10.5 mm$^2$, the EDR-NR chip demonstrates a $2.41\times$ enhancement in normalized energy efficiency, a $1.21\times$ improvement in normalized area efficiency, a $1.20\times$ increase in normalized throughput, and a 53.42% reduction in on-chip SRAM consumption compared with state-of-the-art accelerators.
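The $Z$-order ray clustering mentioned in the abstract above can be illustrated in software. The following is a minimal sketch, assuming rays are identified by integer pixel coordinates and that packets are simply fixed-size runs of the sorted list; the function names `part1by1`, `morton2d`, and `cluster_rays` are hypothetical and do not come from the paper, whose scheduler is a hardware design.

```python
def part1by1(v: int) -> int:
    """Spread the low 16 bits of v so that a zero bit separates each original bit."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton2d(x: int, y: int) -> int:
    """Interleave the bits of x and y into a Z-order (Morton) index."""
    return part1by1(x) | (part1by1(y) << 1)


def cluster_rays(pixels, packet_size):
    """Sort pixel coordinates by Z-order and cut them into fixed-size packets.

    Assumption: a 'ray packet' is modelled as a run of consecutive rays in
    Z-order, so neighbouring pixels tend to land in the same packet.
    """
    ordered = sorted(pixels, key=lambda p: morton2d(p[0], p[1]))
    return [ordered[i:i + packet_size] for i in range(0, len(ordered), packet_size)]


if __name__ == "__main__":
    grid = [(x, y) for y in range(4) for x in range(4)]
    for packet in cluster_rays(grid, 4):
        print(packet)
```

On a 4 x 4 pixel grid with a packet size of 4, each packet covers one 2 x 2 tile, which is the kind of spatial locality a Z-order sort is meant to expose.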
{"title":"An Energy-Efficient Edge Coprocessor for Neural Rendering With Explicit Data Reuse Strategies","authors":"Binzhe Yuan;Xiangyu Zhang;Zeyu Zheng;Yuefeng Zhang;Haochuan Wan;Zhechen Yuan;Junsheng Chen;Yunxiang He;Junran Ding;Xiaoming Zhang;Chaolin Rao;Wenyan Su;Pingqiang Zhou;Jingyi Yu;Xin Lou","doi":"10.1109/TVLSI.2025.3641653","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3641653","url":null,"abstract":"Neural radiance fields (NeRFs) have transformed 3-D reconstruction and rendering, facilitating photorealistic image synthesis from sparse viewpoints. This work introduces an explicit data reuse neural rendering (EDR-NR) architecture, which reduces frequent external memory accesses (EMAs) and cache misses by exploiting the spatial locality from three phases, including rays, ray packets (RPs), and samples. The EDR-NR architecture features a four-stage scheduler that clusters rays on the basis of <inline-formula> <tex-math>$Z$ </tex-math></inline-formula>-order, prioritize lagging rays when ray divergence happens, reorders RPs based on spatial proximity, and issues samples out-of-orderly (OoO) according to the availability of on-chip feature data. In addition, a four-tier hierarchical RP marching (HRM) technique is integrated with an axis-aligned bounding box (AABB) to facilitate spatial skipping (SS), reducing redundant computations and improving throughput. Moreover, a balanced allocation strategy for feature storage is proposed to mitigate SRAM bank conflicts. 
Fabricated using a 40-nm process with a die area of 10.5 mm<sup>2</sup>, the EDR-NR chip demonstrates a <inline-formula> <tex-math>$2.41times $ </tex-math></inline-formula> enhancement in normalized energy efficiency, a <inline-formula> <tex-math>$1.21times $ </tex-math></inline-formula> improvement in normalized area efficiency, a <inline-formula> <tex-math>$1.20times $ </tex-math></inline-formula> increase in normalized throughput, and a 53.42% reduction in on-chip SRAM consumption compared with state-of-the-art accelerators.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"620-633"},"PeriodicalIF":3.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this brief, a high-speed multiway parallel (MWP) hardware-implemented deflate data compressor (DDC) is proposed for reducing the storage requirements of solid-state drives (SSDs). To minimize the area of the DDC, registers instead of static random access memories (SRAMs) are used to build the hash tables, because multiway data within the DDC can access a register-based hash table simultaneously. To further reduce the area of the DDC, output data of indefinite length are concatenated with a tree-type hardware architecture that lowers the overall concatenation complexity. Moreover, a solid mathematical foundation is established for optimizing the latency values of the Lempel–Ziv (LZ)77 circuit, the Huffman encoding circuit, and the output data concatenation circuit within the MWP DDC. The results show that the proposed MWP DDC achieves a 12.1-Gb/s throughput and a compression ratio (CR) of 1.76 with a 1.17-mm$^2$ area and 0.103-$\mu$s latency when synthesized with the SMIC 55-nm process design kit (PDK). Hence, the proposed DDC satisfies the SSD compression requirement for a universal serial bus (USB) 3.2 connector.
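For readers who want to relate the compression ratio (CR) and throughput figures above to something runnable, the sketch below measures CR with Python's software `zlib` module, which also emits deflate (LZ77 plus Huffman) streams. This is an illustration only: it is not the proposed hardware, the sample data is made up, and the helper names `compression_ratio` and `transfer_time_s` are hypothetical.

```python
import zlib


def compression_ratio(raw: bytes, level: int = 6) -> float:
    """Return uncompressed size divided by compressed size for a deflate stream."""
    compressed = zlib.compress(raw, level)
    # Round-trip check: the compressed stream must decode back to the input.
    if zlib.decompress(compressed) != raw:
        raise ValueError("round trip failed")
    return len(raw) / len(compressed)


def transfer_time_s(num_bytes: int, throughput_gbps: float) -> float:
    """Time to stream num_bytes through a compressor running at throughput_gbps."""
    return num_bytes * 8 / (throughput_gbps * 1e9)


if __name__ == "__main__":
    sample = b"solid-state drive " * 1000  # highly repetitive, illustrative data
    print(f"CR on sample data: {compression_ratio(sample):.2f}")
    print(f"1 MB at 12.1 Gb/s: {transfer_time_s(1_000_000, 12.1) * 1e6:.1f} us")
```

The CR obtained depends heavily on the input data, so a software figure on synthetic input should not be compared directly with the 1.76 reported for the hardware design.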
{"title":"An Area-Efficient and Low-Latency ASIC Design of Deflate Data Compressor for SSD Applications","authors":"Nengyuan Sun;Ming Jin;Jianghong Li;Zhaoyi Niu;Jinghe Wang;Zhiyuan Pan;Jiafeng Cheng;Wenrui Liu;Kai Shi;Jiaqi Wang;Jiawei Zhang;Linhan Wang;Kangning Song;Xinyu Chen;Haoxiang Yu;Weize Yu","doi":"10.1109/TVLSI.2025.3642311","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3642311","url":null,"abstract":"In this brief, a high-speed [multiway parallel (MWP)] hardware-implemented deflate data compressor (DDC) is proposed for reducing the storage of solid-state drives (SSDs). To minimize the area of the DDC, registers instead of static random access memories (SRAMs) are utilized for building hash tables because multiway data within the DDC are able to access a register-based hash table simultaneously. To further reduce the area of the DDC, the output data of indefinite length are concatenated with a tree-type hardware architecture for reducing the overall concatenation complexity. Moreover, a solid mathematical foundation is established for optimizing the latency values of Lempel–Ziv (LZ)77 circuit, the Huffman encoding circuit, and the output data concatenation circuit within the MWP DDC. The results show that the proposed MWP DDC is capable of achieving a 12.1-Gb/s throughput and a 1.76 compression ratio (CR) with a 1.17-mm<sup>2</sup> area and 0.103-<inline-formula> <tex-math>$mu $ </tex-math></inline-formula>s latency, under the synthesis of SMIC 55-nm process design kits (PDKs). 
Hence, the proposed DDC satisfies the SSD compression requirement for a universal serial bus (USB) 3.2 connector.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 3","pages":"1057-1061"},"PeriodicalIF":3.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147280527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-15 DOI: 10.1109/TVLSI.2025.3642291
Zhengtao Zhu;Wenjie Wang;Fan Luo;Longbin Zhu;Zhijun Zhou;Keping Wang
This brief presents an area-efficient noise-shaping (NS) successive approximation register (SAR) analog-to-digital converter (ADC) employing a parallel-delayed sampling (PDS) technique. PDS samples the residual voltages from multiple ADC conversion cycles to increase the NS effect, without the large integration capacitors required by typical cascaded passive integrators. A preamplifier is placed between the sampling capacitors and the integrator to avoid signal attenuation, while further reducing the area of the integrator. PDS and the preamplifier introduce two left-half-plane poles into the noise transfer function (NTF) to boost the NS effect, while reducing the impact of parasitic capacitance, which enhances robustness. A prototype 9-bit NS-SAR ADC is designed in a 130-nm CMOS process. At an oversampling ratio (OSR) of 16, the proposed PDS NS-SAR ADC achieves an 80.93-dB peak signal-to-noise-and-distortion ratio (SNDR) and provides an NS/area efficiency factor of 4.2. It consumes $23.46~\mu$W over a bandwidth of 19.53 kHz, achieving a Schreier figure of merit (FoM$_{\mathrm{S}}$) of 170.13 dB.
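The reported figure of merit can be cross-checked from the other numbers in the abstract. The sketch below assumes the common SNDR-based definition of the Schreier figure of merit, FoM$_{\mathrm{S}}$ = SNDR + 10 log10(BW / P), and the usual definition OSR = f_s / (2 BW); neither formula is spelled out in the abstract, and the function names are hypothetical.

```python
import math


def schreier_fom_db(sndr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """SNDR-based Schreier figure of merit in dB (assumed definition)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


def sampling_rate_hz(bandwidth_hz: float, osr: float) -> float:
    """Sampling rate implied by OSR = f_s / (2 * BW) (assumed definition)."""
    return 2.0 * osr * bandwidth_hz


if __name__ == "__main__":
    fom = schreier_fom_db(sndr_db=80.93, bandwidth_hz=19.53e3, power_w=23.46e-6)
    fs = sampling_rate_hz(bandwidth_hz=19.53e3, osr=16)
    print(f"FoM_S = {fom:.2f} dB")
    print(f"implied f_s = {fs / 1e3:.2f} kHz")
```

Under these assumptions the abstract's SNDR, bandwidth, and power reproduce the stated 170.13 dB, and the implied sampling rate is about 625 kHz.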
{"title":"An Area-Efficient Noise-Shaping SAR ADC With Parallel-Delayed Sampling","authors":"Zhengtao Zhu;Wenjie Wang;Fan Luo;Longbin Zhu;Zhijun Zhou;Keping Wang","doi":"10.1109/TVLSI.2025.3642291","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3642291","url":null,"abstract":"This brief presents an area-efficient noise-shaping (NS) successive approximation register (SAR) analog-to-digital converter (ADC) employing a parallel-delayed sampling (PDS) technique. PDS samples the residual voltages from multiple ADC conversion cycles to increase the NS effect, without the need for large integration capacitors of the typical cascaded passive integrators. A preamplifier is placed between the sampling capacitors and the integrator to avoid signal attenuation, while further reducing the area of the integrator. PDS and preamplifier introduce two left-half-plane poles to the noise transfer function (NTF) to boost the NS effect, while reducing the impact of the parasitic capacitance to essentially enhance the robustness. A prototype 9-bit NS-SAR ADC is designed in a 130-nm CMOS process. At an oversampling ratio (OSR) of 16, the proposed PDS NS-SAR ADC achieves 80.93-dB peak signal to noise and distortion ratio (SNDR) and provides 4.2 NS/area efficiency factor. 
It consumes a power of <inline-formula> <tex-math>$23.46~mu $ </tex-math></inline-formula>W over a bandwidth of 19.53 kHz, achieving a Schreier figure of merit (FoM<inline-formula> <tex-math>${}_{mathrm {S}}$ </tex-math></inline-formula>) of 170.13 dB.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 3","pages":"1053-1056"},"PeriodicalIF":3.1,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147280904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-10 DOI: 10.1109/TVLSI.2025.3640215
Youngki Moon;Juyong Lee;Nayeun Kim;Yeonho Choi;Byungsoo Kim;Sungho Kang
The increasing density of dynamic random access memory (DRAM) renders permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.
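The doubling of correctable faulty symbols when fault addresses are known in advance mirrors the classical errors-and-erasures bound for symbol-based codes such as Reed-Solomon: with r check symbols, e errors at unknown positions and f faults at known positions are jointly correctable when 2e + f <= r. The sketch below only evaluates that textbook bound. The abstract does not state FADE's code parameters, so treating its code as a code obeying this bound, and the choice r = 2, are assumptions for illustration, as is the function name.

```python
def max_unknown_errors(check_symbols: int, known_faults: int) -> int:
    """Largest number of unknown-position symbol errors still correctable.

    Uses the errors-and-erasures bound 2*e + f <= r, where r is the number of
    check symbols and f is the number of faults whose positions are known.
    """
    if check_symbols < 0 or known_faults < 0:
        raise ValueError("counts must be non-negative")
    if known_faults > check_symbols:
        raise ValueError("more known faults than the code can cover")
    return (check_symbols - known_faults) // 2


if __name__ == "__main__":
    r = 2  # hypothetical number of check symbols
    print("positions unknown:", max_unknown_errors(r, 0), "symbol")
    print("positions known  :", r, "faulty symbols, with",
          max_unknown_errors(r, r), "additional unknown errors")
```

With two check symbols, only one symbol can be corrected when its position is unknown, while two faulty symbols can be corrected when both positions are supplied, for example from a fault-address store.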
{"title":"FADE: Fault-Aware Adaptive On-Die ECC for Improving Robustness","authors":"Youngki Moon;Juyong Lee;Nayeun Kim;Yeonho Choi;Byungsoo Kim;Sungho Kang","doi":"10.1109/TVLSI.2025.3640215","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3640215","url":null,"abstract":"The increasing density of dynamic random access memory (DRAM) renders permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. 
Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"707-710"},"PeriodicalIF":3.1,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-09 DOI: 10.1109/TVLSI.2025.3637272
Shatadal Chatterjee;Jitumani Sarma
Flash analog-to-digital converters (ADCs), essential for high-speed embedded systems, face inherent linearity constraints due to device mismatch in the resistor ladder and comparator stages. While individual analytical models exist for these mismatch sources, designers rely on Monte Carlo simulations to evaluate the combined errors. This brief introduces a unified analytical framework with closed-form expressions that capture both mismatch sources, enabling efficient estimation of root mean square (rms) integral nonlinearity/differential nonlinearity (INL/DNL). Validated against circuit simulations, the model achieves a mean absolute error (MAE) of 2.71% ($\sigma_{\text{DNL}}$) and 2.51% ($\sigma_{\text{INL}}$), and the maximum absolute error (MaxE) remains within 5.44%. This predictive capability guides high-yield, precision, power, and area (PPA)-optimized system-on-chip (SoC) design, enabling over $3\times$ silicon area reduction through application-specific optimization.
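To make the contrast between Monte Carlo evaluation and a closed-form estimate concrete, the sketch below simulates the rms DNL of an idealized flash ADC and compares it with a simple first-order expression. It is not the paper's model: it assumes independent Gaussian resistor mismatch (relative sigma `sigma_r`) and independent Gaussian comparator offsets (sigma `sigma_os_lsb`, in LSB), ignores threshold crossings, and uses the approximation sigma_DNL^2 = sigma_r^2 (1 - 1/N) + 2 sigma_os^2 for N ladder resistors. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_dnl_rms(bits, sigma_r, sigma_os_lsb, trials, seed=0):
    """Monte Carlo estimate of rms DNL (in LSB) for an idealized flash ADC."""
    rng = random.Random(seed)
    n = 2 ** bits  # number of ladder resistors
    sq_sum = 0.0
    count = 0
    for _ in range(trials):
        resistors = [1.0 + rng.gauss(0.0, sigma_r) for _ in range(n)]
        total = sum(resistors)
        thresholds = []
        acc = 0.0
        for k in range(n - 1):  # n - 1 comparator thresholds
            acc += resistors[k]
            # Ladder tap voltage in LSB plus that comparator's input offset.
            thresholds.append(n * acc / total + rng.gauss(0.0, sigma_os_lsb))
        for k in range(n - 2):  # code widths between adjacent thresholds
            dnl = (thresholds[k + 1] - thresholds[k]) - 1.0
            sq_sum += dnl * dnl
            count += 1
    return math.sqrt(sq_sum / count)


def predicted_dnl_rms(bits, sigma_r, sigma_os_lsb):
    """First-order closed-form rms DNL under the same assumptions."""
    n = 2 ** bits
    return math.sqrt(sigma_r ** 2 * (1.0 - 1.0 / n) + 2.0 * sigma_os_lsb ** 2)


if __name__ == "__main__":
    sim = simulate_dnl_rms(bits=6, sigma_r=0.02, sigma_os_lsb=0.1, trials=500)
    pred = predicted_dnl_rms(bits=6, sigma_r=0.02, sigma_os_lsb=0.1)
    print(f"Monte Carlo rms DNL: {sim:.4f} LSB, closed form: {pred:.4f} LSB")
```

The closed form is evaluated in constant time, whereas the simulation cost grows with the trial count and the number of comparators, which is the practical motivation for analytical models of this kind.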
{"title":"An Analytical Model of Mismatch Dominance Crossover in High-Speed Flash ADC Cores","authors":"Shatadal Chatterjee;Jitumani Sarma","doi":"10.1109/TVLSI.2025.3637272","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3637272","url":null,"abstract":"The flash analog-to-digital converters (ADCs), essential for high-speed embedded systems, face inherent linearity constraints due to device mismatch in the resistor ladder and comparator stages. While individual analytical models exist for these mismatch sources, designers rely on Monte Carlo simulations to evaluate the combined errors. This brief introduces a unified analytical framework with closed-form expressions that capture both mismatch sources, enabling efficient estimation of root mean square (rms) integral nonlinearity/differential nonlinearity (INL/DNL). Validated against circuit simulations, the model achieves a mean absolute error (MAE) of 2.71% (<inline-formula> <tex-math>$boldsymbol {sigma _{textbf {DNL}}}$ </tex-math></inline-formula>) and 2.51% (<inline-formula> <tex-math>$sigma _{text {INL}}$ </tex-math></inline-formula>), and the maximum absolute error (MaxE) remains within 5.44%. 
This predictive capability guides high-yield, precision, power, and area (PPA)-optimized system-on-chip (SoC) design, enabling over <inline-formula> <tex-math>$3{times }$ </tex-math></inline-formula> silicon area reduction through application-specific optimization.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"702-706"},"PeriodicalIF":3.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-09 DOI: 10.1109/TVLSI.2025.3639588
Junhak Kim;Young-Wook Kim;Sinho Lee;Yoojin Jung;Min-Seong Choo;Kwanseo Park
This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) circuit with a pattern-dependent pulse filtering (PDPF) technique. A drawback of the conventional ILCDR is that data jitter is transferred to the recovered clock. To reduce this jitter, the PDPF technique filters out the injection pulses that occur in data patterns causing high data-dependent jitter (DDJ). By combining the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm$^2$ and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UI$_{\mathrm{pp}}$ at 35 MHz with a bit error rate (BER) of $10^{-12}$.
{"title":"A Pattern-Dependent Pulse Filtering Technique for Low-Jitter Injection-Locked CDR in 28-nm CMOS","authors":"Junhak Kim;Young-Wook Kim;Sinho Lee;Yoojin Jung;Min-Seong Choo;Kwanseo Park","doi":"10.1109/TVLSI.2025.3639588","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3639588","url":null,"abstract":"This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) with a pattern-dependent pulse filtering (PDPF) technique. The conventional ILCDR has a drawback that data jitter is transferred to the recovered clock. To reduce jitter, the PDPF technique is employed to filter out the injection pulses occurring in data patterns that cause high data-dependent jitter (DDJ). Adopting the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm<sup>2</sup> and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UI<sub>pp</sub> at 35 MHz with a bit error rate (BER) of <inline-formula> <tex-math>$10^{-12}$ </tex-math></inline-formula>.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"711-715"},"PeriodicalIF":3.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-02 DOI: 10.1109/TVLSI.2025.3634734
Xiao Wang;Xin Sun;Yibin Zheng;Runkun Li;Kong-Pang Pun
This brief presents an amplifierless current-to-digital converter (CDC) that uniquely integrates an open-loop pseudo-differential current mirror with a current-integration successive-approximation-register analog-to-digital converter (ADC). The proposed architecture enables the CDC to achieve high-speed operation at low power consumption, which is critical for the intended applications in dynamic optical coherence tomography (OCT) systems. Fabricated in 65-nm CMOS, the prototype occupies 0.019 mm$^2$, consumes $380~\mu$W from a 1-V supply, and achieves a 47-dB dynamic range (DR) with a 50-MS/s sample rate. It achieves Walden’s and Schreier’s figures of merit of 92 fJ/step and 148 dB, respectively, both being the best among reported CDCs.
{"title":"A 0.38-mW, 50-MS/s, 2.3-μApp Current-Integration SAR-Based Current-to-Digital Converter for Real-Time OCT Imaging","authors":"Xiao Wang;Xin Sun;Yibin Zheng;Runkun Li;Kong-Pang Pun","doi":"10.1109/TVLSI.2025.3634734","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3634734","url":null,"abstract":"This brief presents an amplifierless current-to-digital converter (CDC) that uniquely integrates an open-loop pseudo-differential current mirror with a current-integration successive-approximation-register analog-to-digital converter (ADC). The proposed architecture enables the CDC to achieve high-speed operation at low power consumption, which is critical for the intended applications in dynamic optical coherence tomography (OCT) systems. Fabricated in 65-nm CMOS, the prototype occupies 0.019 mm<sup>2</sup>, consumes <inline-formula> <tex-math>$380~mu $ </tex-math></inline-formula>W from a 1-V supply, and achieves a 47-dB dynamic range (DR) with a 50-MS/s sample rate. It achieves Walden’s and Schreier’s figures of merit of 92 fJ/step and 148 dB, respectively, both being the best among reported CDCs.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"697-701"},"PeriodicalIF":3.1,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-25 DOI: 10.1109/TVLSI.2025.3630312
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3630312","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3630312","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 12","pages":"C3-C3"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268918","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-25 DOI: 10.1109/TVLSI.2025.3621790
Saeed Aghapour;Kiarash Sedghighadikolaei;Attila A. Yavuz;Bechir Hamdaoui;Mehran Mozaffari-Kermani
After the acceptance of [1], an error was introduced, which we aim to resolve here. The abbreviation ML stands for module lattice-based, not “machine learning.” The first sentence of the first paragraph is corrected from the version that was published in Early Access. It should have read, “Barrett modular reduction and multiplication are essential primitives for efficient modular computation in cryptographic schemes, including postquantum standards such as module lattice-based (ML) key encapsulation mechanism (KEM) and ML-digital signature algorithm (DSA).” In the Introduction, the same correction has been made for the abbreviation ML.
{"title":"Corrections to “Efficient Fault-Detection Architectures for Barrett Reduction and Multiplication in Classical and Post-Quantum Cryptographic Systems”","authors":"Saeed Aghapour;Kiarash Sedghighadikolaei;Attila A. Yavuz;Bechir Hamdaoui;Mehran Mozaffari-Kermani","doi":"10.1109/TVLSI.2025.3621790","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3621790","url":null,"abstract":"After the acceptance of [1], an error was introduced, which we aim to resolve here. The abbreviation ML stands for module lattice-based, not “machine learning.” The first sentence of the first paragraph is corrected from the version that was published in Early Access. It should have read, “Barrett modular reduction and multiplication are essential primitives for efficient modular computation in cryptographic schemes, including postquantum standards such as module lattice-based (ML) key encapsulation mechanism (KEM) and ML-digital signature algorithm (DSA).” In the Introduction, the same correction has been made for the abbreviation ML.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 12","pages":"3545-3545"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268919","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}