
IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

Analysis of a Delay-Element-Based Technique for Enhancing Soft Error Tolerance at Input Nodes Around Clock Edges
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-22 DOI: 10.1109/TVLSI.2025.3643939
Song Wang; Kazuteru Namba
Technology scaling and supply voltage reduction make sequential circuits increasingly vulnerable to single-event transients (SETs) around clock edges. This work analyzes the sensitive regions of a dual interlocked storage cell (DICE)-based flip-flop (DICEFF) in a 15-nm FinFET process and reveals the correlation between critical charge distribution and SET pulse characteristics. A lightweight fault-tolerant scheme is proposed that integrates delay elements (DEs) with the self-recovery capability of DICE to temporally desynchronize SET pulse arrivals and facilitate self-correction through temporal misalignment. Furthermore, a visualization method based on critical charge distribution is presented to delineate SET tolerance boundaries. HSPICE simulations demonstrate that the proposed method is robust against PVT variations, improving the average critical charge by up to $1.7\times$ over the baseline and reducing the risk window by 47%, while maintaining comparable delay and power efficiency.
Vol. 34, No. 3, pp. 1017-1028
Citations: 0
An Energy-Efficient Edge Coprocessor for Neural Rendering With Explicit Data Reuse Strategies
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-17 DOI: 10.1109/TVLSI.2025.3641653
Binzhe Yuan; Xiangyu Zhang; Zeyu Zheng; Yuefeng Zhang; Haochuan Wan; Zhechen Yuan; Junsheng Chen; Yunxiang He; Junran Ding; Xiaoming Zhang; Chaolin Rao; Wenyan Su; Pingqiang Zhou; Jingyi Yu; Xin Lou
Neural radiance fields (NeRFs) have transformed 3-D reconstruction and rendering, facilitating photorealistic image synthesis from sparse viewpoints. This work introduces an explicit data reuse neural rendering (EDR-NR) architecture, which reduces frequent external memory accesses (EMAs) and cache misses by exploiting spatial locality at three levels: rays, ray packets (RPs), and samples. The EDR-NR architecture features a four-stage scheduler that clusters rays on the basis of $Z$-order, prioritizes lagging rays when ray divergence happens, reorders RPs based on spatial proximity, and issues samples out of order (OoO) according to the availability of on-chip feature data. In addition, a four-tier hierarchical RP marching (HRM) technique is integrated with an axis-aligned bounding box (AABB) to facilitate spatial skipping (SS), reducing redundant computations and improving throughput. Moreover, a balanced allocation strategy for feature storage is proposed to mitigate SRAM bank conflicts. Fabricated using a 40-nm process with a die area of 10.5 mm², the EDR-NR chip demonstrates a $2.41\times$ enhancement in normalized energy efficiency, a $1.21\times$ improvement in normalized area efficiency, a $1.20\times$ increase in normalized throughput, and a 53.42% reduction in on-chip SRAM consumption compared with state-of-the-art accelerators.
Vol. 34, No. 2, pp. 620-633
Citations: 0
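The abstract above says the scheduler first clusters rays on the basis of Z-order. A minimal Python sketch of that general idea follows: it sorts rays by the Morton (Z-order) code of a 2-D coordinate and cuts the sorted list into packets. Keying rays by pixel coordinates, the 16-bit coordinate width, and the names `part1by1`, `morton2d`, and `cluster_rays` are assumptions made for illustration; the paper's actual key, packet size, and hardware implementation are not given in the abstract.

```python
def part1by1(v: int) -> int:
    """Spread the low 16 bits of v so that bit i moves to bit 2*i."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton2d(x: int, y: int) -> int:
    """Interleave x (even bits) and y (odd bits) into one Z-order key."""
    return part1by1(x) | (part1by1(y) << 1)


def cluster_rays(pixels, packet_size):
    """Sort (x, y) ray origins by Z-order and split them into packets.

    Neighbouring keys tend to be spatially close, so rays in one packet
    are likely to touch nearby feature data (hypothetical ray key: pixel).
    """
    ordered = sorted(pixels, key=lambda p: morton2d(p[0], p[1]))
    return [ordered[i:i + packet_size]
            for i in range(0, len(ordered), packet_size)]


if __name__ == "__main__":
    grid = [(x, y) for y in range(4) for x in range(4)]
    for packet in cluster_rays(grid, 4):
        print(packet)
```

With a 4x4 grid and packets of four, each packet covers one 2x2 tile, which is the locality property a Z-order clustering stage relies on.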
An Area-Efficient and Low-Latency ASIC Design of Deflate Data Compressor for SSD Applications
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-17 DOI: 10.1109/TVLSI.2025.3642311
Nengyuan Sun; Ming Jin; Jianghong Li; Zhaoyi Niu; Jinghe Wang; Zhiyuan Pan; Jiafeng Cheng; Wenrui Liu; Kai Shi; Jiaqi Wang; Jiawei Zhang; Linhan Wang; Kangning Song; Xinyu Chen; Haoxiang Yu; Weize Yu
In this brief, a high-speed multiway parallel (MWP) hardware-implemented deflate data compressor (DDC) is proposed for reducing the storage of solid-state drives (SSDs). To minimize the area of the DDC, registers instead of static random access memories (SRAMs) are used to build the hash tables, because multiway data within the DDC can then access a register-based hash table simultaneously. To further reduce the area of the DDC, the variable-length output data are concatenated with a tree-type hardware architecture that lowers the overall concatenation complexity. Moreover, a solid mathematical foundation is established for optimizing the latency of the Lempel–Ziv 77 (LZ77) circuit, the Huffman encoding circuit, and the output data concatenation circuit within the MWP DDC. The results show that the proposed MWP DDC achieves a 12.1-Gb/s throughput and a compression ratio (CR) of 1.76 with a 1.17-mm² area and 0.103-$\mu$s latency when synthesized with the SMIC 55-nm process design kit (PDK). Hence, the proposed DDC satisfies the SSD compression requirement for a universal serial bus (USB) 3.2 connector.
Vol. 34, No. 3, pp. 1057-1061
Citations: 0
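The abstract above relies on hash tables in the LZ77 stage of Deflate. As a rough software illustration of why a hash table is needed there, the sketch below implements a greedy, single-candidate LZ77 tokenizer and its decoder in Python. It is not the paper's multiway parallel design: the Python dict merely stands in for a hardware hash table, only one candidate position is kept per key, and the function names are hypothetical. The constants 3, 258, and 32768 are the standard Deflate match-length and window limits.

```python
MIN_MATCH = 3      # Deflate's minimum match length
MAX_MATCH = 258    # Deflate's maximum match length
WINDOW = 32768     # Deflate's maximum back-reference distance


def lz77_tokens(data: bytes):
    """Greedy LZ77: emit literal byte values or (distance, length) tuples."""
    table = {}     # 3-byte key -> most recent position (stand-in hash table)
    tokens = []
    i, n = 0, len(data)
    while i < n:
        key = data[i:i + MIN_MATCH]
        cand = table.get(key) if len(key) == MIN_MATCH else None
        length = 0
        if cand is not None and i - cand <= WINDOW:
            # Extend the match byte by byte; overlap with position i is allowed.
            while (length < MAX_MATCH and i + length < n
                   and data[cand + length] == data[i + length]):
                length += 1
        if length >= MIN_MATCH:
            tokens.append((i - cand, length))
            step = length
        else:
            tokens.append(data[i])
            step = 1
        # Record every position we advance over so later input can match it.
        for j in range(i, i + step):
            k = data[j:j + MIN_MATCH]
            if len(k) == MIN_MATCH:
                table[k] = j
        i += step
    return tokens


def lz77_decode(tokens) -> bytes:
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for t in tokens:
        if isinstance(t, tuple):
            dist, length = t
            for _ in range(length):
                out.append(out[-dist])   # byte-wise copy handles overlap
        else:
            out.append(t)
    return bytes(out)


if __name__ == "__main__":
    sample = b"abcabcabcabc"
    toks = lz77_tokens(sample)
    print(toks)
    print(lz77_decode(toks) == sample)
```

In a full Deflate encoder these tokens would then be Huffman coded; that stage is omitted here.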
An Area-Efficient Noise-Shaping SAR ADC With Parallel-Delayed Sampling
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-15 DOI: 10.1109/TVLSI.2025.3642291
Zhengtao Zhu; Wenjie Wang; Fan Luo; Longbin Zhu; Zhijun Zhou; Keping Wang
This brief presents an area-efficient noise-shaping (NS) successive approximation register (SAR) analog-to-digital converter (ADC) employing a parallel-delayed sampling (PDS) technique. PDS samples the residual voltages from multiple ADC conversion cycles to increase the NS effect, without the large integration capacitors required by typical cascaded passive integrators. A preamplifier is placed between the sampling capacitors and the integrator to avoid signal attenuation, while further reducing the area of the integrator. PDS and the preamplifier introduce two left-half-plane poles to the noise transfer function (NTF) to boost the NS effect, while reducing the impact of parasitic capacitance and thereby enhancing robustness. A prototype 9-bit NS-SAR ADC is designed in a 130-nm CMOS process. At an oversampling ratio (OSR) of 16, the proposed PDS NS-SAR ADC achieves an 80.93-dB peak signal-to-noise-and-distortion ratio (SNDR) and an NS/area efficiency factor of 4.2. It consumes $23.46~\mu$W over a bandwidth of 19.53 kHz, achieving a Schreier figure of merit (FoM$_{\mathrm{S}}$) of 170.13 dB.
Vol. 34, No. 3, pp. 1053-1056
Citations: 0
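The Schreier figure of merit quoted in the abstract above can be checked from the other reported numbers. The sketch below uses the commonly cited SNDR-based definition, FoM_S = SNDR + 10·log10(BW / P); that the authors used exactly this form is an assumption, though the result agrees with the reported 170.13 dB. The function name is illustrative.

```python
import math


def schreier_fom_db(sndr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier FoM (SNDR-based): SNDR + 10*log10(bandwidth / power)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


if __name__ == "__main__":
    # Values taken from the abstract: 80.93 dB SNDR, 19.53 kHz, 23.46 uW.
    fom = schreier_fom_db(80.93, 19.53e3, 23.46e-6)
    print(f"FoM_S = {fom:.2f} dB")
```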
FADE: Fault-Aware Adaptive On-Die ECC for Improving Robustness
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-10 DOI: 10.1109/TVLSI.2025.3640215
Youngki Moon; Juyong Lee; Nayeun Kim; Yeonho Choi; Byungsoo Kim; Sungho Kang
The increasing density of dynamic random access memory (DRAM) makes permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.
Vol. 34, No. 2, pp. 707-710
Citations: 0
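The doubling of correctable faulty symbols reported in the abstract above is consistent with the classical errors-and-erasures bound for symbol codes: when fault locations are already known, each fault costs one parity symbol instead of two. The sketch below states that bound for a maximum-distance-separable code such as Reed-Solomon. The abstract does not name the code family, so the Reed-Solomon assumption, the parity counts, and the function name are all illustrative.

```python
def is_correctable(parity_symbols: int, unknown_errors: int, known_faults: int) -> bool:
    """Errors-and-erasures bound for an MDS symbol code (e.g. Reed-Solomon).

    A code with r parity symbols has minimum distance r + 1, so it can
    correct e errors at unknown positions plus f faults at known
    positions whenever 2*e + f <= r.
    """
    return 2 * unknown_errors + known_faults <= parity_symbols


if __name__ == "__main__":
    r = 2  # hypothetical number of parity symbols per codeword
    # Locations unknown: only one bad symbol is correctable.
    print(is_correctable(r, unknown_errors=1, known_faults=0))  # True
    print(is_correctable(r, unknown_errors=2, known_faults=0))  # False
    # Locations known in advance (e.g. from stored fault addresses): two are.
    print(is_correctable(r, unknown_errors=0, known_faults=2))  # True
```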
An Analytical Model of Mismatch Dominance Crossover in High-Speed Flash ADC Cores
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-09 DOI: 10.1109/TVLSI.2025.3637272
Shatadal Chatterjee; Jitumani Sarma
Flash analog-to-digital converters (ADCs), essential for high-speed embedded systems, face inherent linearity constraints due to device mismatch in the resistor ladder and comparator stages. While individual analytical models exist for these mismatch sources, designers rely on Monte Carlo simulations to evaluate the combined errors. This brief introduces a unified analytical framework with closed-form expressions that capture both mismatch sources, enabling efficient estimation of root mean square (rms) integral nonlinearity/differential nonlinearity (INL/DNL). Validated against circuit simulations, the model achieves a mean absolute error (MAE) of 2.71% ($\sigma_{\text{DNL}}$) and 2.51% ($\sigma_{\text{INL}}$), and the maximum absolute error (MaxE) remains within 5.44%. This predictive capability guides high-yield, precision, power, and area (PPA)-optimized system-on-chip (SoC) design, enabling over $3\times$ silicon area reduction through application-specific optimization.
Vol. 34, No. 2, pp. 702-706
Citations: 0
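For context on the Monte Carlo baseline that the abstract above contrasts with its closed-form model, here is a minimal behavioral sketch of one Monte Carlo trial for a flash ADC. It draws Gaussian resistor mismatch for the reference ladder and a Gaussian input-referred offset for each comparator, then reports INL and DNL in LSBs. The Gaussian mismatch model, the parameter names, and the example sigmas are assumptions for illustration; this is not the paper's analytical framework, and it ignores circuit-level effects.

```python
import random


def ladder_inl_dnl(n_bits, sigma_r_rel, sigma_cmp_lsb, rng):
    """One Monte Carlo trial of flash-ADC thresholds; returns (inl, dnl) in LSB.

    sigma_r_rel   : relative sigma of each unit resistor (hypothetical value)
    sigma_cmp_lsb : comparator offset sigma, in LSB (hypothetical value)
    """
    n = 2 ** n_bits
    # Each unit resistor gets an independent Gaussian relative error.
    r = [1.0 + rng.gauss(0.0, sigma_r_rel) for _ in range(n)]
    total = sum(r)
    thresholds = []
    acc = 0.0
    for k in range(1, n):
        acc += r[k - 1]
        # Tap k in LSB units, shifted by that comparator's offset.
        thresholds.append(acc / total * n + rng.gauss(0.0, sigma_cmp_lsb))
    inl = [thresholds[k - 1] - k for k in range(1, n)]
    dnl = [thresholds[j] - thresholds[j - 1] - 1.0 for j in range(1, n - 1)]
    return inl, dnl


def rms(values):
    """Root mean square of a non-empty sequence."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(1)
    trials = [ladder_inl_dnl(6, 0.02, 0.1, rng) for _ in range(200)]
    print("mean rms INL [LSB]:", sum(rms(i) for i, _ in trials) / len(trials))
    print("mean rms DNL [LSB]:", sum(rms(d) for _, d in trials) / len(trials))
```

Sweeping the two sigmas in such a loop is one way to see which mismatch source dominates, at the cost of many trials per design point.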
A Pattern-Dependent Pulse Filtering Technique for Low-Jitter Injection-Locked CDR in 28-nm CMOS
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-09 DOI: 10.1109/TVLSI.2025.3639588
Junhak Kim; Young-Wook Kim; Sinho Lee; Yoojin Jung; Min-Seong Choo; Kwanseo Park
This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) circuit with a pattern-dependent pulse filtering (PDPF) technique. The conventional ILCDR has the drawback that data jitter is transferred to the recovered clock. To reduce jitter, the PDPF technique filters out the injection pulses occurring in data patterns that cause high data-dependent jitter (DDJ). By combining the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm² and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UI$_{\mathrm{pp}}$ at 35 MHz with a bit error rate (BER) of $10^{-12}$.
Vol. 34, No. 2, pp. 711-715
Citations: 0
A 0.38-mW, 50-MS/s, 2.3-μApp Current-Integration SAR-Based Current-to-Digital Converter for Real-Time OCT Imaging
IF 3.1 Zone 2 (Engineering Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-12-02 DOI: 10.1109/TVLSI.2025.3634734
Xiao Wang; Xin Sun; Yibin Zheng; Runkun Li; Kong-Pang Pun
This brief presents an amplifierless current-to-digital converter (CDC) that uniquely integrates an open-loop pseudo-differential current mirror with a current-integration successive-approximation-register analog-to-digital converter (ADC). The proposed architecture enables the CDC to achieve high-speed operation at low power consumption, which is critical for the intended applications in dynamic optical coherence tomography (OCT) systems. Fabricated in 65-nm CMOS, the prototype occupies 0.019 mm², consumes $380~\mu$W from a 1-V supply, and achieves a 47-dB dynamic range (DR) at a 50-MS/s sample rate. It achieves Walden and Schreier figures of merit of 92 fJ/step and 148 dB, respectively, both the best among reported CDCs.
Vol. 34, No. 2, pp. 697-701
Citations: 0
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information 超大规模集成电路(VLSI)系统学报
IF 3.1 2区 工程技术 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2025-11-25 DOI: 10.1109/TVLSI.2025.3630312
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3630312","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3630312","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 12","pages":"C3-C3"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268918","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Corrections to “Efficient Fault-Detection Architectures for Barrett Reduction and Multiplication in Classical and Post-Quantum Cryptographic Systems”
IF 3.1 Zone 2, Engineering & Technology Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-11-25 DOI: 10.1109/TVLSI.2025.3621790
Saeed Aghapour;Kiarash Sedghighadikolaei;Attila A. Yavuz;Bechir Hamdaoui;Mehran Mozaffari-Kermani
After the acceptance of [1], an error was introduced, which we aim to resolve here. The abbreviation ML stands for module lattice-based, not “machine learning.” The first sentence of the first paragraph is corrected from the version that was published in Early Access. It should have read, “Barrett modular reduction and multiplication are essential primitives for efficient modular computation in cryptographic schemes, including postquantum standards such as module lattice-based (ML) key encapsulation mechanism (KEM) and ML-digital signature algorithm (DSA).” In the Introduction, the same correction has been made for the abbreviation ML.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 12, p. 3545. Publication date: 2025-11-25. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268919
Citations: 0