
IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

A Highly Reliable RRAM-Based 12T2R NVSRAM Architecture With Dual-Layer ECC
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-28 DOI: 10.1109/TVLSI.2025.3619392
Huimeng Guo;Yujia Li;Yiqing Li;Tingrui Ren;Liang Wang;Yuanfu Zhao
Static random access memory (SRAM) plays a critical role in chips due to its high-speed access capabilities, but it suffers from data loss upon power-down and is susceptible to radiation-induced faults. Nonvolatile SRAM (NVSRAM) has attracted substantial research attention for combining the high-speed operation of SRAM with the nonvolatile storage capabilities of emerging memory technologies. This article proposes a 12T2R NVSRAM cell based on resistive random access memory (RRAM), achieving nanosecond-scale data backup and recovery. The novel design integrates an independent RRAM operation path and an SRAM power-gating switch, ensuring reliable backup and a low-power sleep mode. Building on the memory array, the system further integrates a power management module, control and driver circuitry, and a dual-layer error correction code (ECC) strategy. This holistic co-design across device, circuit, and architecture levels delivers enhanced reliability, energy efficiency, and fault tolerance. Simulation results in a 65-nm CMOS process demonstrate significant improvements in key performance metrics, including speed, power consumption, noise margin, store/restore yield, and bit error rate (BER). All functional modules meet the design specifications, with markedly improved data backup and restoration success rates, providing a promising solution for next-generation high-performance nonvolatile memory (NVM) systems.
Volume 34, Issue 1, Pages 294-306
Citations: 0
An Efficient VLSI Architecture for Hammerstein-Type Spline Adaptive Filters
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-27 DOI: 10.1109/TVLSI.2025.3621655
Pavankumar Ganjimala;Subrahmanyam Mula
The Hammerstein spline adaptive filter (HSAF) is a class of nonlinear adaptive filters (NAFs), known for its flexible nonlinear modeling and low complexity in applications, such as self-interference cancellation in wireless communications. This brief proposes a delayed dual-weight update reformulation for the HSAF and its efficient high-throughput and low-power architecture. We also propose hardware-efficient techniques for mapping spline interpolation and updating the spline control points in HSAF. The proposed delayed HSAF (DHSAF) architecture is synthesized using Cadence Genus in 45-nm CMOS technology. Synthesis results show that the proposed DHSAF achieves significantly higher throughput compared to the basic HSAF, with only minimal area and power overhead. Furthermore, the proposed DHSAF outperforms the state-of-the-art RFF-KLMS architecture in terms of both area and power efficiency.
Volume 34, Issue 2, Pages 662-666
Citations: 0
A Reconfigurable Built-In Self-Test Scheme for the Evaluation Circuits of Digital SRAM-IMC Architectures
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-07 DOI: 10.1109/TVLSI.2025.3615468
Sunrui Zhang;Xiaole Cui;Feng Wei;Xing Zhang
Digital static random access memory-based in-memory computing (SRAM-IMC) is a promising computation paradigm to break the von Neumann bottleneck. However, IMC architectures also bring a series of challenges for testing, because of circuit structures and operations that do not exist in conventional memories. One of the challenges is the testing of evaluation circuits in digital SRAM-IMC architectures, because the primary inputs (PIs) of the evaluation circuits cannot be directly accessed by the testers. Several test approaches, such as conventional logic built-in self-test (LBIST) modules and the indirect and scan-chain-based test methods, have been proposed to address this issue. Nevertheless, these solutions suffer from low test performance or high area consumption. This work proposes a reconfigurable built-in self-test (BIST) scheme for the evaluation circuits. By reusing the IMC bitcells and operations, the proposed BIST scheme implements separate pattern generation (PG) and response analysis (RA) processes. Furthermore, diverse pattern generators, including the Fibonacci linear feedback shift register (LFSR) and weighted LFSR (WLFSR) with adjustable feedback polynomials and the cellular automata (CA), are realized to improve the test efficiency and fault coverage. The evaluation results show that the proposed BIST scheme has better test performance compared with the indirect and scan-chain-based test approaches. The proposed BIST scheme has comparable test performance to the conventional LBIST schemes, with much less area overhead. Additionally, the proposed BIST scheme is testable and repairable.
Volume 34, Issue 1, Pages 280-293
Citations: 0
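The abstract above names a Fibonacci LFSR with adjustable feedback polynomials as one of the pattern generators. As a minimal software sketch of that general technique only (not the paper's in-memory implementation), the following generator steps a Fibonacci LFSR. The function name, the 4-bit width, and the tap choice `(4, 3)` are illustrative assumptions, not taken from the source.

```python
from itertools import islice


def fibonacci_lfsr(seed: int, taps: tuple, width: int):
    """Yield successive states of a Fibonacci LFSR (hypothetical helper).

    taps are 1-indexed bit positions (1 = least significant bit); changing
    them changes the feedback polynomial, i.e. makes it "adjustable".
    """
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("an all-zero seed locks the LFSR in the zero state")
    while True:
        yield state
        feedback = 0
        for tap in taps:
            # XOR together the tapped bits to form the new input bit.
            feedback ^= (state >> (tap - 1)) & 1
        # Shift left by one and insert the feedback bit at the LSB.
        state = ((state << 1) | feedback) & mask


def first_patterns(count: int, seed: int = 0b0001, taps: tuple = (4, 3), width: int = 4):
    """Collect the first `count` test patterns as integers."""
    return list(islice(fibonacci_lfsr(seed, taps, width), count))


if __name__ == "__main__":
    for pattern in first_patterns(15):
        print(format(pattern, "04b"))
```

With the assumed taps, the 4-bit register cycles through all 15 nonzero states before repeating, which is the property that makes a maximal-length LFSR useful as a pseudo-random pattern source.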
NVLIM: MTJ and CMOS-Based Nonvolatile Latch Design With Protection Against Triple-Node-Upsets for Robust Computing
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-07 DOI: 10.1109/TVLSI.2025.3614568
Aibin Yan;Yue Zhang;Litao Wang;Zhengfeng Huang;Qingyang Zhang;Tianming Ni;Patrick Girard;Xiaoqing Wen
Soft errors and power dissipation emerge as critical challenges in developing high-reliability and cost-sensitive embedded systems. To address these issues, the magnetic tunnel junction (MTJ) is considered a promising solution due to its nonvolatility and its compatibility with traditional CMOS manufacturing processes. In this work, we propose a novel nonvolatile (NV) latch consisting of inverters and MTJs, namely, NVLIM, which provides nonvolatility and robust partial tolerance against triple-node-upsets (TNUs) at low cost. NVLIM integrates a TNU-tolerant block based on CMOS with a backup-restore block using MTJs. Simulation results incorporating process, voltage, and temperature (PVT) variations, bias temperature instability (BTI) impact, and Monte Carlo simulations demonstrate the balanced performance in terms of nonvolatility, robust partial TNU tolerance, and comprehensive overhead of the proposed latch.
Volume 34, Issue 1, Pages 268-279
Citations: 0
FPGA Implementation of PoolFormer Network Using Python-Driven High-Level Synthesis Framework for Edge-AIoT Speech Recognition
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-06 DOI: 10.1109/TVLSI.2025.3614721
Tiancheng Cao;Zhongyi Zhang;Wei Soon Ng;Wang Ling Goh;Yuan Gao
This brief presents an edge-AIoT speech recognition system, which is based on a new spiking feature extraction (SFE) method and a PoolFormer (PF) neural network optimized for implementation on field-programmable gate array (FPGA) hardware. A Python-driven high-level synthesis (HLS) flow is adopted to accelerate software-to-hardware conversion for fast validation, demonstrating the potential of FPGA-based solutions in edge applications. This work provides a holistic end-to-end solution for ultralow-power speech recognition, leveraging HLS to bridge the gap between software and hardware development. Implemented in a Xilinx PYNQ-Z2 FPGA board, this optimized PF model achieved a speech recognition accuracy rate of 95.41% on the 35-class Google Commands dataset with a parameter count of 39k.
Volume 34, Issue 1, Pages 317-321
Citations: 0
An Energy-Efficient Neuromorphic Self-Attention Core Exploiting Dual Sparsity in Neurons and Spikes
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-06 DOI: 10.1109/TVLSI.2025.3613675
P. J. Zhou;Y. C. Chen;R. C. Ma;G. C. Qiao;N. Ning;Q. Yu;S. G. Hu
Neuromorphic computing is distinguished by its strong hardware friendliness and high computational sparsity, making it well-suited for a wide range of edge applications. However, the highly anticipated neuromorphic transformer still faces significant challenges in energy efficiency due to its heavy computational workload (primarily stemming from the self-attention mechanism), making edge deployment difficult. To address this problem, this work proposes a neuron inhibition mechanism that identifies and bypasses the computation of inactive neurons (i.e., those that have extremely negative membrane potentials (MPs) and do not fire spikes). It introduces additional neuron sparsity into neuromorphic computing, significantly reducing the computational workload during inference. The mechanism is implemented using a configurable negative MP threshold. If a neuron’s MP falls below this threshold, a mask is generated to bypass the neuron in subsequent calculations. This technology facilitates the development of a sparse² synaptic processing unit (S²-SPE), which performs computations only on synapses associated with valid spike inputs in active (non-inhibited) neurons, thereby efficiently leveraging both neuron and spike sparsity. Eventually, a neuromorphic attention core is developed based on the S²-SPE, and its effectiveness is validated on various datasets. The experimental results demonstrate that the core can bypass over 98% of invalid calculations in the neuromorphic attention mechanism, achieving an outstanding energy efficiency of sub-0.06 pJ/SOP, which represents over 50% improvement compared to the baseline and outperforms related state-of-the-art (SOTA) works. This work is expected to advance neuromorphic hardware toward greater energy efficiency and facilitate its deployment in edge applications.
Volume 34, Issue 1, Pages 322-326
Citations: 0
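The dual-sparsity idea in the abstract above (skip inhibited neurons, then skip inputs without spikes) can be sketched in software as follows. This is only an illustrative model under stated assumptions: the function name, the dense `weights[i][j]` layout, and the simple accumulate-on-spike update are hypothetical and are not taken from the paper's hardware design.

```python
def sparse_synaptic_update(weights, spikes, potentials, inhibit_threshold):
    """Accumulate synaptic weights only where both sparsity conditions allow.

    weights[i][j] is the (assumed) weight from input j to neuron i.
    spikes[j] is 1 if input j fired, else 0.
    A neuron whose membrane potential is below inhibit_threshold is masked
    out and skipped entirely (neuron sparsity); for the remaining neurons,
    only inputs that actually spiked are processed (spike sparsity).
    Returns the updated potentials and the number of synaptic operations.
    """
    # Mask generation: True means the neuron is active (not inhibited).
    active = [mp >= inhibit_threshold for mp in potentials]
    updated = list(potentials)
    synaptic_ops = 0
    for i, is_active in enumerate(active):
        if not is_active:
            continue  # bypass every synapse of an inhibited neuron
        for j, spike in enumerate(spikes):
            if spike:
                updated[i] += weights[i][j]
                synaptic_ops += 1
    return updated, synaptic_ops


if __name__ == "__main__":
    w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    new_mp, ops = sparse_synaptic_update(w, [1, 0, 1], [0.0, -10.0], -5.0)
    print(new_mp, ops)
```

In this toy case a dense update would perform six synaptic operations; the mask and the spike check together reduce that to two, which is the kind of saving the two sparsity sources are meant to compound.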
An SAR-Assisted Noise-Shaping Pipeline ADC With Gain-Boosted Cascoded Floating Inverter Amplifier
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-10-03 DOI: 10.1109/TVLSI.2025.3612411
Xianghui Zhang;Guolong Fu;HaoYu Tian;Yanbo Zhang;Zhangming Zhu
This brief presents a successive approximation register (SAR)-assisted noise-shaping (NS) pipelined analog-to-digital converter (ADC) with the reusing feedback capacitor (RFC) technique. With the proposed RFC technique, the feedback capacitor can simultaneously accomplish the first-stage residue transfer, the second-stage quantization error extraction and feedback, and reference voltage matching, which reduces circuit complexity and enhances linearity. To mitigate the noise leakage resulting from gain error, a single-stage closed-loop gain-boosted cascoded floating inverter amplifier (GBCFIA) with 85-dB open-loop gain is proposed. The GBCFIA demonstrates a combination of robustness, high accuracy, and enhanced energy efficiency. Fabricated in a 65-nm CMOS process, the ADC prototype achieves a measured 78.5-dB signal-to-noise and distortion ratio (SNDR) at 250 MS/s with an oversampling ratio (OSR) of 8. With 2.96-mW power consumption, it achieves an SNDR-based Schreier figure of merit (FoM) of 175.7 dB.
Volume 34, Issue 1, Pages 312-316
Citations: 0
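The reported figure of merit can be cross-checked against the other numbers in the abstract above. The sketch below uses the commonly quoted SNDR-based Schreier definition, FoM = SNDR + 10·log10(BW / P), and assumes that the signal bandwidth is fs / (2·OSR); the abstract does not state the bandwidth explicitly, so that relation and the function name are assumptions.

```python
import math


def schreier_fom_db(sndr_db: float, fs_hz: float, osr: float, power_w: float) -> float:
    """SNDR-based Schreier FoM in dB, assuming bandwidth = fs / (2 * OSR)."""
    bandwidth_hz = fs_hz / (2.0 * osr)
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)


if __name__ == "__main__":
    # Values quoted in the abstract: 78.5 dB SNDR, 250 MS/s, OSR 8, 2.96 mW.
    fom = schreier_fom_db(78.5, 250e6, 8, 2.96e-3)
    print(f"{fom:.1f} dB")
```

Under these assumptions the bandwidth is 15.625 MHz, and the computed value agrees with the 175.7 dB stated in the abstract to within rounding.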
Corrections to “Enhancing Memory BIST With an Optimized RTL-BIST IP Core: A Low-Power, High-Fault-Coverage Approach”
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-09-26 DOI: 10.1109/TVLSI.2025.3600857
Ming-Yi Lin;Wei-Kuan Chiang;Chin-Hung Wang
In the above article [1], the citation for “March mSR” in Tables VI through IX was incorrect. The correct citation should have been [16].
Volume 33, Issue 10, Page 2902
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11181251
Citations: 0
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-09-26 DOI: 10.1109/TVLSI.2025.3609598
Volume 33, Issue 10, Page C3
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11181241
Citations: 0
Test Cycle Reduction for TSV Test Using Streaming Scan Network on 3-D IC
IF 3.1 Zone 2 (Engineering & Technology) Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2025-09-26 DOI: 10.1109/TVLSI.2025.3611704
Sunghoon Kim;Donghyun Han;Dayoung Kim;Sungho Kang
Through-silicon vias (TSVs) are essential for interdie connections in 3-D integrated circuits (3-D ICs), but their susceptibility to defects necessitates effective testing. As the correct operation of TSVs is critical to the overall reliability of 3-D ICs, their testing is regarded as an essential part of the 3-D IC testing process. Conventional TSV testing based on the IEEE-1838 standard architecture often increases test cycles due to the serial data application through die wrapper registers (DWRs). In this brief, an architecture is proposed that enables efficient TSV testing by using the streaming scan network (SSN) architecture, which is already utilized for logic core testing. TSV groups are treated as identical cores, allowing parallel application of test patterns via the SSN bus and streaming scan host (SSH). The experimental results show that the proposed method achieves a significant reduction in test cycles compared with the conventional TSV test method based on the IEEE-1838 standard architecture, without incurring a noticeable increase in area overhead. The reduction in test cycles becomes more pronounced as the number of TSVs increases.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 34, no. 1, pp. 307-311.
Citations: 0