
IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

Design and Analysis of a New Three-Stage Feedback Amplifier Utilizing Signal Flow Graph Domain Inspection Approach
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-25; DOI: 10.1109/TVLSI.2024.3426516
M. Ghashghai;M. B. Ghaznavi-Ghoushchi
In this article, an amplifier is designed by analyzing it in the graph domain and modifying its signal flow graph (SFG) according to graph rules at the system level. A three-stage amplifier is proposed that extends the dual-path structure and the buffering-based pole relocation amplifier through graph-domain inspection using the graph rules. By increasing the order of the denominator in the main fraction of the equivalent impedance of the active zero block, the proposed amplifier can effectively increase its driving ability while enhancing stability over a large range of capacitive loads. The second pole is located at a higher frequency to increase the phase margin (PM). The circuit implementation of the proposed amplifier is simulated in a standard 0.18-$\mu$m CMOS technology with a 0.004-mm$^2$ active area and 8.8-$\mu$W power consumption. Post-layout simulation results show a dc gain of 130 dB and a unity-gain frequency of 670 kHz, while the amplifier uses a 400-fF compensation capacitor. The amplifier obtains a PM of 60.4° at $C_{\text{L}} = 3.7$ nF. An average slew rate (SR) of 0.38 V/$\mu$s was measured when the proposed amplifier was in unity-gain configuration driving a 3.7-nF load capacitor. FoM$_{\text{S}}$ and FoM$_{\text{L}}$ of the proposed amplifier are improved by 116% and 107%, respectively.
Volume 32, Issue 10, pp. 1792-1800
Citations: 0
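The two figures of merit named in the amplifier abstract above can be reproduced from its headline numbers. The sketch below assumes the definitions commonly used for multistage amplifiers, FoM_S = GBW x C_L / Power and FoM_L = SR x C_L / Power; the listing does not state which definitions the authors use, and the power unit (microwatts) is also an assumption because the unit is garbled in the scraped text.

```python
# Back-of-envelope figures of merit for a multistage amplifier.
# ASSUMPTION: FoM_S = GBW * C_L / P and FoM_L = SR * C_L / P, which are the
# definitions commonly used in this literature; the abstract does not spell them out.

def fom_small_signal(gbw_mhz: float, load_pf: float, power_mw: float) -> float:
    """Small-signal figure of merit in MHz*pF/mW."""
    return gbw_mhz * load_pf / power_mw


def fom_large_signal(sr_v_per_us: float, load_pf: float, power_mw: float) -> float:
    """Large-signal figure of merit in (V/us)*pF/mW."""
    return sr_v_per_us * load_pf / power_mw


GBW_MHZ = 0.670        # 670-kHz unity-gain frequency (from the abstract)
LOAD_PF = 3700.0       # 3.7-nF load capacitor (from the abstract)
SR_V_PER_US = 0.38     # average slew rate (from the abstract)
POWER_MW = 8.8e-3      # ASSUMPTION: the 8.8 power figure is in microwatts

fom_s = fom_small_signal(GBW_MHZ, LOAD_PF, POWER_MW)
fom_l = fom_large_signal(SR_V_PER_US, LOAD_PF, POWER_MW)
print(f"FoM_S = {fom_s:.0f} MHz*pF/mW")
print(f"FoM_L = {fom_l:.0f} (V/us)*pF/mW")
```

Both metrics scale with the load capacitance and inversely with power, which is why a large drivable load at microwatt-level power raises them so strongly.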
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-25; DOI: 10.1109/TVLSI.2024.3418151
Volume 32, Issue 8, p. C3. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10609532
Citations: 0
Design of Octave Tuning Range $LC$ VCO With Ultralow $K_{\text{VCO}}$ Using Frequency-Dependent Implicit Capacitance Neutralization Technique
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-25; DOI: 10.1109/TVLSI.2024.3430544
Youming Zhang;Xusheng Tang;Tonglu Jiao;Peng Liu;Jingchen Liu
This article presents a technique of frequency-dependent implicit capacitance neutralization (FD-ICN) among the capacitor bank and varactors in an $LC$ voltage-controlled oscillator (VCO) to facilitate ultralow VCO gain ($K_{\text{VCO}}$) across an octave frequency tuning range (TR). Series interconnect inductors are used between capacitor units to feature the inverse capacitance-frequency ($C$–$f$) relationship with varactors, thus yielding the proposed FD-ICN. A split 8-bit capacitor bank with a centrosymmetric double-cross layout pattern is designed, enabling an enhanced FD-ICN through multiple capacitance equivalent iterations of the capacitor bank. The proposed FD-ICN technique is validated in a prototype of a dual-mode electric-coupling VCO fabricated in a 130-nm CMOS process, exhibiting a measured frequency TR of 72.1% from 6.49 to 13.81 GHz with a $K_{\text{VCO}}$ of 7–73 MHz/V. The VCO shows a competitive phase noise (PN) and figure-of-merit with TR (FoM$_{\text{T}}$) from −121.3 to −111.4 dBc/Hz and 197.4 to 201.7 dBc/Hz, respectively, at 1-MHz offset across the whole TR.
Volume 32, Issue 10, pp. 1908-1918
Citations: 0
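The 72.1% tuning range quoted in the VCO abstract above follows from the two band-edge frequencies if the tuning range is taken as the frequency span divided by the center frequency. That definition is the usual one but is an assumption here, as is the helper naming; the sketch also checks that the band covers an octave.

```python
# Fractional tuning range of an oscillator from its band edges.
# ASSUMPTION: TR = (f_max - f_min) / f_center with f_center = (f_max + f_min) / 2.

def tuning_range_percent(f_min_ghz: float, f_max_ghz: float) -> float:
    """Tuning range as a percentage of the center frequency."""
    return 200.0 * (f_max_ghz - f_min_ghz) / (f_max_ghz + f_min_ghz)


def spans_octave(f_min_ghz: float, f_max_ghz: float) -> bool:
    """An octave means the top frequency is at least twice the bottom one."""
    return f_max_ghz / f_min_ghz >= 2.0


F_MIN_GHZ = 6.49    # from the abstract
F_MAX_GHZ = 13.81   # from the abstract

tr = tuning_range_percent(F_MIN_GHZ, F_MAX_GHZ)
print(f"TR = {tr:.1f}%")
print(f"octave coverage: {spans_octave(F_MIN_GHZ, F_MAX_GHZ)}")
```

An octave corresponds to a tuning range of at least about 66.7% under this definition, so the reported 72.1% is consistent with the title's claim.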
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-25; DOI: 10.1109/TVLSI.2024.3415749
Volume 32, Issue 8, p. C2. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10609531
Citations: 0
Multidie 3-D Stacking of Memory Dominated Neuromorphic Architectures
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-25; DOI: 10.1109/TVLSI.2024.3421625
Leandro M. Giacomini Rocha;Refik Bilgic;Mohamed Naeim;Sudipta Das;Herman Oprins;Amirreza Yousefzadeh;Mario Konijnenburg;Dragomir Milojevic;James Myers;Julien Ryckaert;Dwaipayan Biswas
Event-driven neuromorphic processors for artificial intelligence (AI) inference on edge/IoT devices require large on-chip memory capacity for efficient execution of spiking neural networks (NNs). In this work, we evaluate 3-D stacking benefits on SENECA, a digital neuromorphic accelerator core, sweeping its on-chip memory capacity from 2 up to 32 Mb in both legacy planar and advanced nanosheet CMOS logic nodes. In a planar CMOS node (GF-22 nm), two-die memory-on-logic (MoL) partitioning enables $8\times$ more on-chip memory, and it boosts operating frequency by 7% with 26% less power than the 2-D. Moving to an advanced nanosheet technology (imec A10), multidie (up to 7 dies) MoL stacking enables a performance increase of up to 29% and power savings up to 31%. Furthermore, a core folding (CF) partitioning in A10 shows up to 16% performance improvement with 12% total power savings with respect to the 2-D implementation on the same technology. We also demonstrate no thermal overhead for multidie stacking at advanced nodes for designs exhibiting low power density. These physical design explorations lay the foundation for system technology co-optimization studies for edge devices.
Volume 32, Issue 11, pp. 2144-2148
Citations: 0
Low-Latency PAPR Reduction Architecture for Discrete Multitone Based on Approximate Midrange
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-23; DOI: 10.1109/tvlsi.2024.3430094
Byeong Yong Kong
Citations: 0
Model-Based Study on the Limit of the Dynamic Load Regulation Performance of a Digital Low Dropout Regulator
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-18; DOI: 10.1109/TVLSI.2024.3425771
Yichen Xu;Zhaoqing Wang;Jonghyun Oh;Mingoo Seok
A digital low dropout (DLDO) regulator is one of the most critical building blocks in on-chip power management for its technology portability, voltage scalability, and other benefits associated with digital-oriented design. A key metric of DLDOs is the dynamic load regulation performance, often measured as the maximum current that a DLDO can quickly supply upon a significant load step under a voltage droop constraint (usually 10% of the output voltage). Previous works focused on architecture and circuit techniques to improve this metric. However, limited research focuses on the model development for the dynamic load regulation performance. To fill this gap, in this article, we propose the analytical models of the maximum load current of the standard DLDOs employing feedback and feedforward control laws. The developed models shed light on the impact of various design parameters on the total load current of a DLDO, with which both circuit and system designers can navigate the design space quickly and effectively.
Volume 32, Issue 10, pp. 1822-1829
Citations: 0
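To give a feel for the metric described in the DLDO abstract above (the largest load step that keeps the droop within 10% of the output voltage), the following is a first-order estimate only. It is not the analytical model from the article: it assumes the output capacitor alone supplies the whole load step for a fixed controller latency, and all parameter values are hypothetical.

```python
# First-order droop estimate for a regulator with a response latency.
# ASSUMPTION (not the article's model): during the controller latency the
# output capacitor alone supplies the load step, so droop = dI * t / C.

def max_load_step(v_out: float, c_out: float, latency: float,
                  droop_fraction: float = 0.10) -> float:
    """Largest load-current step (A) that keeps the droop within the limit."""
    allowed_droop = droop_fraction * v_out   # volts
    return allowed_droop * c_out / latency   # amperes


# Hypothetical operating point, chosen only for illustration.
V_OUT = 1.0        # volts
C_OUT = 1e-9       # farads (1 nF of output capacitance)
LATENCY = 1e-9     # seconds (1 ns until the regulator responds)

i_max = max_load_step(V_OUT, C_OUT, LATENCY)
print(f"max load step ~ {i_max * 1e3:.0f} mA")
```

Even this crude estimate shows the trade-off a designer navigates: halving the response latency or doubling the output capacitance doubles the tolerable load step.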
Functionally Possible Path Delay Faults With High Functional Switching Activity
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-16; DOI: 10.1109/TVLSI.2024.3425817
Irith Pomeranz;Yervant Zorian
Chip aging that results in small delay defects is one of the possible causes for silent data corruption that has been observed in large datacenters. Chip aging is exacerbated by high software workloads when the chip is deployed in a system. Small delay defects are detected by tests for path delay faults. Path delay faults are typically selected to include the longest testable paths. In addition, functionally possible paths are selected to ensure the detection of small delay defects that can cause a chip to fail during functional operation. To address chip aging, it is suggested in this brief that the longest functionally possible paths through as many lines as possible with the highest susceptibilities to aging should be targeted. A path selection procedure at the gate level is described, that uses the switching activity under functional test sequences to identify functionally possible paths that are the most susceptible to aging. Experimental results for benchmark circuits show that the length of a path and the functional switching activities for lines along the path are independent, and each criterion alone leads to the selection of different paths. The results suggest that both criteria need to be used together for path selection.
Volume 32, Issue 11, pp. 2159-2163
Citations: 0
Novel Optimized Designs of Modulo $2^{n}+1$ Adder for Quantum Computing
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-15; DOI: 10.1109/TVLSI.2024.3418930
Bhaskar Gaur;Himanshu Thapliyal
Quantum modular adders are one of the most fundamental yet versatile quantum computation operations. They help implement functions of higher complexity, such as subtraction and multiplication, which are used in applications such as quantum cryptanalysis, quantum image processing, and securing communication. To the best of our knowledge, there is no existing design of a quantum modulo ($2^{n}+1$) adder (QMA). In this work, we propose four quantum adders targeted specifically for modulo ($2^{n}+1$) addition. These adders can provide both the regular and the modulo ($2^{n}+1$) sum concurrently, enhancing their application in residue number system-based arithmetic. Our first design, QMA1, is a novel quantum modulo ($2^{n}+1$) adder. The second proposed adder, QMA2, optimizes the utilization of quantum gates within QMA1, resulting in a 37.5% reduced CNOT gate count, a 46.15% reduced CNOT depth, and a 26.5% decrease in both Toffoli gates and depth. We propose a third adder, QMA3, that uses zero resets, a dynamic circuits-based feature that reuses qubits, leading to 25% savings in qubit count. Our fourth design, QMA4, demonstrates the benefit of incorporating additional zero resets to achieve a purer $|0\rangle$ state, reducing quantum state preparation errors. Notably, we conducted experiments using 5-qubit configurations of the proposed modulo ($2^{n}+1$) adders on IBM Washington, a 127-qubit quantum computer based on the Eagle R1 architecture, to demonstrate a 28.8% reduction in QMA1's error, which breaks down as follows: 1) an 18.63% error reduction due to gate/depth reduction in QMA2; 2) a 2.53% drop in error due to qubit reduction in QMA3; and 3) a 7.64% error decrease due to the application of additional zero resets in QMA4.
Volume 32, Issue 9, pp. 1759-1763
Citations: 0
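For readers unfamiliar with the arithmetic in the quantum adder abstract above, here is a classical reference model of what a modulo (2^n + 1) adder computes, returning the regular and the modular sum together as the abstract describes. It is a plain-integer sketch with hypothetical names and says nothing about the quantum circuits themselves. The last lines check that the three per-design error reductions add up to the quoted total.

```python
# Classical reference model of modulo (2^n + 1) addition.
# This only illustrates the arithmetic; it is not a quantum circuit and the
# function name is hypothetical.

def mod_2n_plus_1_add(a: int, b: int, n: int) -> tuple[int, int]:
    """Return (regular sum, sum modulo 2^n + 1) for operands in [0, 2^n]."""
    modulus = (1 << n) + 1
    if not (0 <= a < modulus and 0 <= b < modulus):
        raise ValueError("operands must be residues modulo 2^n + 1")
    regular = a + b
    return regular, regular % modulus


regular_sum, modular_sum = mod_2n_plus_1_add(12, 9, n=4)   # modulus is 17
print(regular_sum, modular_sum)

# The error-reduction breakdown quoted in the abstract (percent).
breakdown = [18.63, 2.53, 7.64]
total_reduction = sum(breakdown)
print(f"total error reduction = {total_reduction:.1f}%")
```

Note that residues modulo 2^n + 1 range over 0 to 2^n inclusive, so each operand needs n + 1 bits rather than n.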
High-Accuracy and Low-Multiplication Recursive Discrete Cosine Transform Algorithm Design and Its Realization in Mel-Scale Frequency Cepstral Coefficients
IF 2.8; Zone 2 (Engineering & Technology); Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2024-07-15; DOI: 10.1109/TVLSI.2024.3422994
Shin-Chi Lai;Szu-Ting Wang;Yi-Chang Zhu;Ying-Hsiu Hung;Jeng-Dao Lee;Wei-Da Chen
This brief introduces an innovative recursive discrete cosine transform (DCT) algorithm characterized by its exceptional precision and minimal multiplication requirements. Through the strategic implementation of data reordering and "$q$" value adjustment schemes, the proposed algorithm entails only a single constant-multiplication operation featuring a fixed cosine coefficient within the iterative phase. By judiciously selecting an appropriate "$q$" value ($q = 41$), it achieves outstanding results, reaching peak signal-to-noise ratios (PSNRs) of 94.9 and 100.9 dB under 18-bit and 20-bit word length (WL) conditions, respectively, in terms of decimal places. Notably, the proposed algorithm reduces the number of multiplications by 86.08%, offset by an increase of 2688 additions. The proposed design has a simpler structure and utilizes fewer hardware resources. In the field-programmable gate array (FPGA) implementation, 43 combinational adaptive look-up tables (ALUTs) are specifically allocated for constant multiplication (CM). Overall, the proposed accelerator takes a total of 158 combinational ALUTs, 65 registers, a 960-bit read-only memory (ROM), and a 1024-bit random access memory (RAM) in hardware realization and can be operated at a maximum frequency of 156.62 MHz. Therefore, it is particularly well-suited for VLSI implementation in a parallel calculation of Mel-scale frequency cepstral coefficients (MFCCs).
本简介介绍了一种创新的递归离散余弦变换(DCT)算法,其特点是精度极高,乘法要求极低。通过战略性地实施数据重排和 "q "值调整方案,所提出的算法在迭代阶段只需进行一次固定余弦系数的常数乘法运算。通过明智地选择合适的 "q "值(q =41),该算法取得了出色的成果,在 18 位和 20 位字长(WL)条件下,以小数位数计算,峰值信噪比(PSNR)分别达到 94.9 和 100.9 dB。值得注意的是,所提出的算法大大减少了 86.08% 的乘法运算次数,但增加了 2688 次加法运算。拟议的设计结构更简单,利用的硬件资源更少。在现场可编程门阵列(FPGA)实现中,该器件由 43 个组合自适应查找表(ALUT)组成,专门分配给常数乘法(CM)。总体而言,所提出的加速器在硬件实现中需要 158 个组合自适应查找表 (ALUT)、65 个寄存器、960 位只读存储器 (ROM) 和 1024 位随机存取存储器 (RAM),最高运行频率可达 156.62 MHz。因此,它特别适合用于并行计算梅尔尺度频率倒频谱系数(MFCC)的 VLSI 实现。
High-Accuracy and Low-Multiplication Recursive Discrete Cosine Transform Algorithm Design and Its Realization in Mel-Scale Frequency Cepstral Coefficients. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 32, no. 11, pp. 2139-2143. IF 2.8. Pub Date: 2024-07-15. DOI: 10.1109/TVLSI.2024.3422994
Citations: 0