Integration-The Vlsi Journal: Latest Articles

Novel hybrid TFET-FinFET 12T SRAM cells with enhanced write margin and read performance
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-10 | DOI: 10.1016/j.vlsi.2024.102294
Seyed Arman Sabaghpour, Behzad Ebrahimi, Pooya Torkzadeh
This work presents two innovative 12T cells combining tunnel field-effect transistor (TFET) and fin field-effect transistor (FinFET) technologies. These cells address reverse bias current issues by incorporating separate paths for reading data and write enhancement cut transistors, enhancing hold/read/write static noise margin (H/R/WSNM), reducing read time, and minimizing power consumption from TFET leakage. At 0.6 V, the first (second) SRAM cell shows a WSNM improvement over O_7T, 8T, CA_10T, 12T, and HF_10T cells by 152 % (93 %), 152 % (93 %), 157.7 % (97.5 %), 95 % (50 %), and 104 % (57 %), respectively. The leakage power of the first (second) 12T TFET SRAM cell is two (four) orders of magnitude lower than O_7T and 8T SRAM cells. These hybrid SRAM cells also exhibit faster read operations across VDD voltage levels (0.3 V–1 V) and the first 12T cell demonstrates shorter write access times than 12T and CA_10T SRAM cells. These characteristics make the proposed cells particularly suitable for energy-efficient IoT devices and medical applications, where balancing power, area, performance, and data integrity is critical.
Integration-The Vlsi Journal, vol. 100, Article 102294.
Citations: 0
Digital background calibration algorithm for pipelined ADC based on time-delay neural network with genetic algorithm feature selection
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-10 | DOI: 10.1016/j.vlsi.2024.102295
Yongsheng Yin, Long Li, Jiashen Li, Yukun Song, Honghui Deng, Hongmei Chen, Luotian Wu, Muqi Li, Xu Meng
This paper presents a novel background calibration method for pipelined analog-to-digital converters (ADCs) using a time-delay neural network (TDNN), which is optimized through genetic algorithm (GA) techniques. The proposed technique leverages TDNN to create enhanced feature sets, significantly improving the calibration of nonlinear errors exhibiting memory effects. It harnesses the GA's global optimization capabilities for feature selection, effectively reducing the feature dimension and consequently alleviating the NN's computational burden. A parallel pipeline architecture is devised for the calibration circuit, with its implementation realized on FPGA to facilitate forward inference processing. The inference circuit is synthesized using TSMC's 90 nm CMOS process, achieving a power consumption of 40.11 mW and an area of 0.45 mm². Simulations based on MATLAB for a 14-bit pipelined ADC demonstrate that the proposed calibration method significantly improves the SFDR from 59.77 dB to 165.52 dB, and ENOB from 8.79 bits to 19.23 bits, surpassing the target ADC's specifications. Moreover, the dimensionality of features is effectively reduced by up to 34 % without compromising the calibration performance.
Integration-The Vlsi Journal, vol. 100, Article 102295.
Citations: 0
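The feature construction and selection described in the pipelined ADC entry above can be illustrated with a minimal sketch. It assumes that the time-delay features are simply the current and previous raw ADC samples and that a GA chromosome is a binary mask over feature columns; the abstract specifies neither, and the names `delay_features` and `apply_mask` are hypothetical.

```python
def delay_features(samples, taps):
    """Build a tapped-delay-line feature matrix.

    Row n holds [x[n], x[n-1], ..., x[n-taps+1]]; samples before the
    start of the record are zero-padded (an assumption of this sketch).
    """
    rows = []
    for n in range(len(samples)):
        rows.append([samples[n - k] if n - k >= 0 else 0 for k in range(taps)])
    return rows


def apply_mask(rows, mask):
    """Keep only the feature columns whose mask bit is 1.

    In a GA-based selection the mask would be one chromosome; here it is
    just a hand-written list.
    """
    keep = [i for i, bit in enumerate(mask) if bit]
    return [[row[i] for i in keep] for row in rows]


features = delay_features([1, 2, 3, 4], taps=3)
selected = apply_mask(features, [1, 0, 1])  # drop the one-sample-delay column
```

Dropping columns this way shrinks the input layer of the network, which is the mechanism by which feature selection would reduce the inference workload.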
A novel architecture of high performance fully differential two stage RFC OTA designed using DFVF and hybrid cascode compensation techniques
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-10 | DOI: 10.1016/j.vlsi.2024.102296
Annu Dabas, Shweta Kumari, Maneesha Gupta, Richa Yadav
In this work, a novel fully differential two-stage class-AB Recycling Folded Cascode Operational Transconductance Amplifier (RFC OTA) using a Differential Flipped Voltage Follower (DFVF) is proposed. The DFVF and Dynamic Threshold Metal Oxide Semiconductor (DTMOS) transistors are used as the differential input stage of the proposed RFC OTA. These techniques enhance the gain and bandwidth of the proposed OTA. To further improve the performance of the proposed circuit, positive feedback at the current-mirror load along with hybrid cascode compensation has been implemented. A common-source (CS) amplifier is used between the gate and source terminals of the differential input stage, which further boosts the transconductance. The proposed RFC OTA is designed and simulated in 180 nm CMOS technology with a load capacitance of 10 pF. It provides a dc gain of 112.61 dB and a gain-bandwidth product (GBW) of 25.88 MHz along with an 88.14° phase margin. The proposed circuit dissipates 124.66 μW of power at a ±0.5 V supply voltage. Monte Carlo analysis against device mismatch has also been performed to demonstrate the robustness of the proposed circuit.
Integration-The Vlsi Journal, vol. 100, Article 102296.
Citations: 0
Multi-bit error detection and correction technique using HVDK (Horizontal-Vertical-Diagonal-Knight) parity
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-10 | DOI: 10.1016/j.vlsi.2024.102297
Abdul Aziz, Md. Asaf-uddowla Golap, Md. Rahat Ebne Alamgir Porosh, Md. Tasnimul Khair Tousif, Muhammad Sheikh Sadi
Errors are quite likely to occur in a data stream, and their consequences can be severe. Data safety is therefore very important in digital systems, especially in critical and real-time systems, microprocessors, embedded systems, computer memory, and data communication. The probability of soft errors grows with the exponential increase in transistors per chip, operational voltage, particle strikes, shrinking bit-cell area, etc. To ensure data integrity, safety, and system reliability, error detection and correction are fundamental components of data transmission and storage systems. Existing error correction techniques can correct several erroneous bits. However, these methods are not fully efficient, as some consume considerable time, space, and bit overhead. An ideal approach would minimize all of these parameters. This paper proposes a novel error correction approach with horizontal, vertical, diagonal, and knight (HVDK) parity bits. The approach corrects 5-bit errors in a 64-bit data word using a parity-based technique with less bit overhead. Our research advances the knowledge of error correction methods and sheds light on how to choose and use parity bit schemes that are appropriate for different applications.
Integration-The Vlsi Journal, vol. 100, Article 102297.
Citations: 0
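To make the four parity directions in the HVDK entry above concrete, here is a rough sketch of directional parity over a 64-bit word. Everything beyond "horizontal, vertical, diagonal, and knight parity bits" is an assumption of this sketch: the 8x8 layout, the wrap-around diagonal grouping, and the knight grouping (one row down, two columns right) are illustrative guesses, not the paper's definitions.

```python
N = 8  # assumed layout: a 64-bit word arranged as an 8x8 bit matrix

# Each rule maps a cell (row, col) to the index of the parity bit covering it.
# H and V are ordinary row and column parity. D and K are ASSUMED groupings:
# wrap-around diagonals and knight-move lines (one row down, two columns right).
RULES = {
    "H": lambda r, c: r,
    "V": lambda r, c: c,
    "D": lambda r, c: (c - r) % N,
    "K": lambda r, c: (c - 2 * r) % N,
}


def parities(bits):
    """Compute even parity for every group in every direction."""
    out = {name: [0] * N for name in RULES}
    for r in range(N):
        for c in range(N):
            for name, rule in RULES.items():
                out[name][rule(r, c)] ^= bits[r][c]
    return out


def syndrome(stored, received_bits):
    """Per direction, list the parity indices that no longer match."""
    now = parities(received_bits)
    return {name: [i for i in range(N) if stored[name][i] != now[name][i]]
            for name in RULES}


def locate_single_error(syn):
    """For one flipped bit, the failing H and V parities give (row, col)."""
    if len(syn["H"]) == 1 and len(syn["V"]) == 1:
        return syn["H"][0], syn["V"][0]
    return None
```

This sketch only locates a single flipped bit. The procedure that lets the paper correct 5-bit errors is not described in the abstract and is not reproduced here; the extra D and K syndromes merely hint at how additional directions can disambiguate multi-bit patterns.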
A low-power common-mode insensitive rail-to-rail dynamic comparator for ADCs
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-05 | DOI: 10.1016/j.vlsi.2024.102288
Nidhi Sharma, Rajesh Kumar Srivastava, Deep Sehgal, Devarshi Mrinal Das
This paper presents a low-power, high-speed dynamic comparator with a rail-to-rail input common-mode (Vi,cm) range. The proposed comparator maintains high-speed performance throughout the 0 to Vdd Vi,cm range, making it insensitive to the input common mode. This work merges NMOS and PMOS dynamic pre-amplifiers with a modified latch to achieve rail-to-rail Vi,cm operation. A novel activation clock logic is also proposed that activates only one pre-amplifier based on the Vi,cm value, ensuring low power consumption and reducing the energy per conversion by 17% compared to the comparator without activation clock logic. The proposed comparator is designed in 65-nm CMOS technology with a 1.2 V supply voltage and operates at 1 GHz. Analytical models of the delay and offset are presented and verified against rigorous post-layout simulation results. To validate the robustness of the proposed comparator, PVT corner analysis with Monte Carlo simulation is also performed for different Vi,cm values.
Integration-The Vlsi Journal, vol. 100, Article 102288.
Citations: 0
Decomposition based estimation of distribution algorithm for high-level synthesis design space exploration
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-05 | DOI: 10.1016/j.vlsi.2024.102292
Yuan Yao, Huiliang Hong, Shanshan Wang, Chenglong Xiao
High-Level Synthesis (HLS) has evolved significantly due to the increasing complexity of integrated circuit design and the demand for efficient design methodologies. HLS, which raises the abstraction level of design specification, allows designers to focus on hardware functionality, thus enhancing productivity and reducing verification efforts. However, a key challenge in HLS is efficiently exploring the vast design space to find the Pareto-optimal designs. In this paper, we introduce a novel approach for multi-objective design space exploration in HLS. Our methodology decomposes the design space exploration problem into simpler sub-problems using the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) framework and utilizes the Estimation of Distribution Algorithm (EDA) to build a probabilistic model for generating candidate solutions, thereby reducing the required number of expensive synthesis runs. Experimental results show that the proposed method has a faster convergence speed and reduces the number of syntheses by 24.34% to 32.01%, which significantly outperforms state-of-the-art works. Our approach achieves superior Pareto fronts with the lowest average ADRS value, outperforming Lattice-expl, ϵ-Constraint GA, and NSGA-II by 85.64%, 39.90%, and 33.31%, respectively.
Integration-The Vlsi Journal, vol. 100, Article 102292.
Citations: 0
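The ADRS figure quoted in the HLS entry above compares an explored Pareto front against a reference front. The sketch below assumes the commonly used definition of Average Distance from Reference Set for minimization objectives; the abstract does not state the formula, so treat the exact form, the function name `adrs`, and the sample points as assumptions.

```python
def adrs(reference, approx):
    """Average Distance from Reference Set, for minimization objectives.

    reference: exact or best-known Pareto points, e.g. (area, latency) tuples
    approx:    Pareto points found by the explorer under evaluation
    Lower is better; 0.0 means every reference point is matched or dominated.
    """
    def gap(ref_pt, app_pt):
        # Worst relative shortfall of app_pt against ref_pt over all objectives.
        return max(max(0.0, (a - r) / r) for r, a in zip(ref_pt, app_pt))

    # For each reference point take its closest approximating point, then average.
    return sum(min(gap(g, w) for w in approx) for g in reference) / len(reference)


reference_front = [(1.0, 10.0), (2.0, 5.0)]   # made-up (area, latency) points
explored_front = [(1.1, 10.0), (2.0, 5.0)]    # one point is 10 % worse in area
score = adrs(reference_front, explored_front)
```

Because each term is a relative gap, an ADRS of 0.05 here reads as "on average 5 % away from the reference front", which is why the metric is convenient for comparing explorers across benchmarks with different scales.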
Learning placement order for constructive floorplanning
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-02 | DOI: 10.1016/j.vlsi.2024.102293
Weiqiang Yao, Yibo Lin, Lin Li
Floorplanning is an early and essential task of physical design. Recently, there has been a surge in the application of learning-based methods to tackle the floorplanning problem. A prevalent approach involves training a reinforcement learning (RL) agent to sequentially place blocks on a chip canvas. However, existing methods mainly focus on learning block placement, relying on heuristic rules to determine the placement order. In contrast to previous approaches, we propose an RL-based method to determine the placement order. Based on block features and states, an agent is trained to select the block for placement. Once a block is selected, we enumerate all potential relative positions captured by sequence pairs and select the optimal placement. After establishing the layout topology, we further optimize wirelength through linear programming. Experimental results demonstrate the effectiveness of our proposed method. On the original-outline MCNC benchmarks, our method achieves a notable 25.2% average improvement in wirelength compared to a recent learning-based method. Additionally, when applied to rescaled-outline benchmarks from MCNC and GSRC, our method outperforms state-of-the-art results, with an average wirelength reduction of 12.5%.
Integration-The Vlsi Journal, vol. 100, Article 102293.
Citations: 0
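The floorplanning entry above relies on sequence pairs to encode relative block positions. As a minimal sketch of the standard sequence-pair semantics (not the paper's RL agent or its linear program), the code below turns a pair of block orderings into lower-left coordinates. The function name `pack` and the block sizes are hypothetical.

```python
def pack(gamma_pos, gamma_neg, size):
    """Place blocks from a sequence pair using the standard rules:

    a before b in both sequences          -> a is left of b
    a after b in gamma_pos, before b in
    gamma_neg                             -> a is below b

    size maps block name -> (width, height). Returns lower-left x and y maps.
    """
    rank_pos = {b: i for i, b in enumerate(gamma_pos)}
    x, y = {}, {}
    # Walking gamma_neg in order guarantees that every block that must sit
    # left of or below b has already been placed when b is reached.
    for b in gamma_neg:
        left = [x[a] + size[a][0] for a in x if rank_pos[a] < rank_pos[b]]
        below = [y[a] + size[a][1] for a in y if rank_pos[a] > rank_pos[b]]
        x[b] = max(left, default=0)
        y[b] = max(below, default=0)
    return x, y


sizes = {"a": (2, 1), "b": (1, 1)}
side_by_side = pack(["a", "b"], ["a", "b"], sizes)   # a left of b
stacked = pack(["b", "a"], ["a", "b"], sizes)        # a below b
```

Enumerating candidate insertion positions for a newly selected block then amounts to trying each slot in both sequences and evaluating the resulting packing; this quadratic-time evaluation is the simplest variant and faster longest-common-subsequence formulations exist.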
Multi-frequency weak signal detection based on Liu-like chaotic synchronization system and its hardware circuit implementation
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-09-27 | DOI: 10.1016/j.vlsi.2024.102290
Shaohui Yan, Zihao Guo, Jincai Song
Considering the shortcomings of traditional chaotic systems in weak signal detection, such as the high threshold sensitivity requirement and the narrow detection frequency range, this study proposes a novel three-dimensional chaotic synchronization system. The dynamics of the system are exhaustively characterized using equilibrium points, phase diagrams, Lyapunov exponent spectra, and bifurcation diagrams. The method detects weak signals by means of chaotic synchronization control: the chaotic system is synchronized using a backstepping method, and weak signals are detected by analyzing the synchronization error after a weak signal is introduced against a strong noise background. The chaotic system is implemented with hardware circuits, and chaotic synchronization control and weak signal detection are simulated at the circuit level using circuit simulation software. Additionally, the frequency range within which the system is capable of weak signal detection is tested through extensive simulation experiments. Finally, multi-frequency signal detection experiments are performed. The experimental results demonstrate that the system can accurately detect the frequency of weak signals, addresses the limitations of narrow-band detection, and makes multi-frequency signal detection possible. Meanwhile, the circuit structure proposed in this paper is simple and has some value for engineering applications.
考虑到传统混沌系统在弱信号检测方法中存在阈值灵敏度要求高、检测频域窄等缺点。本研究提出了一种新型三维混沌同步系统,并利用平衡点、相位图、Lyapunov 指数谱和分岔图详尽描述了该系统的动力学特性。这种方法涉及通过混沌同步控制进行弱信号检测。通过分析在强噪声背景下引入微弱信号后的同步误差,使用反步进同步方法对混沌系统进行同步,从而检测微弱信号。混沌系统由硬件电路实现,并通过电路仿真软件从电路的角度对混沌同步控制和弱信号检测进行仿真。此外,还通过大量仿真实验测试了系统能够检测微弱信号的频率范围。最后,还进行了多频率信号检测实验。实验结果表明,该系统能准确检测微弱信号的频率,解决了窄带检测的局限性,并能实现多频信号检测。同时,本文提出的电路结构简单,具有一定的工程应用价值。
Integration-The Vlsi Journal, vol. 100, Article 102290. Published 2024-09-27. DOI: 10.1016/j.vlsi.2024.102290
Citations: 0
Hardware acceleration of Tiny YOLO deep neural networks for sign language recognition: A comprehensive performance analysis
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-09-26 | DOI: 10.1016/j.vlsi.2024.102287
Mohita Jaiswal, Abhishek Sharma, Sandeep Saini
In this paper, we benchmark two automation frameworks, Vitis AI and FINN, for sign language recognition on a Field Programmable Gate Array (FPGA). We explore both frameworks in depth using Tiny YOLOv2 networks, varying design parameters such as precision and parallelism ratio, and then make a fair baseline comparison based on accuracy, speed, and hardware resources. Experimental findings show that Vitis AI outperforms the FINN framework and conventional GPU and CPU platforms, with latency improvements of 1.08x, 1.7x, and 2.9x, respectively. With Vitis AI, our system achieved a detection speed of 32.7 frames per second (FPS) on the Kria KV260 FPGA at a power consumption of 5.6 W, with a mean Average Precision (mAP) of 61.2% on the Hindi Indian Sign Language (ISL) dataset.
Integration-The Vlsi Journal, vol. 100, Article 102287. Published 2024-09-26.
Citations: 0
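The throughput and power figures quoted in the Tiny YOLO abstract above can be turned into per-frame numbers with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that frames are processed one after another, so that average frame time is the reciprocal of FPS. On a pipelined accelerator the end-to-end latency of a single frame can be longer than this average.

```python
def frame_time_ms(fps: float) -> float:
    """Average time per processed frame, in milliseconds."""
    return 1000.0 / fps


def energy_per_frame_j(power_w: float, fps: float) -> float:
    """Average energy per processed frame, in joules (1 W = 1 J/s)."""
    return power_w / fps


FPS = 32.7      # detection speed reported in the abstract
POWER_W = 5.6   # power consumption reported in the abstract

print(f"{frame_time_ms(FPS):.1f} ms per frame")
print(f"{energy_per_frame_j(POWER_W, FPS):.3f} J per frame")
```

Under these assumptions the reported figures correspond to roughly 30.6 ms and 0.171 J per frame.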
A hybrid memory polynomial digital predistortion model for RF transmitters
IF 2.2 | Zone 3, Engineering & Technology | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-09-24 | DOI: 10.1016/j.vlsi.2024.102285
Jijun Ren, Ziyang Xu, Xing Wang
The power amplifier (PA), a key component of the transmitter system, operates near its saturation region, which introduces nonlinear distortion into the output signal and degrades the quality of the transmitter system. A range of linearization techniques is used to compensate for this distortion; one of the most effective and widely applied is digital predistortion (DPD). Traditional DPD models consist of either a single model or multiple models connected in cascade or in parallel. In this letter, a hybrid memory polynomial (HMP) model is proposed to further improve model accuracy. It is composed of multiple memory polynomial (MP) models combined through both cascading and parallelizing. The experimental results show that the HMP model is more accurate than the traditional MP model at the same complexity.
Integration-The Vlsi Journal, vol. 100, Article 102285. Published 2024-09-24.
Citations: 0
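For readers unfamiliar with the building block named in the DPD abstract above, the sketch below evaluates a standard memory polynomial, y(n) = Σ_k Σ_m a[k][m] · x(n−m) · |x(n−m)|^k, and shows how such blocks can be composed in cascade or in parallel. The abstract does not give the actual HMP topology or coefficients, so the helper names (`mp_block`, `cascade`, `parallel`), the example coefficients, and the particular composition are assumptions for illustration only, not the authors' model.

```python
from typing import Callable, List, Sequence

Signal = List[complex]
Block = Callable[[Sequence[complex]], Signal]


def memory_polynomial(x: Sequence[complex],
                      coeffs: Sequence[Sequence[complex]]) -> Signal:
    """Evaluate y[n] = sum_k sum_m coeffs[k][m] * x[n-m] * |x[n-m]|**k.

    k is the zero-based nonlinearity index and m the memory tap;
    samples before the start of x are treated as zero.
    """
    y: Signal = []
    for n in range(len(x)):
        acc = 0j
        for k, taps in enumerate(coeffs):
            for m, a in enumerate(taps):
                if n - m >= 0:
                    s = x[n - m]
                    acc += a * s * abs(s) ** k
        y.append(acc)
    return y


def mp_block(coeffs: Sequence[Sequence[complex]]) -> Block:
    """Wrap a fixed coefficient set as a reusable block (hypothetical helper)."""
    return lambda x: memory_polynomial(x, coeffs)


def cascade(first: Block, second: Block) -> Block:
    """Feed the output of `first` into `second`."""
    return lambda x: second(first(x))


def parallel(a: Block, b: Block) -> Block:
    """Drive both blocks with the same input and sum their outputs."""
    return lambda x: [p + q for p, q in zip(a(x), b(x))]


# Illustrative coefficients only; real values come from fitting PA measurements.
linear_with_memory = mp_block([[1.0, 0.5]])      # two linear taps
memoryless_cubic = mp_block([[1.0], [0.0], [-0.1]])  # x - 0.1 * x * |x|**2

# One possible hybrid arrangement: a cascade branch in parallel with a linear branch.
hybrid = parallel(cascade(linear_with_memory, memoryless_cubic), linear_with_memory)
print(hybrid([1.0, 2.0, 3.0]))
```

In practice the coefficients of each block would be identified from measured PA input and output data, typically by least squares, since the model is linear in its coefficients.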