Pub Date : 2024-08-30DOI: 10.1016/j.vlsi.2024.102268
This paper describes a low-power, low-noise capacitively-coupled instrumentation amplifier (CCIA) designed for capturing biopotential signals. The main advantage of proposed design are as (i) CCIA based on new IA has been proposed, (ii) the lower cutoff frequency has been improved by adding MOS based resistor, (iii) gm enhancement circuit is added in operational transconductance amplifier (OTA) based fully differential difference amplifier (FDDA)to improve gain and bandwidth. The DC electrode-offset voltage is reduced and the input impedance is increased by using feedback mechanism. Cadence EDA tool is used to analyze the findings of the proposed CCIA's in 0.18 μm, CMOS technology with a 1.8 V power supply. The proposed CCIA architecture has an adjustable mid-band gain from 52.55 to 61.11 dB for bias voltage ranges from 0.1 to 0.6 V, frequency range of 0.06 Hz–1.72 kHz, and a CMRR of 122 dB. The proposed CCIA has a total power dissipation of 190.47 nW and equivalent input referred noise (IRN) of 14.4 nV/sqrtHz at 0.01 Hz. It only occupies 0.01 mm2 of core area. To assess the robustness of suggested design, PVT analysis, post layout simulation and a comparison with previously published works demonstrates the competence of the design.
{"title":"A novel tunable capacitively-copuled instrumentation amplifier with 14.4 nV/ √(H z) noise and 190.47 nW micro-power for ECG applications","authors":"","doi":"10.1016/j.vlsi.2024.102268","DOIUrl":"10.1016/j.vlsi.2024.102268","url":null,"abstract":"<div><p>This paper describes a low-power, low-noise capacitively-coupled instrumentation amplifier (CCIA) designed for capturing biopotential signals. The main advantage of proposed design are as (i) CCIA based on new IA has been proposed, (ii) the lower cutoff frequency has been improved by adding MOS based resistor, (iii) g<sub>m</sub> enhancement circuit is added in operational transconductance amplifier (OTA) based fully differential difference amplifier (FDDA)to improve gain and bandwidth. The DC electrode-offset voltage is reduced and the input impedance is increased by using feedback mechanism. Cadence EDA tool is used to analyze the findings of the proposed CCIA's in 0.18 μm, CMOS technology with a 1.8 V power supply. The proposed CCIA architecture has an adjustable mid-band gain from 52.55 to 61.11 dB for bias voltage ranges from 0.1 to 0.6 V, frequency range of 0.06 Hz–1.72 kHz, and a CMRR of 122 dB. The proposed CCIA has a total power dissipation of 190.47 nW and equivalent input referred noise (IRN) of 14.4 nV/sqrtHz at 0.01 Hz. It only occupies 0.01 mm<sup>2</sup> of core area. To assess the robustness of suggested design, PVT analysis, post layout simulation and a comparison with previously published works demonstrates the competence of the design.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142128398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-24DOI: 10.1016/j.vlsi.2024.102259
This article deals with the analysis of irregularly shaped single turn octagonal spiral inductors for millimeter-wave and sub-THz CMOS IC designs. Simulations and experimental results, along with theoretical formulations, are used to characterize these irregular structures. This article proposes a novel approach for efficient use of silicon chip area by reshaping the on-chip inductors used in millimeter wave (mm-wave) applications without compromising the performance of the inductors. Especially in CMOS RFICs when a space constraint exists in either - or -direction in their layout, such reshaping can be attempted. Moreover, two novel methods of reshaping the inductors are proposed and studied thoroughly. The study of these irregular shapes has interesting conclusions, which are validated through on-wafer measurements. Certain methods of reshaping result in inductors which do not have degradation in their quality factors (), while other approaches degrade the . Based on these insights, a design methodology is proposed for designers who need to reshape their inductors to irregular structures while not compromising on the quality factor. The measurement results agree with the simulations and prove that the proposed reshaping is practically possible.
本文分析了用于毫米波和超高频 CMOS 集成电路设计的不规则形状单匝八角螺旋电感器。仿真和实验结果以及理论公式被用来描述这些不规则结构的特征。本文提出了一种新方法,通过重塑毫米波(mm-wave)应用中使用的片上电感器,在不影响电感器性能的前提下有效利用硅芯片面积。特别是在 CMOS 射频集成电路中,当其布局的 X 或 Y 方向存在空间限制时,可以尝试这种重塑方法。此外,我们还提出并深入研究了两种重塑电感器的新方法。对这些不规则形状的研究得出了有趣的结论,并通过晶圆上的测量进行了验证。某些重塑方法会导致电感器的品质因数(Q 值)不降低,而其他方法则会降低 Q 值。基于这些见解,我们为需要将电感器重塑为不规则结构同时又不影响品质因数的设计人员提出了一种设计方法。测量结果与模拟结果一致,证明所提出的重塑方法切实可行。
{"title":"Experimental analysis of irregularly shaped octagonal on-chip inductors for improving area-efficiency in CMOS RFICs for millimeter wave applications","authors":"","doi":"10.1016/j.vlsi.2024.102259","DOIUrl":"10.1016/j.vlsi.2024.102259","url":null,"abstract":"<div><p>This article deals with the analysis of irregularly shaped single turn octagonal spiral inductors for millimeter-wave and sub-THz CMOS IC designs. Simulations and experimental results, along with theoretical formulations, are used to characterize these irregular structures. This article proposes a novel approach for efficient use of silicon chip area by reshaping the on-chip inductors used in millimeter wave (mm-wave) applications without compromising the performance of the inductors. Especially in CMOS RFICs when a space constraint exists in either <span><math><mi>X</mi></math></span>- or <span><math><mi>Y</mi></math></span>-direction in their layout, such reshaping can be attempted. Moreover, two novel methods of reshaping the inductors are proposed and studied thoroughly. The study of these irregular shapes has interesting conclusions, which are validated through on-wafer measurements. Certain methods of reshaping result in inductors which do not have degradation in their quality factors (<span><math><mi>Q</mi></math></span>), while other approaches degrade the <span><math><mi>Q</mi></math></span>. Based on these insights, a design methodology is proposed for designers who need to reshape their inductors to irregular structures while not compromising on the quality factor. The measurement results agree with the simulations and prove that the proposed reshaping is practically possible.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142095436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-22DOI: 10.1016/j.vlsi.2024.102265
Minimizing the testing cost is crucial in the context of the design for test (DFT) flow. In our observation, the test patterns generated by ATPG tools in test compression mode still contain redundancy. To tackle this obstacle, we propose a post-flow static test compaction method that utilizes a partial fault dictionary instead of a full fault dictionary to sharply reduce time and memory overhead, and leverages a dedicated Pure MaxSAT solver to re-compact the test patterns generated by ATPG tools. We also observe that ATPG tools offer a more comprehensive selection of candidate patterns for compaction in the “n-detect” mode, leading to superior compaction efficiency. In our experiments conducted on benchmark circuits ISCAS89, ITC99, and an open-source RISC-V CPU, we employed two methodologies. For commercial tool, we utilized a non-intrusive approach, while we adopted an intrusive method for open-source ATPG. Under the non-intrusive approach, our method achieved a maximum reduction of 34.69% in pattern count and a maximum 29.80% decrease in test cycles as evaluated by a leading commercial tool. Meanwhile, under the intrusive approach, our method attained a maximum 71.90% reduction in pattern count as evaluated by an open-source ATPG tool. Notably, fault coverage remained unchanged throughout the experiments. Furthermore, our approach demonstrates improved performance compared with existing methods.
{"title":"A fast test compaction method using dedicated Pure MaxSAT solver embedded in DFT flow","authors":"","doi":"10.1016/j.vlsi.2024.102265","DOIUrl":"10.1016/j.vlsi.2024.102265","url":null,"abstract":"<div><p>Minimizing the testing cost is crucial in the context of the design for test (DFT) flow. In our observation, the test patterns generated by ATPG tools in test compression mode still contain redundancy. To tackle this obstacle, we propose a post-flow static test compaction method that utilizes a partial fault dictionary instead of a full fault dictionary to sharply reduce time and memory overhead, and leverages a dedicated Pure MaxSAT solver to re-compact the test patterns generated by ATPG tools. We also observe that ATPG tools offer a more comprehensive selection of candidate patterns for compaction in the “n-detect” mode, leading to superior compaction efficiency. In our experiments conducted on benchmark circuits ISCAS89, ITC99, and an open-source RISC-V CPU, we employed two methodologies. For commercial tool, we utilized a non-intrusive approach, while we adopted an intrusive method for open-source ATPG. Under the non-intrusive approach, our method achieved a maximum reduction of 34.69% in pattern count and a maximum 29.80% decrease in test cycles as evaluated by a leading commercial tool. Meanwhile, under the intrusive approach, our method attained a maximum 71.90% reduction in pattern count as evaluated by an open-source ATPG tool. Notably, fault coverage remained unchanged throughout the experiments. Furthermore, our approach demonstrates improved performance compared with existing methods.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142095435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-19DOI: 10.1016/j.vlsi.2024.102261
In response to the evolving technological landscape, the traditional clock network architecture faces challenges in meeting the complexities of modern System-on-Chip (SoC) designs. While the clock mesh topology offers resilience against On-Chip Variation (OCV) fluctuations, its manual implementation leaves room for advancements in methodology and swift analytical techniques. This paper introduces an innovative clock mesh synthesis approach, leveraging dynamic programming algorithms and emphasizing compliance with critical physical implementation parameters. Our experimental results demonstrate a significant 26.6% reduction in power consumption compared to baseline methodologies. Moreover, it achieves an impressive average runtime reduction of 78.0% when contrasted with traditional simulation methods. These findings underscore the potential of our methodology to enhance the efficiency and power management of clock mesh designs.
{"title":"Clock mesh synthesis through dynamic programming with physical parameters consideration","authors":"","doi":"10.1016/j.vlsi.2024.102261","DOIUrl":"10.1016/j.vlsi.2024.102261","url":null,"abstract":"<div><p>In response to the evolving technological landscape, the traditional clock network architecture faces challenges in meeting the complexities of modern System-on-Chip (SoC) designs. While the clock mesh topology offers resilience against On-Chip Variation (OCV) fluctuations, its manual implementation leaves room for advancements in methodology and swift analytical techniques. This paper introduces an innovative clock mesh synthesis approach, leveraging dynamic programming algorithms and emphasizing compliance with critical physical implementation parameters. Our experimental results demonstrate a significant 26.6% reduction in power consumption compared to baseline methodologies. Moreover, it achieves an impressive average runtime reduction of 78.0% when contrasted with traditional simulation methods. These findings underscore the potential of our methodology to enhance the efficiency and power management of clock mesh designs.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142095548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-19DOI: 10.1016/j.vlsi.2024.102262
In recent years, the application of deep learning (DL) models has sparked considerable interest in timing prediction within the place-and-route (P&R) flow of IC chip design. Specifically, at the pre-route stage, an accurate prediction of post-route timing is challenging due to the lack of sufficient physical information. However, achieving precise timing prediction significantly accelerates the design closure process, saving considerable time and effort. In this work, we propose pre-route timing prediction and optimization framework with graph neural network (GNN) models combined with convolution neural network (CNN). Our framework is divided into two main stages, each of which is further subdivided into smaller steps. Precisely, our GNN-driven arc delay/slew prediction model is divided into two levels: in level-1, it predicts net resistance (net R) and net capacitance (net C) using GNN while the arc length is predicted using CNN. These predictions are hierarchically passed on to level-2 where delay/slew is estimated with our GNN based prediction model. The timing optimization model utilizes the precise delay/slew predictions obtained from the GNN-driven prediction model to accurately set the path margin during the timing optimization stage. This approach effectively reduces unnecessary turn-around iterations in the commercial EDA tools. Experimental results show that by using our proposed framework in P&R, we are able to improve the pre-route prediction accuracy by 42%/36% on average on arc delay/slew, and improve timing metrics in terms of WNS, TNS, and the number of timing violation paths by 77%, 77%, and 64%, which are an increase of 32%/35% on arc delay/slew and 30%, 20% and 31% on timing optimization compared with the existing DL prediction model.
{"title":"Pre-route timing prediction and optimization with graph neural network models","authors":"","doi":"10.1016/j.vlsi.2024.102262","DOIUrl":"10.1016/j.vlsi.2024.102262","url":null,"abstract":"<div><p>In recent years, the application of deep learning (DL) models has sparked considerable interest in timing prediction within the place-and-route (P&R) flow of IC chip design. Specifically, at the pre-route stage, an accurate prediction of post-route timing is challenging due to the lack of sufficient physical information. However, achieving precise timing prediction significantly accelerates the design closure process, saving considerable time and effort. In this work, we propose pre-route timing prediction and optimization framework with graph neural network (GNN) models combined with convolution neural network (CNN). Our framework is divided into two main stages, each of which is further subdivided into smaller steps. Precisely, our GNN-driven arc delay/slew prediction model is divided into two levels: in level-1, it predicts net resistance (net R) and net capacitance (net C) using GNN while the arc length is predicted using CNN. These predictions are hierarchically passed on to level-2 where delay/slew is estimated with our GNN based prediction model. The timing optimization model utilizes the precise delay/slew predictions obtained from the GNN-driven prediction model to accurately set the path margin during the timing optimization stage. This approach effectively reduces unnecessary turn-around iterations in the commercial EDA tools. Experimental results show that by using our proposed framework in P&R, we are able to improve the pre-route prediction accuracy by 42%/36% on average on arc delay/slew, and improve timing metrics in terms of WNS, TNS, and the number of timing violation paths by 77%, 77%, and 64%, which are an increase of 32%/35% on arc delay/slew and 30%, 20% and 31% on timing optimization compared with the existing DL prediction model.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142044946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-17DOI: 10.1016/j.vlsi.2024.102263
In global routing, congestion and running time are the key factors that affect the quality of the solution. With the rapid growth of integrated chip scale, striking a balance between running time and congestion has become a bottleneck in improving design quality. In this paper, we propose a highly efficient and effective global router to address this challenge. We first propose an efficient R-tree-based compatible routing region partitioning algorithm for collecting routable regions, which offers robust support for ideal parallel routing scheduling. Then, taking into account the effect of the barrel effect on congestion evaluation and the detrimental impact of loops, a congestion-driven initial parallel routing scheme is proposed to enhance routability in the triaxial pattern routing structure. After that, we develop an accurate congestion estimation model and an optimized path-searching scheme, which are instrumental in effectively managing smaller congestion gradient variations and guiding efficient congestion reduction. We evaluate the performance of our algorithm on the ISPD 2018 and ISPD 2019 contest benchmark suites and compare it with the state-of-the-art work. Experimental results show that our proposed algorithm significantly reduces 71% overflows, improving 65% running time, and the total wirelength is even smaller.
{"title":"A fast and high-performance global router with enhanced congestion control","authors":"","doi":"10.1016/j.vlsi.2024.102263","DOIUrl":"10.1016/j.vlsi.2024.102263","url":null,"abstract":"<div><p>In global routing, congestion and running time are the key factors that affect the quality of the solution. With the rapid growth of integrated chip scale, striking a balance between running time and congestion has become a bottleneck in improving design quality. In this paper, we propose a highly efficient and effective global router to address this challenge. We first propose an efficient R-tree-based compatible routing region partitioning algorithm for collecting routable regions, which offers robust support for ideal parallel routing scheduling. Then, taking into account the effect of the barrel effect on congestion evaluation and the detrimental impact of loops, a congestion-driven initial parallel routing scheme is proposed to enhance routability in the triaxial pattern routing structure. After that, we develop an accurate congestion estimation model and an optimized path-searching scheme, which are instrumental in effectively managing smaller congestion gradient variations and guiding efficient congestion reduction. We evaluate the performance of our algorithm on the ISPD 2018 and ISPD 2019 contest benchmark suites and compare it with the state-of-the-art work. Experimental results show that our proposed algorithm significantly reduces 71% overflows, improving 65% running time, and the total wirelength is even smaller.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142050087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-14DOI: 10.1016/j.vlsi.2024.102260
Advances in wearable technology, IoT, and mobile applications have increased the demand for ultra-low-power electronic devices. Adiabatic Logic Circuit (ALC) is a design technique utilized in digital circuits to decrease the power consumption by decreasing the dynamic power dissipation. Current technologies face challenges in achieving both high performance and ultra-low power consumption. This research work introduces a novel approach in digital circuit design, specifically the Gate All-around Carbon Nanotube Field Effect Transistor with Discrepancy Cascode Pass Transistor Adiabatic Logic (GAA-CNTFET-DCPTAL), tailored for ultra-low power applications. This design operates efficiently with a four-phase Power Clock (PC) and demonstrates remarkable performance by achieving operation frequencies of up to 1 GHz while minimizing energy dissipation. GAA-CNTFET provides superior electrostatic control and high carrier mobility, reducing leakage currents and enhancing switching speeds. Simultaneously, Discrepancy Cascode Pass Transistor Adiabatic Logic (DCPTAL) uses adiabatic logic principles and a cascode structure to minimize energy dissipation during switching events. The technology node of proposed model is 10 nm. The software used for assessment is HSPICE is used for the simulation and validation of the proposed design. The proposed GAA-design attains 25.36 %, 14.28 %, and 16.06 % lower average power analyzed with existing techniques, such as Design with Evaluation of Clocked Differential Adiabatic Logic Families for the applications of low Power (DE-CDAL-LPA), Adiabatic logic-base strong ARM comparator for ultra-low power applications (AL-SARM-ULPA) and Analysis of 2PADCL Energy Recovery Logic for Ultra Low Power VLSI Design for SOC with Embedded Applications (2PADCL-ULP-VLSI) respectively.
{"title":"Gate all around carbon nanotube field effect transistor espoused discrepancy cascode pass transistor adiabatic logic for ultra-low power application","authors":"","doi":"10.1016/j.vlsi.2024.102260","DOIUrl":"10.1016/j.vlsi.2024.102260","url":null,"abstract":"<div><p>Advances in wearable technology, IoT, and mobile applications have increased the demand for ultra-low-power electronic devices. Adiabatic Logic Circuit (ALC) is a design technique utilized in digital circuits to decrease the power consumption by decreasing the dynamic power dissipation. Current technologies face challenges in achieving both high performance and ultra-low power consumption. This research work introduces a novel approach in digital circuit design, specifically the Gate All-around Carbon Nanotube Field Effect Transistor with Discrepancy Cascode Pass Transistor Adiabatic Logic (GAA-CNTFET-DCPTAL), tailored for ultra-low power applications. This design operates efficiently with a four-phase Power Clock (PC) and demonstrates remarkable performance by achieving operation frequencies of up to 1 GHz while minimizing energy dissipation. GAA-CNTFET provides superior electrostatic control and high carrier mobility, reducing leakage currents and enhancing switching speeds. Simultaneously, Discrepancy Cascode Pass Transistor Adiabatic Logic (DCPTAL) uses adiabatic logic principles and a cascode structure to minimize energy dissipation during switching events. The technology node of proposed model is 10 nm. The software used for assessment is HSPICE is used for the simulation and validation of the proposed design. The proposed GAA-design attains 25.36 %, 14.28 %, and 16.06 % lower average power analyzed with existing techniques, such as Design with Evaluation of Clocked Differential Adiabatic Logic Families for the applications of low Power (DE-CDAL-LPA), Adiabatic logic-base strong ARM comparator for ultra-low power applications (AL-SARM-ULPA) and Analysis of 2PADCL Energy Recovery Logic for Ultra Low Power VLSI Design for SOC with Embedded Applications (2PADCL-ULP-VLSI) respectively.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142095547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-12DOI: 10.1016/j.vlsi.2024.102258
In this paper, a fully integrated memristive Chua’s chaotic circuit based on the voltage-controlled oscillator is proposed. The memristor replaces the nonlinear diode, and the VCO (voltage-controlled oscillator) replaces the LC oscillator, eliminating the need for diodes, resistors, capacitors, and other complex circuit structures. The proposed chaotic circuit occupies a small chip area, only 0.0045 , and achieves low power consumption of 2.8267 . The chaotic circuit is fabricated using the SMIC 180 nm CMOS process. The simulation results demonstrate that the VCO circuit can generate a frequency output ranging from 358 MHz to 1.1 GHz by varying Vc from 0 V to 2.8 V, with a power supply of 3.3 V. The value range of the Lyapunov index is 1.015 1.03. The circuit offers advantages such as a stable power supply, low power consumption, and a small chip area.
本文提出了一种基于压控振荡器的全集成忆阻器蔡氏混沌电路。忆阻器取代了非线性二极管,VCO(压控振荡器)取代了 LC 振荡器,从而省去了二极管、电阻器、电容器和其他复杂的电路结构。所提出的混沌电路占用芯片面积小,仅为 0.0045 mm2,功耗低,仅为 2.8267 mW。混沌电路采用中芯国际 180 纳米 CMOS 工艺制造。仿真结果表明,在 3.3 V 的电源电压下,Vc 在 0 V 至 2.8 V 之间变化时,VCO 电路可产生 358 MHz 至 1.1 GHz 的频率输出。该电路具有电源稳定、功耗低、芯片面积小等优点。
{"title":"Implementation of a fully integrated memristive Chua’s chaotic circuit with a voltage-controlled oscillator","authors":"","doi":"10.1016/j.vlsi.2024.102258","DOIUrl":"10.1016/j.vlsi.2024.102258","url":null,"abstract":"<div><p>In this paper, a fully integrated memristive Chua’s chaotic circuit based on the voltage-controlled oscillator is proposed. The memristor replaces the nonlinear diode, and the VCO (voltage-controlled oscillator) replaces the LC oscillator, eliminating the need for diodes, resistors, capacitors, and other complex circuit structures. The proposed chaotic circuit occupies a small chip area, only 0.0045 <span><math><msup><mrow><mi>mm</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span>, and achieves low power consumption of 2.8267 <span><math><mi>mW</mi></math></span>. The chaotic circuit is fabricated using the SMIC 180 nm CMOS process. The simulation results demonstrate that the VCO circuit can generate a frequency output ranging from 358 MHz to 1.1 GHz by varying Vc from 0 V to 2.8 V, with a power supply of 3.3 V. The value range of the Lyapunov index is 1.015 <span><math><mo>∼</mo></math></span>1.03. The circuit offers advantages such as a stable power supply, low power consumption, and a small chip area.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141993661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-08DOI: 10.1016/j.vlsi.2024.102256
In this paper, a fast hardware accelerator for defogging based on image fusion is proposed. This method overcomes the problem of model based defogging algorithms being unable to estimate atmospheric light in dark scenes, as well as the poor performance of learning based defogging algorithms at night. Through hardware implementation and optimization, while reducing system resources, it can meet the demand for real-time defogging. The entire algorithm consists of difference guided filtering, grayscale linear stretching, and image fusion. The difference oriented filtering algorithm can enhance edges by obtaining image information of bright and dark channels, and has better effects on night lighting. Gray-scale linear stretching can restore the overall brightness and edge information of the image, compensating for some halos and noise caused by difference guided filtering. Numerous experiments have shown that the proposed hardware accelerator for defogging performs best at night. It can also be used effectively during the day. In addition, it has the fastest processing speed, which can process the images with the size of 1920*1080 for 34.5fps in real time.
{"title":"A fast hardware accelerator for nighttime fog removal based on image fusion","authors":"","doi":"10.1016/j.vlsi.2024.102256","DOIUrl":"10.1016/j.vlsi.2024.102256","url":null,"abstract":"<div><p>In this paper, a fast hardware accelerator for defogging based on image fusion is proposed. This method overcomes the problem of model based defogging algorithms being unable to estimate atmospheric light in dark scenes, as well as the poor performance of learning based defogging algorithms at night. Through hardware implementation and optimization, while reducing system resources, it can meet the demand for real-time defogging. The entire algorithm consists of difference guided filtering, grayscale linear stretching, and image fusion. The difference oriented filtering algorithm can enhance edges by obtaining image information of bright and dark channels, and has better effects on night lighting. Gray-scale linear stretching can restore the overall brightness and edge information of the image, compensating for some halos and noise caused by difference guided filtering. Numerous experiments have shown that the proposed hardware accelerator for defogging performs best at night. It can also be used effectively during the day. In addition, it has the fastest processing speed, which can process the images with the size of 1920*1080 for 34.5fps in real time.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141997438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-08DOI: 10.1016/j.vlsi.2024.102255
FPGAs, characterized by their low power consumption and swift response, are ideally suited for parallel computations associated with object detection tasks, making them a popular choice for target detection and neural network acceleration. However, contemporary FPGA designs often come with high costs and resource demands, which limit their adoption in resource-constrained embedded and edge devices. This study presents a novel design that addresses these limitations by emphasizing cost-effectiveness, energy efficiency, and rapid performance, particularly for single-shot multi-box detectors. The design employs an Xilinx ZYNQ7020-based main control chip and leverages parallel computing models for convolution layers and feature extraction. It enhances efficiency by proposing parallel feature extraction at the network architecture level and integrates convolution activation and pooling in a single, hardware-optimized operation for convolution kernel computations. The design employs alternating memory reuse for feature layer inputs and outputs to optimize memory management, thereby reducing read/write delays and transmission times. Implemented on a PYNQ-Z2 development board and tested using Jupyter Notebook, the SSD algorithm demonstrates a 789.4 GOPS inference performance with 16-bit fixed-point quantization at a 200MHz clock frequency, achieving an average accuracy of 77.84% and an inference time of 81.4621 ms, while consuming 1.595 watts of power. This innovative design significantly boosts energy efficiency by up to 2590%, outperforming contemporary methods.
{"title":"Efficient deployment of Single Shot Multibox Detector network on FPGAs","authors":"","doi":"10.1016/j.vlsi.2024.102255","DOIUrl":"10.1016/j.vlsi.2024.102255","url":null,"abstract":"<div><p>FPGAs, characterized by their low power consumption and swift response, are ideally suited for parallel computations associated with object detection tasks, making them a popular choice for target detection and neural network acceleration. However, contemporary FPGA designs often come with high costs and resource demands, which limit their adoption in resource-constrained embedded and edge devices. This study presents a novel design that addresses these limitations by emphasizing cost-effectiveness, energy efficiency, and rapid performance, particularly for single-shot multi-box detectors. The design employs an Xilinx ZYNQ7020-based main control chip and leverages parallel computing models for convolution layers and feature extraction. It enhances efficiency by proposing parallel feature extraction at the network architecture level and integrates convolution activation and pooling in a single, hardware-optimized operation for convolution kernel computations. The design employs alternating memory reuse for feature layer inputs and outputs to optimize memory management, thereby reducing read/write delays and transmission times. Implemented on a PYNQ-Z2 development board and tested using Jupyter Notebook, the SSD algorithm demonstrates a 789.4 GOPS inference performance with 16-bit fixed-point quantization at a 200MHz clock frequency, achieving an average accuracy of 77.84% and an inference time of 81.4621 ms, while consuming 1.595 watts of power. This innovative design significantly boosts energy efficiency by up to 2590%, outperforming contemporary methods.</p></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":null,"pages":null},"PeriodicalIF":2.2,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141979893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}