Although third-party netlist IP can improve the quality of integrated circuit products and shorten development cycles, it also introduces potential security vulnerabilities. Identifying state registers in sequential netlists is a commonly adopted technique for helping engineers understand the control logic of unknown gate-level netlists. Traditional graph theory-based detection methods, such as RELIC and FSMX-ultra, suffer from low accuracy and high computational complexity. Recent graph neural network (GNN)-based detection methods, such as ReIGNN, also exhibit limited accuracy, misclassifying many data D flip-flops (DFFs) as state DFFs. In this article, we propose FSMformer, a graph transformer-based method that uses bidirectional message passing as the local module and direction-aware linear fast attention as the global module. This combination extracts structural and functional features from sequential netlists simultaneously, enabling efficient and accurate detection of state DFFs in large-scale netlists. In our experiments, FSMformer outperforms the state-of-the-art graph theory-based method FSMX-ultra, the state-of-the-art GNN-based method ReIGNN, and the other advanced neural network baselines that we applied to state DFF detection.
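To make the idea of bidirectional message passing on a netlist graph concrete, the following is a minimal sketch, not the FSMformer architecture. It assumes each cell carries a small feature vector and that edges run from driver to driven cell. The function name `bidirectional_message_pass` and the three scalar weights are hypothetical stand-ins for learned parameters.

```python
from collections import defaultdict


def bidirectional_message_pass(features, edges, w_self=1.0, w_in=0.5, w_out=0.5):
    """One round of direction-aware mean aggregation on a directed graph.

    features: dict mapping node name -> list of floats
    edges:    iterable of (src, dst) pairs, e.g. driver cell -> driven cell
    The scalar weights are illustrative; a trained model would learn matrices.
    """
    preds = defaultdict(list)  # fan-in neighbours of each node
    succs = defaultdict(list)  # fan-out neighbours of each node
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)

    def mean(nodes, dim):
        # Mean feature vector of a neighbour set; zeros when the set is empty.
        if not nodes:
            return [0.0] * dim
        return [sum(features[n][i] for n in nodes) / len(nodes) for i in range(dim)]

    updated = {}
    for node, feat in features.items():
        dim = len(feat)
        m_in = mean(preds[node], dim)    # message from fan-in side
        m_out = mean(succs[node], dim)   # message from fan-out side
        updated[node] = [
            w_self * feat[i] + w_in * m_in[i] + w_out * m_out[i] for i in range(dim)
        ]
    return updated


features = {"in": [1.0], "ff": [2.0], "out": [4.0]}
edges = [("in", "ff"), ("ff", "out")]
print(bidirectional_message_pass(features, edges)["ff"])  # [4.5]
```

Keeping fan-in and fan-out messages separate lets a node distinguish what drives it from what it drives, which an undirected aggregation would blur.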
{"title":"FSMformer: An efficient direction-aware graph transformer for state register detection of gate-level netlist","authors":"Zongtai Li, Liang Yang, Hao Li, Mian Lou, Zeyu Yang, Weidong Xu","doi":"10.1016/j.vlsi.2026.102656","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102656"},"PeriodicalIF":2.5,"publicationDate":"2026-01-07","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-06, DOI: 10.1016/j.vlsi.2026.102654
George B. Nardes, Thiago H. Rausch, Bruna H. Pereira, Douglas R. Melo, Cesar A. Zeferino
Convolutional Neural Networks (CNNs) are widely used in Automatic License Plate Recognition (ALPR) systems for Optical Character Recognition (OCR). Still, their computational cost often restricts deployment on edge devices. This work presents an 8-bit quantized CNN with a hardware-oriented dataflow designed specifically for OCR of Mercosur and Brazilian license plates. The model was trained using quantization-aware techniques and implemented on two FPGA platforms from different vendors, Altera Cyclone V and AMD Zynq UltraScale+, using the same VHDL architecture. The Zynq UltraScale+ implementation achieves 97.1% OCR accuracy, 2.12 ms latency, and 922 FPS in pipelined mode, while the Cyclone V version delivers 458 FPS with reduced BRAM and DSP usage. Energy measurements show 1.62 mJ per inference on Zynq UltraScale+ and 3.32 mJ on Cyclone V, confirming suitability for low-power, real-time ALPR. The results demonstrate that a portable 8-bit design can maintain accuracy comparable to that of floating-point models while achieving substantial gains in throughput and energy efficiency across heterogeneous FPGA devices.
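The abstract does not state which 8-bit quantization scheme is used, so the sketch below assumes the simplest common choice: symmetric per-tensor quantization to signed 8-bit integers. The function names `quantize_int8` and `fake_quantize` are hypothetical. The second function shows the quantize-then-dequantize step that quantization-aware training typically places in the forward pass.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to signed 8-bit codes (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the int8 range.
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def fake_quantize(values):
    """Quantize then dequantize, so downstream math sees the rounding error."""
    codes, scale = quantize_int8(values)
    return [c * scale for c in codes]


weights = [0.3, -0.7, 0.9]
codes, scale = quantize_int8(weights)
approx = fake_quantize(weights)
# Each reconstructed weight differs from the original by at most half a step.
print(codes, [round(a, 4) for a in approx])
```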
{"title":"A resource-constrained CNN accelerator for real-time license plate character recognition on FPGA platforms","authors":"George B. Nardes, Thiago H. Rausch, Bruna H. Pereira, Douglas R. Melo, Cesar A. Zeferino","doi":"10.1016/j.vlsi.2026.102654","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102654"},"PeriodicalIF":2.5,"publicationDate":"2026-01-06","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-06, DOI: 10.1016/j.vlsi.2026.102653
Viet-Thanh Pham, Victor Kamdoum Tamba, Luigi Fortuna
Non-equilibrium oscillators are special because of their hidden attractors. This work introduces a non-equilibrium oscillator that is implemented with a reduced number of resistors and requires a diode instead of analog multipliers. The oscillator is easily realized in the laboratory with common off-the-shelf components, making it suitable for educational purposes. The dynamics of the oscillator are investigated to present its special features. In addition, the use of the oscillator for generating random signals is presented to illustrate a possible application.
{"title":"Non-equilibrium oscillator with a diode: Dynamics and application","authors":"Viet-Thanh Pham, Victor Kamdoum Tamba, Luigi Fortuna","doi":"10.1016/j.vlsi.2026.102653","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102653"},"PeriodicalIF":2.5,"publicationDate":"2026-01-06","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-05, DOI: 10.1016/j.vlsi.2026.102646
S. Karipidis, A. Buzo, G. Pelz, T. Noulis
Analog and mixed-signal circuit sizing with a large number of parameters requires many simulations, especially for non-linear topologies where large-signal analysis is needed. Reducing the number of simulations, and more generally the total design cycle time, is the main objective when sizing complicated circuits. This work presents an automated circuit sizing methodology based on a hybrid of the dual annealing and Nelder–Mead algorithms that significantly reduces the design cycle time and the required number of transient simulations. A customized hybrid algorithm environment using dual annealing and Nelder–Mead is developed in which the optimization process is divided into separate optimization sub-steps. The proposed method converges rapidly to the required circuit performance specifications. It combines direct search algorithms and evaluates the metrics separately, which accelerates convergence to the performance specifications in a large parameter space. A complicated non-linear topology, a product-level low-dropout (LDO) regulator in a 180 nm process node with 30 parameters, is used as the circuit vehicle to verify the proposed methodology. The sizing process converged in fewer than 1700 simulations, taking only the circuit schematic as input with no prior sizing knowledge. Sub-optimization is also performed for each analysis type (DC, AC, and transient), with a focus on reducing the number of transient simulations. The proposed combined method converged 31% faster than state-of-the-art methods and handles each simulation analysis efficiently.
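The general pattern of a coarse global search followed by a local simplex refinement can be sketched as below. This is an illustration under stated assumptions, not the authors' flow: a plain simulated annealing loop stands in for dual annealing, the Nelder–Mead routine is a simplified textbook variant, and `spec_error` is a hypothetical toy objective replacing a circuit simulator call.

```python
import math
import random


def anneal(f, bounds, steps=2000, t0=1.0, seed=0):
    """Plain simulated annealing (stand-in for dual annealing): coarse global search."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = f(x)
    best, f_best = x[:], fx
    for k in range(steps):
        t = t0 * (1 - k / steps) + 1e-9  # linearly cooling temperature
        cand = [min(hi, max(lo, xi + rng.gauss(0, 0.1 * (hi - lo))))
                for xi, (lo, hi) in zip(x, bounds)]
        fc = f(cand)
        # Always accept improvements; accept worse points with Boltzmann probability.
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < f_best:
                best, f_best = x[:], fx
    return best, f_best


def nelder_mead(f, x0, step=0.1, iters=500):
    """Simplified Nelder-Mead: reflect, expand, contract, or shrink the simplex."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    for _ in range(iters):
        simplex.sort(key=f)  # a real flow would cache simulator results
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:  # shrink every vertex halfway toward the best one
                simplex = [best] + [[b + 0.5 * (v - b) for b, v in zip(best, p)]
                                    for p in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0], f(simplex[0])


def spec_error(x):
    # Hypothetical objective: squared distance from a target sizing point.
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
coarse, f_coarse = anneal(spec_error, bounds)
fine, f_fine = nelder_mead(spec_error, coarse)
print(f_coarse, f_fine)
```

The global stage only needs to land in the right basin; the local stage then converges with few extra evaluations, which is the property that matters when each evaluation is an expensive simulation.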
{"title":"Hybrid algorithm based optimization strategies for analog circuit sizing in low dropout regulators","authors":"S. Karipidis, A. Buzo, G. Pelz, T. Noulis","doi":"10.1016/j.vlsi.2026.102646","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102646"},"PeriodicalIF":2.5,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-03, DOI: 10.1016/j.vlsi.2025.102645
J. Banumathi, G. Karthy
Finite Impulse Response (FIR) filters are widely used in signal processing because of their stability and linear phase. However, traditional FIR filter designs are limited by multiplication operations that lead to high hardware utilization and delay. To address this, modified FIR filters with optimized multipliers and adders are being developed to improve hardware resource utilization and delay performance. This paper presents novel designs for 8-tap and 16-tap FIR filters that leverage Brent-Kung Adders (BKA) and Pipelined (P) Urdhva Triyakbhyam Vedic multipliers (UTVM) to achieve minimal delay and enhanced performance. Three architectures, namely Pipelined FIR using UTVM with BKA (PFIR-UTVM-BKA), FIR using PUTVM with BKA (FIR-PUTVM-BKA), and Pipelined FIR using PUTVM with BKA (PFIR-PUTVM-BKA), were implemented with different device specifications on Kintex-7, Virtex-7, and Zynq 7000 platforms using Xilinx Vivado 2022.2 and in 45 nm ASIC technology, with simulation in Verilog. On FPGA, the proposed multiplier reduces delay by 43%, 61.55%, 73.01%, and 78.51% across different bit widths; in ASIC, power, delay, and power-delay product (PDP) are reduced by 82.16%, 94.34%, and 98.99%, respectively. The proposed FIR filter architectures also achieve significant improvements on FPGA for the 8-tap and 16-tap 8-bit designs, including 51.88% and 27.13% delay reduction, 75.40% slice improvement, and 92.53% and 97.03% enhancement in slice registers; in ASIC, power, delay, and PDP are reduced by 87.97%, 97.59%, and 99.71%, respectively. These advancements make the proposed FIR filters highly suitable for high-speed digital signal processing (DSP) applications, where efficient processing and minimized latency are crucial. Integrating the pipelined Vedic multiplier (PVM) and BKA plays a pivotal role in achieving these performance enhancements, positioning these filter designs as promising solutions for next-generation signal processing systems.
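The two arithmetic building blocks named above can be modeled at bit level in software. The sketch below is a behavioral illustration only, assuming unsigned operands and a power-of-two adder width. It does not model pipelining or the paper's hardware structure, and the function names are hypothetical.

```python
def urdhva_multiply(a, b, width=8):
    """Vertical-and-crosswise multiply: sum the partial-product bits column by column."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry = 0, 0
    for k in range(2 * width - 1):
        # Column k collects every a_i * b_j with i + j == k, plus the incoming carry.
        col = carry + sum(a_bits[i] & b_bits[k - i]
                          for i in range(max(0, k - width + 1), min(k, width - 1) + 1))
        result |= (col & 1) << k
        carry = col >> 1
    return result | (carry << (2 * width - 1))


def brent_kung_add(a, b, width=16):
    """Brent-Kung parallel-prefix adder model; width must be a power of two."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]  # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]  # propagate
    xor_bits = p[:]  # per-bit propagate, needed for the final sum
    d = 1
    while d < width:  # up-sweep: build group signals over doubling spans
        for i in range(2 * d - 1, width, 2 * d):
            g[i] |= p[i] & g[i - d]
            p[i] &= p[i - d]
        d *= 2
    d = width // 4
    while d >= 1:  # down-sweep: fill in the remaining prefix positions
        for i in range(3 * d - 1, width, 2 * d):
            g[i] |= p[i] & g[i - d]
            p[i] &= p[i - d]
        d //= 2
    total = 0
    for i in range(width):
        carry_in = g[i - 1] if i else 0
        total |= (xor_bits[i] ^ carry_in) << i
    return total, g[width - 1]  # (sum modulo 2**width, carry out)


def fir_filter(samples, taps, width=8):
    """Direct-form FIR on unsigned samples using the two models above."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc, _ = brent_kung_add(acc, urdhva_multiply(h, samples[n - k], width), 32)
        out.append(acc)
    return out


print(fir_filter([1, 2, 3, 4], [1, 1]))  # [1, 3, 5, 7]
```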
{"title":"High-performance FIR filter designs using Brent Kung Adder and pipelined Vedic multiplier","authors":"J. Banumathi, G. Karthy","doi":"10.1016/j.vlsi.2025.102645","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102645"},"PeriodicalIF":2.5,"publicationDate":"2026-01-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-03, DOI: 10.1016/j.vlsi.2025.102640
Jinlin Chen, Huaguo Liang, Yingchun Lu, Liang Yao
With the rise of the internet and electronic devices, the security of network information has garnered increasing attention. True Random Number Generators (TRNGs) play an increasingly important role in information security. TRNG entropy sources based on Ring Oscillators (ROs) have attracted significant interest due to their simple circuit design and ease of implementation on FPGAs. However, most existing works suffer from high hardware overhead. This work proposes a novel ultra-lightweight TRNG based on multi-mode switching of RO-like rings; it can be automatically placed and routed on the Xilinx Artix-7 FPGA and uses only 10 LUTs and 2 D flip-flops. The randomness of the entropy source is analyzed through a mathematical model, proving that the output sequence is an unordered random bit string under any circumstances. The output sequence of the TRNG passed various tests, including autocorrelation tests, NIST SP800-22, NIST SP800-90B, AIS-31, and TestU01, with favorable results.
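Two of the simplest checks in this family can be sketched as follows: the frequency (monobit) test from NIST SP 800-22 and a lag-k autocorrelation on bits mapped to ±1. This is a minimal illustration, not the full test suites. The sample bits come from Python's pseudo-random generator, which is only a stand-in for real TRNG output.

```python
import math
import random


def monobit_p_value(bits):
    """NIST SP 800-22 frequency test: p-value for the balance of ones and zeros."""
    s = sum(1 if b else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(2 * len(bits)))


def autocorrelation(bits, lag):
    """Mean product of the +/-1 sequence with itself shifted by `lag` positions."""
    x = [1 if b else -1 for b in bits]
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n


rng = random.Random(1)  # pseudo-random stand-in, NOT a true random source
sample = [rng.getrandbits(1) for _ in range(10000)]
print(monobit_p_value(sample), autocorrelation(sample, 1))
```

A perfectly alternating sequence passes the monobit test with a p-value of 1 yet has a lag-1 autocorrelation of -1, which is why bias and correlation are checked separately.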
{"title":"RO-like ring-based TRNG with adaptive mode switching for enhanced entropy Harvesting","authors":"Jinlin Chen, Huaguo Liang, Yingchun Lu, Liang Yao","doi":"10.1016/j.vlsi.2025.102640","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102640"},"PeriodicalIF":2.5,"publicationDate":"2026-01-03","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-02, DOI: 10.1016/j.vlsi.2025.102644
Yongliang Zhou, Yingxue Sun, Beibei Zhang, Wangyong Si, Jingxue Zhong, Weizhe Tan, Chunyu Peng, Xiulong Wu
This article investigates the temperature characteristics of magnetic tunnel junctions (MTJs) and exploits their tunnel magneto-resistance (TMR) effect to compensate for temperature-induced mismatches arising from CMOS temperature variations. A beta-multiplier reference and a Kuijk bandgap reference are simulated at the 28 nm node. The proposed designs improve temperature stability while maintaining low-voltage operation. Specifically, the first circuit is redesigned from a beta-multiplier voltage-reference topology in which MTJs replace resistors, with the explicit aim of strengthening the negative-feedback loop. As a result, stable operation is achieved at a minimum supply voltage of 0.6 V over the temperature range 0 °C to 110 °C. The reference voltage exhibits a line sensitivity of 0.57%/V, a temperature coefficient of 35.2 ppm/°C, and a nominal output of 441.6 mV. Drawing on MTJ-based compensation techniques, the modified Kuijk bandgap reference replaces the BJT-based temperature-compensation element with an MTJ device, yielding a reference voltage with a line sensitivity of 0.096%/V, a temperature coefficient of 11.8 ppm/°C, and a nominal output of 1.65 V. Beyond the electrical benefits, the use of MTJs enables vertical integration of the compensating elements, delivering substantial area savings (approximately 50% for the beta-multiplier implementation and about 66.5% for the modified bandgap) and thereby producing a more compact and competitive solution for high-performance precision voltage references.
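To give a feel for what a temperature coefficient in ppm/°C means in volts, the sketch below assumes the common box-method definition, TC = (Vmax − Vmin) / (Vnom · ΔT) · 10⁶. The paper may define it differently, so the implied spread is an estimate under that assumption. The function names are hypothetical.

```python
def temperature_coefficient(v_max, v_min, v_nominal, t_min, t_max):
    """Box-method temperature coefficient in ppm/degC (assumed definition)."""
    return (v_max - v_min) / (v_nominal * (t_max - t_min)) * 1e6


def implied_voltage_spread(tc_ppm_per_c, v_nominal, t_min, t_max):
    """Invert the box method: total output variation implied by a given TC."""
    return tc_ppm_per_c * 1e-6 * v_nominal * (t_max - t_min)


# Beta-multiplier figures from the abstract: 35.2 ppm/degC, 441.6 mV, 0 to 110 degC.
spread = implied_voltage_spread(35.2, 0.4416, 0.0, 110.0)
print(f"{spread * 1e3:.2f} mV")  # 1.71 mV
```

Under this definition, the beta-multiplier reference would vary by roughly 1.7 mV across its stated temperature range.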
{"title":"Low-temperature-drift voltage reference design using magnetic tunnel junctions","authors":"Yongliang Zhou, Yingxue Sun, Beibei Zhang, Wangyong Si, Jingxue Zhong, Weizhe Tan, Chunyu Peng, Xiulong Wu","doi":"10.1016/j.vlsi.2025.102644","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102644"},"PeriodicalIF":2.5,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2026-01-02, DOI: 10.1016/j.vlsi.2025.102638
Tiago Almeida, Isaías Felzmann, Lucas Wanner
Approximate Computing can optimize resource usage in HLS (High-Level Synthesis) by mapping certain operations to components that have lower resource utilization but introduce small errors into application outputs. A fundamental challenge is identifying a set of approximate components for implementing operators in an accelerator design while achieving optimal resource utilization and accuracy. We introduce an input-aware heuristic approach that uses application inputs to model output errors more effectively. In this approach, operators in accelerators, such as adders and multipliers, are mapped to a library of precharacterized approximate components. Applications are executed with a set of training inputs, and candidate solutions are selected based on a metric that combines output errors and estimated resource utilization. The results demonstrate that the approach can find appropriate approximate designs for a given error threshold. For image processing applications, the input-aware heuristic reduced LUT and FF usage by up to 55% at less than 25% output degradation. Similar savings were obtained for a CNN model with less than 0.8% accuracy degradation.
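Selecting candidates under an error threshold while trading error against resources can be illustrated with a small Pareto filter. This is a generic sketch, not the paper's heuristic. The candidate names and numbers are made up for illustration, and `pareto_front` is a hypothetical helper.

```python
def pareto_front(candidates, max_error):
    """Keep feasible designs that no other feasible design beats on both axes."""
    feasible = [c for c in candidates if c["error"] <= max_error]
    front = []
    for c in feasible:
        dominated = any(
            o["error"] <= c["error"] and o["luts"] <= c["luts"]
            and (o["error"] < c["error"] or o["luts"] < c["luts"])
            for o in feasible
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c["luts"])


# Illustrative values only: measured output error and estimated LUT count.
candidates = [
    {"name": "exact", "error": 0.0, "luts": 100},
    {"name": "approx_a", "error": 0.1, "luts": 60},
    {"name": "approx_b", "error": 0.2, "luts": 70},  # worse than approx_a on both axes
    {"name": "approx_c", "error": 0.5, "luts": 30},  # exceeds the error threshold
]
print([c["name"] for c in pareto_front(candidates, max_error=0.25)])
# ['approx_a', 'exact']
```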
{"title":"A heuristic approach for near Pareto-optimal design space exploration in Approximate High-Level Synthesis","authors":"Tiago Almeida, Isaías Felzmann, Lucas Wanner","doi":"10.1016/j.vlsi.2025.102638","journal":{"name":"Integration-The Vlsi Journal","volume":"108","pages":"Article 102638"},"PeriodicalIF":2.5,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","isOpenAccess":false}
This paper proposes a genetic algorithm-optimized fuzzy controller for calibrating nonlinear errors in pipelined ADCs. Because the errors in a pipelined ADC correspond to the sub-ADC quantization codes, this study employs a single-input single-output (SISO) fuzzy controller to map the sub-ADC quantization codes to the errors. To fully exploit the fitting capability of the fuzzy controller, a genetic algorithm determines its optimal design parameters. The resulting SISO fuzzy controller can fully fit the nonlinear errors while maintaining a simple structure and low hardware implementation complexity. The optimal fuzzy controller is implemented on a Xilinx Kintex-7 FPGA and applied to calibrate a 14-bit 61 MS/s pipelined ADC. Experimental results demonstrate that after calibration with the optimized fuzzy controller, SNDR and SFDR are improved by 29.6 dB and 44.7 dB, respectively.
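A single-input single-output fuzzy mapping can be sketched as below. The sketch assumes triangular membership functions with crisp (zero-order Sugeno) consequents, which the abstract does not confirm. The `centers` and `outputs` lists are the kind of design parameters a genetic algorithm could tune, and all names and values are hypothetical.

```python
def siso_fuzzy(x, centers, outputs):
    """Map one input to one output via triangular sets and weighted averaging.

    centers: sorted peak positions of the input fuzzy sets
    outputs: one crisp consequent per fuzzy set (same length as centers)
    """
    weights = []
    last = len(centers) - 1
    for i, c in enumerate(centers):
        if x == c:
            mu = 1.0
        elif x < c:
            # The first set saturates to the left; others ramp up from the previous peak.
            mu = 1.0 if i == 0 else max(0.0, (x - centers[i - 1]) / (c - centers[i - 1]))
        else:
            # The last set saturates to the right; others ramp down to the next peak.
            mu = 1.0 if i == last else max(0.0, (centers[i + 1] - x) / (centers[i + 1] - c))
        weights.append(mu)
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)


centers = [0, 4, 8]          # hypothetical sub-ADC code positions
outputs = [0.0, 1.0, 0.0]    # hypothetical error estimates at those codes
print(siso_fuzzy(2, centers, outputs))  # 0.5
```

With neighbouring triangles overlapping in this way, the controller reduces to piecewise-linear interpolation between the consequents, which is consistent with a simple hardware structure.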
{"title":"Genetic algorithm-optimized fuzzy controller for the calibration of pipelined ADCs","authors":"Luotian Wu, Honghui Deng, Jiashen Li, Muqi Li, Long Li, Yongsheng Yin","doi":"10.1016/j.vlsi.2025.102643","journal":{"name":"Integration-The Vlsi Journal","volume":"107","pages":"Article 102643"},"PeriodicalIF":2.5,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date: 2025-12-25, DOI: 10.1016/j.vlsi.2025.102642
Mehmet Dogan, Erkan Yuce, Shahram Minaei
In this study, two first-order voltage-mode universal filters based on plus-type second-generation current conveyors (CCII+s) are proposed. Each filter employs two CCII+s, a grounded capacitor, and three resistors. Each filter is universal, i.e., it realizes low-pass, high-pass, and all-pass filter (APF) responses. Additionally, the APF responses offer electronically tunable gain through grounded resistors, eliminating the need for extra amplifier stages. The total harmonic distortion variations of the APFs are low, and the dynamic ranges of the proposed filters are wide. However, the filters require a passive element matching condition and include two floating resistors. As application examples, two quadrature oscillators are presented. Extensive SPICE simulations are conducted using 180 nm TSMC technology parameters. Experimental validations are also carried out using commercially available AD844 active devices.
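The three first-order responses can be evaluated numerically from their textbook unity-gain transfer functions, with time constant τ = RC. The sketch below uses those generic forms. The actual gains and signs of the proposed circuits depend on resistor ratios that the abstract does not give, and the component values here are illustrative.

```python
import cmath
import math


def first_order_responses(freq_hz, r_ohm, c_farad):
    """Generic unity-gain first-order low-pass, high-pass, and all-pass responses."""
    s = 1j * 2 * math.pi * freq_hz
    tau = r_ohm * c_farad
    return {
        "low_pass": 1 / (1 + s * tau),
        "high_pass": s * tau / (1 + s * tau),
        "all_pass": (1 - s * tau) / (1 + s * tau),
    }


r, c = 1e3, 1e-6                      # illustrative values: 1 kOhm, 1 uF
f0 = 1 / (2 * math.pi * r * c)        # pole frequency, about 159 Hz
for name, h in first_order_responses(f0, r, c).items():
    print(name, round(abs(h), 4), round(math.degrees(cmath.phase(h)), 1))
```

At the pole frequency, the low-pass and high-pass magnitudes both fall to 1/√2, while the all-pass keeps unity magnitude and contributes a 90° phase lag.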
{"title":"First-order universal filters with two CCII+s and a grounded capacitor: Theory and experimental validation","authors":"Mehmet Dogan, Erkan Yuce, Shahram Minaei","doi":"10.1016/j.vlsi.2025.102642","journal":{"name":"Integration-The Vlsi Journal","volume":"107","pages":"Article 102642"},"PeriodicalIF":2.5,"publicationDate":"2025-12-25","publicationTypes":"Journal Article","isOpenAccess":false}