Integration-The VLSI Journal: Latest Articles

FSMformer: An efficient direction-aware graph transformer for state register detection of gate-level netlist
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-01-07 | DOI: 10.1016/j.vlsi.2026.102656
Zongtai Li, Liang Yang, Hao Li, Mian Lou, Zeyu Yang, Weidong Xu
Although the use of third-party netlist IP can enhance the quality of integrated circuit products and reduce development cycles, it also introduces potential security vulnerabilities. Identifying state registers in sequential netlists is a commonly adopted technique to assist engineers in understanding the control logic of unknown gate-level netlists. Traditional graph theory-based detection methods, such as RELIC and FSMX-ultra, suffer from low accuracy and high computational complexity. Recent graph neural network-based detection methods, such as ReIGNN, also exhibit limited accuracy, with many data DFFs being misclassified as state DFFs. In this article, we propose a graph transformer-based method, FSMformer, which uses bidirectional message passing as the local module and direction-aware linear fast attention as the global module. This combination extracts structural and functional features from sequential netlists simultaneously, enabling efficient and accurate detection of state DFFs in large-scale netlists. According to the experimental results, FSMformer outperforms not only the state-of-the-art graph theory-based method FSMX-ultra and the state-of-the-art GNN-based method ReIGNN, but also various advanced neural network baselines that we employed for state DFF detection.
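To make "bidirectional message passing" concrete, here is a minimal sketch on a toy directed netlist graph. It is not the FSMformer implementation: the function name, the mean aggregation, and the two-element one-hot features are all assumptions chosen for illustration. The point it shows is that aggregating fan-in and fan-out separately keeps edge direction visible to the model.

```python
from collections import defaultdict


def bidirectional_message_pass(edges, features):
    """One round of mean aggregation over a directed graph (illustrative only).

    edges    -- list of (driver, sink) pairs, e.g. gate output -> gate input
    features -- dict mapping node -> list of floats (all the same length)
    Returns a dict mapping node -> own features + mean fan-in + mean fan-out.
    """
    fan_in = defaultdict(list)   # drivers of each node
    fan_out = defaultdict(list)  # sinks of each node
    for src, dst in edges:
        fan_out[src].append(dst)
        fan_in[dst].append(src)

    dim = len(next(iter(features.values())))

    def mean(nodes):
        # Nodes with no neighbours in this direction contribute a zero vector.
        if not nodes:
            return [0.0] * dim
        return [sum(features[n][i] for n in nodes) / len(nodes) for i in range(dim)]

    # Concatenate the two directions instead of summing them, so they stay distinct.
    return {n: features[n] + mean(fan_in[n]) + mean(fan_out[n]) for n in features}


# Hypothetical toy netlist: two flip-flops and one AND gate.
# Features are one-hot [is_dff, is_gate]; dff_a sits in a feedback loop, dff_b does not.
edges = [("dff_a", "and1"), ("dff_b", "and1"), ("and1", "dff_a")]
features = {"dff_a": [1.0, 0.0], "dff_b": [1.0, 0.0], "and1": [0.0, 1.0]}
updated = bidirectional_message_pass(edges, features)
```

After one round, the two flip-flops start with identical features but end with different vectors, because only `dff_a` receives a message on its fan-in side. A purely undirected aggregation would lose part of that distinction.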
Integration-The VLSI Journal, vol. 108, Article 102656 (2026-05-01). Citations: 0
Adaptive-precision SIMD architecture for high-throughput and resource-efficient DNN acceleration
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-01-16 | DOI: 10.1016/j.vlsi.2026.102666
Vasundhara Trivedi, Harman Singh Bagga, Gopal Raut, Santosh Kumar Vishvakarma
Deep Neural Network (DNN) accelerators require high computational throughput and flexible precision support while operating under stringent resource and power constraints. To address these challenges, we propose an adaptive-precision SIMD (Single Instruction, Multiple Data) Processing Element (PE) architecture for signed integer and fixed-point operations that maximizes resource utilization and enhances parallelism in multiply-accumulate (MAC) computations. The design introduces efficient resource reuse during partial product accumulation and supports both symmetric and asymmetric precision modes. Unlike state-of-the-art approaches, the proposed PE dynamically scales computation: it processes 16 operands at low precision (4-bit), four operands at medium precision (8-bit), and a single operand at high precision (16-bit). Additionally, it supports asymmetric operations such as 16 × 4-bit multiplications in parallel, enabling unique flexibility and performance gains. The architecture is implemented and tested on ASIC and FPGA platforms. Accuracy evaluations across different DNN models and datasets show very small losses at reduced precision compared to float32: less than 1% for LeNet on MNIST, 2.9% for AlexNet on CIFAR-10, 2.2% for VGG16 on CIFAR-10, and 3.5% for VGG16 on ImageNet-1000. Hardware synthesis yields significant improvements, including 46.2% fewer LUTs and 2.45× lower power on FPGA compared to existing designs. Compared to existing works, the proposed architecture delivers 2× higher throughput and up to 4.8× better energy efficiency with 28.57% less area at 65 nm, making it well suited to applications with variable precision and limited resources.
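The 16 / 4 / 1 operand counts fit together if one pictures the PE as a pool of sixteen 4×4-bit multipliers whose partial products are shifted and summed. That internal structure is an assumption made here for illustration; the abstract does not describe it. The sketch below also handles unsigned operands only, whereas the paper targets signed integer and fixed-point arithmetic. The function names are hypothetical.

```python
def nibbles(value, width):
    """Split an unsigned integer into 4-bit digits, least significant first."""
    return [(value >> shift) & 0xF for shift in range(0, width, 4)]


def multiply_via_4bit(a, b, width):
    """Multiply two unsigned `width`-bit integers using only 4x4-bit products.

    Returns (product, number_of_4x4_multiplies_used).
    """
    total = 0
    count = 0
    for i, a_digit in enumerate(nibbles(a, width)):
        for j, b_digit in enumerate(nibbles(b, width)):
            # Each 4x4 partial product is shifted to its weight before accumulation.
            total += (a_digit * b_digit) << (4 * (i + j))
            count += 1
    return total, count


product16, used16 = multiply_via_4bit(0xBEEF, 0x1234, 16)  # uses 16 small multiplies
product8, used8 = multiply_via_4bit(0xA7, 0x3C, 8)         # uses 4 small multiplies
product4, used4 = multiply_via_4bit(0x9, 0xD, 4)           # uses 1 small multiply

# With a budget of sixteen 4x4 multipliers, the number of parallel operations per mode:
parallel_ops = {bits: 16 // used for bits, used in ((16, used16), (8, used8), (4, used4))}
```

Under this assumption `parallel_ops` comes out as one 16-bit, four 8-bit, or sixteen 4-bit multiplications, which matches the counts quoted in the abstract.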
Integration-The VLSI Journal, vol. 108, Article 102666 (2026-05-01). Citations: 0
A low overhead chemical measurement architecture with memristive sensors
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-02-07 | DOI: 10.1016/j.vlsi.2026.102673
Meenakshi Devi, Saurabh Khandelwal, Marek Vidiš, Tomas Plecenik, Abusaleh Jabir
Memristors, traditionally considered as non-volatile resistive memories for high-density applications, also exhibit excellent sensitivity to chemicals, making them suitable for chemical sensing with intrinsic memory capabilities. This paper introduces a technique for directly measuring and digitising sensor readings, such as gas concentration, using the switching state of the device, which is influenced by the applied bias voltage or current in the presence of chemicals. When the sensor detects and measures a chemical property, its own state changes, enabling direct digitisation of the sensed information. The proposed memristive sensor employs a TiO2-based memristor as both the sensing and digitising element, and is evaluated using SPICE simulations with hydrogen gas (H2) at different concentrations. We present a calibration curve that establishes a reliable correlation between pulse counts and chemical concentration, highlighting the consistent relationship between switching behaviour and concentration levels. This method significantly reduces the reliance on separate analogue-to-digital converters (ADCs), simplifying the sensor architecture in terms of power consumption and circuit complexity. Additionally, the inherent nonlinearity of the fabricated devices renders this digitisation method significantly nonlinear, which can provide an added layer of security to the measured information. This approach paves the way for compact, low-power chemical sensing nodes suitable for future integrated environmental monitoring systems.
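The read-out idea, counting pulses and mapping the count to a concentration through a calibration curve, can be sketched as a lookup with piecewise-linear interpolation. Everything specific below is an assumption: the calibration points are placeholder numbers rather than values from the paper, the direction of the relationship (more pulses at higher concentration) is chosen arbitrarily, and the function name is hypothetical.

```python
from bisect import bisect_left

# Hypothetical calibration points: (pulse count, H2 concentration in ppm).
# Placeholder values for illustration only; they are NOT taken from the paper.
CALIBRATION = [(10, 0.0), (20, 100.0), (40, 500.0), (80, 1000.0)]


def concentration_from_pulses(pulse_count, table=CALIBRATION):
    """Map a pulse count to a concentration by piecewise-linear interpolation.

    Counts outside the calibrated range are clamped to the nearest end point.
    """
    counts = [count for count, _ in table]
    if pulse_count <= counts[0]:
        return table[0][1]
    if pulse_count >= counts[-1]:
        return table[-1][1]
    upper = bisect_left(counts, pulse_count)  # first calibration point >= pulse_count
    (x0, y0), (x1, y1) = table[upper - 1], table[upper]
    return y0 + (y1 - y0) * (pulse_count - x0) / (x1 - x0)


reading = concentration_from_pulses(30)  # halfway between the 20- and 40-pulse points
```

Because the segments have different slopes, equal steps in pulse count do not correspond to equal steps in concentration. That is the kind of nonlinearity the abstract refers to, and a reader without the calibration table could not invert it directly.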
Integration-The VLSI Journal, vol. 108, Article 102673 (2026-05-01). Citations: 0
A resource-constrained CNN accelerator for real-time license plate character recognition on FPGA platforms
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-01-06 | DOI: 10.1016/j.vlsi.2026.102654
George B. Nardes, Thiago H. Rausch, Bruna H. Pereira, Douglas R. Melo, Cesar A. Zeferino
Convolutional Neural Networks (CNNs) are widely used in Automatic License Plate Recognition (ALPR) systems for Optical Character Recognition (OCR). Still, their computational cost often restricts deployment on edge devices. This work presents an 8-bit quantized CNN with a hardware-oriented dataflow designed specifically for OCR of Mercosur and Brazilian license plates. The model was trained using quantization-aware techniques and implemented on two FPGA platforms from different vendors, Altera Cyclone V and AMD Zynq UltraScale+, using the same VHDL architecture. The Zynq UltraScale+ implementation achieves 97.1% OCR accuracy, 2.12 ms latency, and 922 FPS in pipelined mode, while the Cyclone V version delivers 458 FPS with reduced BRAM and DSP usage. Energy measurements show 1.62 mJ per inference on Zynq UltraScale+ and 3.32 mJ on Cyclone V, confirming suitability for low-power, real-time ALPR. The results demonstrate that a portable 8-bit design can maintain accuracy comparable to that of floating-point models while achieving substantial gains in throughput and energy efficiency across heterogeneous FPGA devices.
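The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that the energy-per-inference values apply while the accelerator runs back to back at the quoted frame rates, and that the 2.12 ms latency and 922 FPS refer to the same pipelined Zynq UltraScale+ configuration; the abstract does not state either point explicitly. The function names are hypothetical.

```python
def average_power_w(energy_mj_per_inference, fps):
    """Average power in watts when running continuously at `fps` inferences per second."""
    return energy_mj_per_inference * 1e-3 * fps


def frames_in_flight(latency_ms, fps):
    """Little's law: mean number of frames inside the pipeline at any moment."""
    return latency_ms * 1e-3 * fps


zynq_power = average_power_w(1.62, 922)       # about 1.49 W
cyclone_power = average_power_w(3.32, 458)    # about 1.52 W
zynq_in_flight = frames_in_flight(2.12, 922)  # about 1.95 frames
```

Two observations follow under these assumptions. The two boards draw similar average power, so the roughly 2× energy advantage of the Zynq part comes from finishing each inference faster rather than from a lower power level. And a latency of 2.12 ms at 922 FPS means about two frames overlap in the pipeline, since a single frame at a time would cap throughput near 1/2.12 ms ≈ 472 FPS.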
Integration-The VLSI Journal, vol. 108, Article 102654 (2026-05-01). Citations: 0
EOHEAA: Error-Optimized Hardware-Efficient Approximate Adder for energy-aware error-resilient applications
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-01-09 | DOI: 10.1016/j.vlsi.2026.102660
Prateek Goyal, Sujit Kumar Sahoo
This work introduces a novel Error-Optimized Hardware-Efficient Approximate Adder (EOHEAA) tailored for error-resilient computing tasks, where precision can be traded for improvements in energy, delay, and resource efficiency. The EOHEAA adopts a strategic method of controlled error propagation, enabling significant enhancement in accuracy metrics such as Mean Error Distance (MED), Mean Relative Error Distance (MRED), and Normalized MED (NMED), while maintaining minimal hardware overhead. Synthesized on the Artix-7 FPGA (XC7A35T-1CPG236C) using Verilog HDL, EOHEAA achieves up to a 38.6% reduction in power consumption, a 34% improvement in critical path delay, and notable savings in logic resources compared to conventional and state-of-the-art approximate adder designs. Comprehensive analysis across 8-, 16-, and 32-bit configurations further confirms its scalability and robustness, with PDP improvements reaching 71.5% in wider designs. Notably, EOHEAA outperforms several existing designs by achieving the lowest RMSE (32.21), minimum EDmax (71), and the highest accuracy-to-efficiency balance. ASIC-oriented design flow evaluation is further performed using Cadence Genus with predictive standard-cell libraries to analyze area, power, and timing behavior under advanced technology assumptions. To validate its real-world applicability, EOHEAA has been employed in edge detection and in color quantization using K-means clustering, both of which demonstrate high-quality outputs under relaxed accuracy constraints. Furthermore, a lightweight CNN-based validation framework is employed to examine the impact of approximate arithmetic on learning-based workloads, demonstrating that EOHEAA preserves inference accuracy while offering tangible energy and performance benefits. These results collectively position EOHEAA as a strong candidate for next-generation approximate arithmetic units in energy-aware image processing and machine-learning accelerators.
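The error metrics named above can be computed exhaustively for small adders. The sketch below does so for an 8-bit lower-part OR adder (LOA), a well-known approximate adder used here purely as a stand-in: it is not the EOHEAA design, whose structure the abstract does not give. The metric definitions follow common usage (NMED normalised by the largest exact sum, MRED skipping the zero-sum case), which is an assumption about how the paper defines them.

```python
from itertools import product
from math import sqrt


def loa_add(a, b, approx_bits=3):
    """Lower-part OR adder: OR the low bits, add the high bits exactly.

    The carry into the exact upper part is the AND of the top bits of the two low parts.
    """
    mask = (1 << approx_bits) - 1
    low = (a | b) & mask
    carry = (a >> (approx_bits - 1)) & (b >> (approx_bits - 1)) & 1
    high = (a >> approx_bits) + (b >> approx_bits) + carry
    return (high << approx_bits) | low


def error_metrics(adder, width=8):
    """Exhaustively evaluate an approximate adder over all `width`-bit input pairs."""
    max_exact = 2 * ((1 << width) - 1)  # largest exact sum, used to normalise MED
    distances, relative = [], []
    for a, b in product(range(1 << width), repeat=2):
        exact = a + b
        ed = abs(adder(a, b) - exact)  # error distance for this input pair
        distances.append(ed)
        if exact:  # relative error is undefined when the exact sum is zero
            relative.append(ed / exact)
    med = sum(distances) / len(distances)
    return {
        "MED": med,
        "MRED": sum(relative) / len(relative),
        "NMED": med / max_exact,
        "RMSE": sqrt(sum(d * d for d in distances) / len(distances)),
        "EDmax": max(distances),
    }


metrics = error_metrics(loa_add)
```

For this stand-in with three approximate bits, the worst-case error distance is 4, reached for example at 4 + 4, where the adder returns 12. The figures quoted in the abstract (RMSE 32.21, EDmax 71) belong to EOHEAA under the authors' configuration and are not comparable with this toy setting.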
Integration-The VLSI Journal, vol. 108, Article 102660 (2026-05-01). Citations: 0
Clock tree synthesis in modern VLSI: From foundational algorithms to AI-driven optimization
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2026-01-14 | DOI: 10.1016/j.vlsi.2026.102665
Manikandan B, Karthikumar S
As digital architectures scale to unprecedented complexity, driven by emerging domains such as artificial intelligence, edge inference, and ultra-low-power systems, the strategic orchestration of clock signal delivery has become a cornerstone of integrated circuit design. Clock Tree Synthesis (CTS), a critical stage in the physical design flow, plays a vital role in maintaining synchronous operation, balancing timing constraints, managing dynamic and leakage power, and ensuring signal integrity under aggressive scaling and variable workloads. This review systematically dissects the evolution of CTS, beginning with classical methodologies centered on recursive trees, buffer insertion, and delay balancing, before exploring advanced solutions tailored for variability tolerance, power optimization, and architectural irregularity. It then covers the growing influence of machine learning and data-driven models, which replace rigid rule sets with adaptive, layout-aware strategies and offer predictive insights and multi-objective optimization throughout the design process. The study also examines specialized use cases in security-conscious designs, aging-resilient circuits, photonic interconnects, and neuromorphic platforms, each demanding unique timing models and synthesis heuristics. The discussion culminates in a reflection on prevailing gaps, including the need for transparent ML integration, benchmark standardization, and holistic frameworks that bridge logical design with physical realization. This work offers a comprehensive perspective on the shifting paradigm of CTS, illuminating its central role in shaping high-performance, energy-efficient, and scalable silicon systems.
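As one example of the classical recursive-tree style of CTS, the sketch below follows the idea behind the method of means and medians: split the clock sinks at the median along alternating axes and place each tapping point at the centroid of its sinks. The abstract does not name specific algorithms, so this choice, the sink coordinates, and the function names are assumptions for illustration. Buffer insertion and delay balancing are left out.

```python
def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(points), sum(ys) / len(points))


def build_clock_tree(sinks, split_on_x=True):
    """Recursively bisect sinks at the median, alternating x and y cuts.

    Returns a nested dict {"tap": (x, y), "children": [...]}; leaves have no children.
    """
    if len(sinks) == 1:
        return {"tap": sinks[0], "children": []}
    axis = 0 if split_on_x else 1
    ordered = sorted(sinks, key=lambda p: p[axis])
    mid = len(ordered) // 2
    halves = (ordered[:mid], ordered[mid:])
    return {
        "tap": centroid(sinks),  # tapping point for this subtree
        "children": [build_clock_tree(half, not split_on_x) for half in halves],
    }


def depth_range(node, depth=0):
    """Return (shallowest, deepest) leaf depth below `node`."""
    if not node["children"]:
        return depth, depth
    ranges = [depth_range(child, depth + 1) for child in node["children"]]
    return min(r[0] for r in ranges), max(r[1] for r in ranges)


# Hypothetical sink locations (arbitrary layout units).
sinks = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 1), (1, 3), (3, 3), (3, 1)]
tree = build_clock_tree(sinks)
levels = depth_range(tree)
```

With eight sinks every leaf ends up at depth 3, so the topology is balanced in the number of branching levels. That is not the same as balanced delay: wire lengths still differ between branches, which is the gap that later delay-balancing and buffering techniques address.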
Integration-The VLSI Journal, vol. 108, Article 102665 (2026-05-01). Citations: 0
Low-power modified basic and feedback-bias common-gate transimpedance amplifiers with a novel bandwidth enhancement technique
IF 2.5 | Zone 3 (Engineering Technology) | Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-05-01 | Epub Date: 2025-12-24 | DOI: 10.1016/j.vlsi.2025.102641
Arash Hosseini, Shahram Mohammadnejad, Mohammad Azim Karami
This article presents two modified low-power common-gate (CG) transimpedance amplifier (TIA) topologies, basic and feedback-bias, which use a novel inductor-less current-reuse feedforward technique to extend the 3 dB bandwidth (BW) and relax critical trade-offs in CG-based TIAs. The topologies incorporate custom-designed biasing circuits to reduce performance variations across process and temperature. To assess the proposed topologies, mathematical and simulation analyses of both real-pole (with and without a zero) and complex-conjugate-pole conditions, along with noise analysis, have been conducted for the modified and conventional structures. The proposed technique creates or adjusts a left-half-plane zero through the feedforward path. By neutralizing the dominant pole effect and generating a peaking property, it enhances the circuit's bandwidth without reducing gain or increasing power consumption. In real-pole conditions, the circuit's bandwidth increases by about 1.5 to 2 times, while the input-referred noise is reduced by more than about 2 times. In the complex-conjugate-pole state (specifically in the feedback-bias CG), the proposed technique slows the bandwidth reduction caused by an increase in the input capacitance Cin (a 40% improvement when Cin changes from 1.5 pF to 2.1 pF). Furthermore, the power consumption decreases by about 2.4 times compared with the conventional feedback-bias topology. The topologies are validated at various process corners (TT, SS, FF) and at different temperatures. In the worst cases, the BW variations of the modified basic and feedback-bias topologies decrease by 42% and 16%, respectively. Additionally, Monte Carlo and post-layout analyses of the proposed topologies are conducted in standard 0.18 μm CMOS technology.
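The effect of a left-half-plane zero on bandwidth can be illustrated with a generic two-real-pole model, H(s) = (1 + s/ω_z) / ((1 + s/ω_p1)(1 + s/ω_p2)). This is a textbook simplification rather than the transfer function derived in the paper, and the pole and zero frequencies below are hypothetical. Setting |H|² = 1/2 with x = f² gives the quadratic A·x² + B·x − 1 = 0, where A = 1/(f_p1² f_p2²) and B = 1/f_p1² + 1/f_p2² − 2/f_z², which has exactly one positive root.

```python
from math import inf, sqrt


def bandwidth_3db(fp1, fp2, fz=inf):
    """-3 dB frequency of a unity-DC-gain response with two real poles and one LHP zero.

    All frequencies share one unit (GHz in the example); fz=inf means "no zero".
    """
    a = 1.0 / (fp1**2 * fp2**2)
    b = 1.0 / fp1**2 + 1.0 / fp2**2 - 2.0 / fz**2
    x = (-b + sqrt(b * b + 4.0 * a)) / (2.0 * a)  # the single positive root, x = f^2
    return sqrt(x)


# Hypothetical values: dominant pole at 1 GHz, second pole at 4 GHz.
bw_plain = bandwidth_3db(1.0, 4.0)             # no zero: just under 1 GHz
bw_partial = bandwidth_3db(1.0, 4.0, fz=2.0)   # zero above the dominant pole
bw_cancel = bandwidth_3db(1.0, 4.0, fz=1.0)    # zero placed exactly on the dominant pole
extension = bw_cancel / bw_plain
```

When the zero sits exactly on the dominant pole the two cancel and the bandwidth becomes that of the second pole alone (4 GHz here). Moving the zero below the dominant pole would push the -3 dB point further out at the cost of peaking in the passband, which is the trade-off behind the "peaking property" mentioned in the abstract.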
Integration-The Vlsi Journal, vol. 108, Article 102641.
Citations: 0
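The bandwidth-extension idea in the abstract above (a left-half-plane zero that offsets the dominant pole) can be illustrated with a generic transfer function that has two real poles and one optional zero. This is a minimal sketch, not the authors' circuit model: the function names and the normalized corner frequencies are assumptions chosen only for illustration.

```python
import math

def magnitude(w, p1, p2, zero=None):
    """|H(jw)| for H(s) = (1 + s/zero) / ((1 + s/p1)(1 + s/p2)), unity DC gain."""
    mag = 1.0 / (math.hypot(1.0, w / p1) * math.hypot(1.0, w / p2))
    if zero is not None:
        mag *= math.hypot(1.0, w / zero)
    return mag

def bandwidth_3db(p1, p2, zero=None):
    """Return the -3 dB frequency of H(s) in the same unit as p1, p2, zero.

    All corner frequencies are positive (left-half-plane singularities).
    Setting |H(jw)|^2 = 1/2 with x = w^2 gives
        b*c*x^2 + (b + c - 2a)*x - 1 = 0,
    where a = 1/zero^2, b = 1/p1^2, c = 1/p2^2. The product of the roots is
    negative, so there is exactly one positive root.
    """
    a = 0.0 if zero is None else 1.0 / zero ** 2
    b = 1.0 / p1 ** 2
    c = 1.0 / p2 ** 2
    k = b + c - 2.0 * a
    x = (-k + math.sqrt(k * k + 4.0 * b * c)) / (2.0 * b * c)
    return math.sqrt(x)

if __name__ == "__main__":
    # Hypothetical normalized values: dominant pole at 1, second pole at 4.
    base = bandwidth_3db(1.0, 4.0)
    extended = bandwidth_3db(1.0, 4.0, zero=1.5)
    print(f"no zero: {base:.3f}, with zero: {extended:.3f}, "
          f"ratio: {extended / base:.2f}")
```

When the zero sits exactly on the dominant pole, the two cancel and the bandwidth becomes the second pole frequency. A zero placed slightly above the dominant pole gives partial compensation, and a zero below it adds peaking.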
Algorithm-hardware co-design of binary neural network for efficient super resolution on FPGA
IF 2.5 Zone 3 Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2026-05-01 Epub Date: 2026-02-03 DOI: 10.1016/j.vlsi.2026.102674
Yuanxin Su, Yihong Wang, Yushan Pan, Nan Xiang, Zhijie Xu, Jeremy Smith, Xinfei Guo
Modern super-resolution algorithms have achieved great success. However, the high computational complexity and large memory footprint of these models hinder their deployment on resource-constrained devices. To address this issue, we propose a novel Binary Neural Network (BNN) architecture specifically designed for Single Image Super Resolution (SISR) tasks. Our approach introduces two key innovations: Dual-Scaling Binary Convolution (DS-BConv) and Dynamic Gradient Attention Binarizing Activation (DGABA). The DS-BConv module employs a dual-phase scaling mechanism to preserve information flow during binarization, while the DGABA function introduces a dynamic gradient attention mechanism that optimizes gradient flow during training. Together, these components enable our BNN to achieve competitive performance on SISR benchmarks while operating within the constraints of binary representations. In addition, we design a highly efficient hardware accelerator architecture for our BNN, leveraging the unique properties of DS-BConv and DGABA to maximize throughput and minimize latency. The accelerator is designed to efficiently handle the computational demands of SISR tasks while maintaining low power consumption, making it suitable for deployment on resource-constrained devices such as mobile phones and embedded systems.
Integration-The Vlsi Journal, vol. 108, Article 102674.
Citations: 0
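The abstract above does not spell out how DS-BConv scales its binary convolutions, so the following is only a generic sketch of scaled binarization in the style commonly used for BNNs. Values are reduced to their signs plus one scale factor (the mean absolute value), and the dot product is computed with XNOR and popcount, which is the operation a binary accelerator typically maps to logic. All function names and the choice of scale factor are assumptions, not the paper's method.

```python
def binarize(values):
    """Pack the signs of `values` into an int bit mask and return (bits, alpha).

    Bit i is 1 when values[i] >= 0 (interpreted as +1) and 0 otherwise (-1).
    alpha is the mean absolute value, used here as a single scale factor.
    """
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    alpha = sum(abs(v) for v in values) / len(values)
    return bits, alpha

def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n vectors of +1/-1 stored as bit masks.

    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, so dot = matches - (n - matches) = 2*matches - n.
    """
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n

def scaled_binary_dot(weights, activations):
    """Approximate sum(w*a) as alpha_w * alpha_a * (binary dot product)."""
    w_bits, w_alpha = binarize(weights)
    a_bits, a_alpha = binarize(activations)
    return w_alpha * a_alpha * xnor_popcount_dot(w_bits, a_bits, len(weights))

if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, -0.75]
    a = [1.0, -2.0, 3.0, -4.0]
    exact = sum(wi * ai for wi, ai in zip(w, a))
    print("exact:", exact, "binary approximation:", scaled_binary_dot(w, a))
```

A per-tensor scale is the simplest choice. Per-channel or multi-stage scaling, as the name DS-BConv suggests, would replace the single `alpha` with a richer set of factors.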
Ring oscillator-based on-chip aging sensor for recycled IC detection
IF 2.5 Zone 3 Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2026-05-01 Epub Date: 2026-02-08 DOI: 10.1016/j.vlsi.2026.102677
S.S. Rekha, K. Sudeendra Kumar
Counterfeit chips are a menace in the semiconductor supply chain, hurting the revenue and reputation of the original chip makers. Recycled ICs extracted from obsolete electronic equipment make up the major part of counterfeit chips. Recycled counterfeit chips degrade the reliability of critical systems, so there is a need to reduce the number of recycled chips entering the supply chain. Including aging sensors in chips to identify counterfeits is one of the most preferred and well-researched approaches. Accurately predicting a chip's age is challenging, and Ring Oscillator (RO)-based circuits are commonly used in aging sensors. We propose a modified NOR-based inverter RO aging sensor, which accurately detects recycled chips. Compared to existing RO-based aging sensor techniques, the proposed sensor has a relatively low probability of mispredicting an aged chip as a new chip and vice versa. The aging prediction accuracy of the proposed NOR-based inverter RO aging sensor is higher for chips that have undergone accelerated aging due to high temperatures during operation. In this work, we integrate the RSA (Rivest Shamir Adleman) cryptographic block presented in the SSTF (Secure Split Test with Functional testing) to make the proposed aging sensor secure against tampering attacks. We validate the proposed aging sensor in Cadence Relxpert using the GPDK 45 nm library and present the simulation and analytical results.
Integration-The Vlsi Journal, vol. 108, Article 102677.
Citations: 0
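The abstract above does not describe the sensor's decision logic, so the sketch below shows only the general principle that RO-based aging sensors commonly rely on. A ring oscillator that has been stressed during normal operation slows down, and its cycle count over a fixed window is compared with that of a reference oscillator that has seen little stress. The stressed/reference pairing, the function names and the 2 % threshold are all assumptions for illustration, not the paper's NOR-based design.

```python
def relative_degradation(reference_count, stressed_count):
    """Fractional slowdown of the stressed RO relative to the reference RO.

    Both arguments are cycle counts captured over the same measurement window,
    so their ratio equals the ratio of the two oscillation frequencies.
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return (reference_count - stressed_count) / reference_count

def looks_recycled(reference_count, stressed_count, threshold=0.02):
    """Flag the chip as previously used when the slowdown exceeds `threshold`.

    The default threshold is a hypothetical value; a real sensor would choose
    it from process-variation and measurement-noise data.
    """
    return relative_degradation(reference_count, stressed_count) > threshold

if __name__ == "__main__":
    print(looks_recycled(10000, 9990))  # slowdown of 0.1 %
    print(looks_recycled(10000, 9500))  # slowdown of 5 %
```

The threshold sets the trade-off between the two misprediction cases the abstract mentions: a low value flags new chips as aged, and a high value lets lightly used chips pass as new.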
A 180-nm CMOS fully digital chaotic Lorenz system
IF 2.5 Zone 3 Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date: 2026-05-01 Epub Date: 2026-01-15 DOI: 10.1016/j.vlsi.2026.102667
Alejandro Silva-Juarez, Sergio A. Rosales-Nunez, Luis C. Alvarez-Simon, Gregorio Zamora-Mejia, Julio Hernandez-Perez, Victor H. Carbajal-Gomez, Jose M. Rocha-Perez, Alejandro I. Bautista-Castillo
This paper presents the design, Application-Specific Integrated Circuit (ASIC) fabrication, and experimental characterization of a fully digital Lorenz chaotic system in 180 nm CMOS technology. Motivated by the growing demand for high-performance chaotic generators in secure communications, the design originates from a rigorous mathematical formulation, verifying its chaotic behavior via the Kaplan–Yorke dimension estimate. The architecture is based on the forward Euler method, implemented with a 32-bit fixed-point datapath, a control finite-state machine, and integrated 32-to-1 serializers. We detail the complete digital design flow, from RTL to GDSII, using a unified synthesis and place-and-route methodology with the Synopsys Fusion Compiler toolchain. For characterization, the serializers stream the state variables x(n), y(n), and z(n) to an external FPGA, enabling subsequent transmission to a PC via a UART interface. The integrated circuit was fabricated and tested; experimental measurements successfully validated the chaotic behavior predicted by post-layout simulations. The final design occupies a cell area of 0.873 mm² and exhibits an estimated static power of 1.02 mW. Post-layout timing reports indicate that the design meets its target frequencies, achieving a maximum operating frequency above 57 MHz (with a positive slack of +2.61 ns on the 50 MHz clock). The ASIC implementation demonstrates significant advantages in power, density, and speed potential over FPGA-based alternatives, positioning this design as a robust solution for embedded security systems.
Integration-The Vlsi Journal, vol. 108, Article 102667.
Citations: 0