
Integration-The Vlsi Journal: latest articles

High speed and high performance approximate multipliers for error resilient applications
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-05-01 · Epub Date: 2026-02-04 · DOI: 10.1016/j.vlsi.2026.102678
Shareefa Fairoose P., Ashutosh Mishra
Approximate computing has become a promising approach for error-tolerant applications, providing significant enhancements in area, power, and delay by easing stringent accuracy constraints. This paper presents three novel high-performance approximate multiplier (AM) architectures that employ optimized approximate 4–2 compressors and a pre-computation scheme for the least significant bits (LSBs) of the final product to minimize critical path delay. The proposed multipliers are developed for both signed and unsigned arithmetic and target applications where energy efficiency and speed are prioritized over exact precision.
The underlying approximate compressors are derived by introducing controlled inaccuracies only under rare input conditions and by applying Karnaugh map-based logic reduction, thereby achieving a favorable trade-off between hardware cost and error. These compressors, implemented using AOI, OAI, NAND, and NOR logic primitives, achieve a 55%–72% reduction in area-delay product (ADP), a 52%–67% reduction in power-delay product (PDP), and a 53%–80% reduction in power-area-delay product (PADP) compared to their exact counterparts. The proposed multipliers likewise significantly reduce ADP, PDP, and PADP compared to the state of the art.
All proposed multiplier architectures are synthesized using Cadence® Genus with a 90 nm CMOS library and evaluated using standard design metrics. Monte Carlo simulations confirm low error rates and high computational reliability. The unsigned multipliers are applied to image blending tasks, yielding favorable visual results, while the signed variants are employed in neural network applications, achieving inference accuracy of 96%–98% with enhanced speed and energy efficiency.
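The paper's specific compressor designs are not reproduced above, but the underlying idea can be illustrated. The sketch below characterizes a classic approximate 4-2 compressor from the literature (OR gates replacing part of the exact logic), used here only as a hypothetical stand-in for the proposed designs; it enumerates all 16 input patterns to show the "errors only in rare input conditions" trade-off.

```python
from itertools import product

def exact_compressor(a, b, c, d):
    """Exact 4-2 compressor value (carry-in ignored): the bit count
    a+b+c+d in {0..4}, normally encoded as 2*(carry+cout) + sum."""
    return a + b + c + d

def approx_compressor(a, b, c, d):
    """A well-known approximate 4-2 compressor (hypothetical stand-in
    for the paper's designs): simplified weight-1 logic using OR."""
    carry = (a & b) | (c & d)      # weight-2 output
    s = (a ^ b) | (c ^ d)          # weight-1 output (approximate)
    return 2 * carry + s

# Exhaustively characterize the error over all 16 input patterns.
errors = [exact_compressor(*bits) - approx_compressor(*bits)
          for bits in product((0, 1), repeat=4)]
error_cases = sum(e != 0 for e in errors)
print(f"error rate = {error_cases}/16, max |error| = {max(map(abs, errors))}")
```

For this particular design the output deviates in only 5 of 16 input patterns, which is the kind of rare-case inaccuracy the abstract describes.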
Citations: 0
A heuristic approach for near Pareto-optimal design space exploration in Approximate High-Level Synthesis
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-05-01 · Epub Date: 2026-01-02 · DOI: 10.1016/j.vlsi.2025.102638
Tiago Almeida , Isaías Felzmann , Lucas Wanner
Approximate Computing can optimize resource usage in HLS (High-Level Synthesis) by mapping certain operations to components that have lower resource utilization but introduce small errors into application outputs. A fundamental challenge is identifying a set of approximate components for implementing operators in an accelerator design while achieving optimal resource utilization and accuracy. We introduce an input-aware heuristic approach that uses application inputs to model output errors more effectively. In this approach, operators in accelerators, such as adders and multipliers, are mapped to a library of precharacterized approximate components. Applications are executed with a set of training inputs, and candidate solutions are selected based on a metric that combines output errors and estimated resource utilization. The results demonstrate that the approach can find appropriate approximate designs for a given error threshold. For image processing applications, the input-aware heuristic reduced LUT and FF usage by up to 55% for less than 25% output degradation. Similar savings were shown for a CNN model with less than 0.8% accuracy degradation.
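The heuristic itself is not detailed in the abstract, but the "near Pareto-optimal" selection it targets can be sketched: given candidate designs scored on output error and resource cost, keep only the non-dominated ones. The candidate values below are hypothetical, not taken from the paper.

```python
def pareto_front(candidates):
    """Return the non-dominated (error, cost) points: a candidate is
    kept unless some other candidate is at least as good in both
    objectives and strictly better in one. A minimal sketch of the
    selection step, not the paper's input-aware heuristic."""
    front = []
    for e, c in candidates:
        dominated = any(e2 <= e and c2 <= c and (e2 < e or c2 < c)
                        for e2, c2 in candidates)
        if not dominated:
            front.append((e, c))
    return sorted(front)

# Hypothetical (output error, resource cost) pairs for candidate designs.
designs = [(0.00, 100), (0.05, 80), (0.05, 90), (0.10, 60), (0.20, 70)]
print(pareto_front(designs))   # [(0.0, 100), (0.05, 80), (0.1, 60)]
```

In the paper's flow the error coordinate would come from executing the application on training inputs rather than from a static model.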
Citations: 0
An innovative HLS framework for all network architectures: From Python to SoC
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-03-01 · Epub Date: 2025-12-13 · DOI: 10.1016/j.vlsi.2025.102626
Thi Diem Tran , Minh Tan Ha , Xuan Thao Tran , Ngoc Quoc Tran , Vu Trung Duong Le , Hoai Luan Pham , Van Tinh Nguyen
Deep Neural Networks (DNNs) have achieved remarkable success in diverse applications such as image classification, signal processing, and video analysis. Despite their effectiveness, these models require substantial computational resources, making FPGA-based hardware acceleration a critical enabler for real-time deployment. However, current methods for mapping DNNs to hardware have experienced limited adoption, mainly because software developers often lack the specialized hardware expertise needed for efficient implementation. High-Level Synthesis (HLS) tools were introduced to bridge this gap, but they typically confine designs to fixed platforms and simple network structures. Most existing tools support only standard architectures like VGG or ResNet with predefined parameters, offering little flexibility for customization and restricting deployment to specific FPGA devices. To address these limitations, we introduce Py2C, an automated framework that converts AI models from Python to C. Py2C supports a wide range of DNN architectures, from basic convolutional and pooling layers with variable window sizes to advanced models such as VGG, ResNet, InceptionNet, ShuffleNet, NambaNet, and YOLO. Integrated with Xilinx’s Vitis HLS, Py2C forms the Py2RTL flow, enabling register-transfer level (RTL) generation with custom-precision arithmetic and cross-platform verification. Validated on multiple networks, Py2C has demonstrated superior hardware efficiency and power reduction, particularly in QRS detection for ECG signals. By streamlining the AI-to-RTL conversion process, Py2C makes FPGA-based AI deployment both high-performance and accessible.
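The "custom-precision arithmetic" the Py2RTL flow generates typically means replacing floats with fixed-point values of a chosen width. The sketch below shows a generic signed Q-format round trip; the specific format (Q4.12 here) and helper names are assumptions for illustration, not Py2C's actual scheme.

```python
def to_fixed(x, frac_bits=12):
    """Quantize a float to a signed fixed-point integer with the given
    number of fractional bits (generic Q-format; the widths are an
    illustrative assumption, not taken from the paper)."""
    return round(x * (1 << frac_bits))

def to_float(q, frac_bits=12):
    """Recover the real value represented by a fixed-point integer."""
    return q / (1 << frac_bits)

weights = [0.7071, -0.3333, 1.5, -0.0001]
roundtrip = [to_float(to_fixed(w)) for w in weights]
max_err = max(abs(w - r) for w, r in zip(weights, roundtrip))
print(f"max round-trip error = {max_err:.2e}")  # bounded by half an LSB
```

With 12 fractional bits the round-trip error is bounded by 2^-13, which is the kind of precision/hardware-cost knob an HLS flow exposes per layer.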
Citations: 0
Low power post quantum cryptography on reconfigurable devices for cyber physical system in IoT
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-03-01 · Epub Date: 2025-12-22 · DOI: 10.1016/j.vlsi.2025.102636
Ankita Sarkar, Mansi Jhamb
The rapid increase in health monitoring data in the digital era necessitates advanced security measures to protect sensitive medical information. While current encryption methods are effective, the looming threat of quantum computing, capable of breaking existing cryptographic protocols, demands a shift towards more resilient solutions. This paper introduces a novel security framework that integrates post-quantum cryptography with modern encryption to ensure robust protection. The proposed framework achieves a high information entropy of 7.99, indicating strong security through unpredictability. It is designed for efficiency, with an average execution time of just 0.2856 μs, ensuring quick data processing without compromising security. Additionally, the framework operates with minimal power consumption, requiring only 1.4 mA, making it suitable for IoT-based medical systems where resource efficiency is critical. This approach not only secures current health monitoring scenarios but also prepares them for future quantum threats, offering a comprehensive, efficient, and forward-looking solution to protect sensitive medical data in an increasingly interconnected world.
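The 7.99-bit entropy figure refers to Shannon entropy per byte, where 8.0 is the theoretical ceiling for a uniformly random byte stream. A minimal sketch of how such a figure is measured (the ciphertexts below are illustrative, not from the paper):

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; values near 8.0 indicate the
    unpredictability that the reported 7.99 figure quantifies."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A perfectly uniform byte stream reaches the 8-bit ceiling; ciphertext
# from a strong cipher should come close to it.
uniform = bytes(range(256))
print(byte_entropy(uniform))      # 8.0
print(byte_entropy(b"aaaabbbb"))  # 1.0 (two equally likely symbols)
```

Low-entropy output (such as the 1.0-bit example) would indicate structure an attacker could exploit.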
Citations: 0
Hessian-driven N:M sparsity and quantization co-optimization for edge device deployment
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-03-01 · Epub Date: 2025-12-11 · DOI: 10.1016/j.vlsi.2025.102629
Minghao Tang , Ming Ling , Minhua Ren , Zhihua Cai , Zhen Liu , Shidi Tang , Jianjun Li
To reduce the computational demands of neural networks, pruning and quantization are commonly employed to obtain lightweight models. These two approaches are typically viewed as orthogonal. This perception is limited, however, and their intrinsic connection merits further exploration. Consequently, a heuristic algorithm, referred to as Hessian-Cooptimized Sparsity-Quantization (HCSQ), is proposed. This is the first algorithm to unify the intrinsic connection between quantization and semi-structured pruning through second-order Hessian information. The algorithm introduces a notion of sensitivity derived from Hessian information, fine-tunes layer-level sensitivity by adjusting the N:M sparsity ratio within layers, and maximizes the utilization of the quantization bit width. Three lightweight models (ResNet20, ResNet18 and MobileNetV2) are evaluated on four datasets (ImageNet, Tiny-ImageNet, CIFAR-10 and CIFAR-100), reaching maximum compression ratios ranging from 14.96× to 28.58× without reducing original accuracy (<1% loss), surpassing state-of-the-art performance under comparable accuracy loss. Furthermore, ablation experiments are conducted on an open-source processor. In some layers the algorithm achieves an acceleration of up to 4.79×, and the entire model's inference cycle time is reduced to 45% of the ablation baseline. This demonstrates that the efficacy of the proposed algorithm extends beyond mere model compression; it also enhances hardware utilization when specific hardware designs are targeted.
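N:M semi-structured sparsity means that within every group of M consecutive weights, at most N are non-zero (hardware such as sparse tensor cores can exploit the 2:4 case). The sketch below applies 2:4 sparsity by plain weight magnitude; the paper's Hessian-based sensitivity scoring is not modeled here.

```python
def nm_sparsify(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude weights in each group
    of m (2:4 semi-structured sparsity). Magnitude ranking stands in
    for the paper's Hessian-driven sensitivity, which is not modeled."""
    out = list(weights)
    for g in range(0, len(out), m):
        group = out[g:g + m]
        keep = sorted(range(len(group)), key=lambda i: abs(group[i]),
                      reverse=True)[:n]
        for i in range(len(group)):
            if i not in keep:
                out[g + i] = 0.0
    return out

w = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
print(nm_sparsify(w))  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Adjusting the N:M ratio per layer, as the abstract describes, trades each layer's sparsity against its measured sensitivity.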
Citations: 0
An area-efficient 1st order noise shaping SAR using C-2C ladder DAC for biomedical applications
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-03-01 · Epub Date: 2025-11-19 · DOI: 10.1016/j.vlsi.2025.102598
Mauricio Velázquez Díaz , Victor R. Gonzalez-Diaz , Gisela De La Fuente-Cortes , Guillermo Espinosa Flores-Verdad , Roberto S. Murphy-Arteaga
This article presents the design and implementation of a fully differential Successive Approximation Register (SAR) analog-to-digital converter (ADC) in 65 nm UMC technology, specifically targeting biomedical applications where area efficiency is a critical requirement. The design prioritizes achieving clean and precise first-order Noise Shaping (NS) by integrating a switched-capacitor-based integrator with our proposed C-2C ladder DAC topology, which is instrumental in significantly reducing area consumption. Noise performance is optimized by carefully correlating the capacitances of the integrator and DAC, ensuring precision and stability. To achieve robust operation, the design incorporates a process, voltage, and temperature (PVT)-resilient methodology for all system blocks, providing consistent performance and reliability under challenging conditions and variations in fabrication. The implemented prototype achieves an area of 0.058 mm², 10.37 ENOB over a 20 kHz bandwidth, and operates at a 1 MHz sampling rate with a power consumption of 448 μW.
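First-order noise shaping gives the quantization error a (1 - z^-1) transfer function, pushing it out of the signal band and raising in-band ENOB. A behavioral error-feedback model makes the mechanism concrete; the 1-bit quantizer below is only an illustrative stand-in for the paper's SAR with C-2C DAC.

```python
def first_order_ns(x_samples):
    """Behavioral first-order noise shaping via error feedback: the
    previous quantization error is added back before the quantizer,
    yielding a noise transfer function of (1 - z^-1). A sketch only;
    the actual converter quantizes with a SAR and C-2C ladder DAC."""
    e = 0.0
    out = []
    for x in x_samples:
        v = x + e                    # fold in the previous error
        y = 1.0 if v >= 0 else -1.0  # coarse quantizer
        e = v - y                    # error carried to the next sample
        out.append(y)
    return out

# Shaped error has no content at DC, so a DC input is reproduced
# exactly on average despite the coarse quantizer.
ys = first_order_ns([0.3] * 10000)
print(sum(ys) / len(ys))
```

The residual error telescopes, so the output mean converges to the input to within one bounded error term over the record length.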
Citations: 0
Enhanced fault detection in digital VLSI circuits using convolutional autoencoders
IF 2.5 · CAS Tier 3 (Engineering & Technology) · Q3 (Computer Science, Hardware & Architecture) · Pub Date: 2026-03-01 · Epub Date: 2025-11-25 · DOI: 10.1016/j.vlsi.2025.102608
Chandrasekhar Savalam , Sanjay Medisetti , Prasanti Korapati
As Very Large-Scale Integration (VLSI) technology advances, the demand for reliable and scalable pre-silicon fault detection (FD) techniques continues to grow. Conventional diagnostic methods often face limitations in identifying subtle stuck-at faults within complex and high-dimensional test data. This study proposes a deep learning-based fault detection framework that integrates unsupervised and supervised learning to enhance fault identification and classification in combinational circuits. A Convolutional Autoencoder (CAE) is employed to extract spatial and structural features from circuit test patterns, effectively reducing dimensionality while preserving fault-related information. The encoded features are then classified using a Random Forest model for precise fault localization. The proposed framework is validated on ISCAS’85 benchmark circuits of different sizes and complexities, achieving fault detection accuracies ranging from 93% to 100%. Notably, compared to existing models such as SSAE, VAE, and CEAE, which recorded accuracies between 83% and 98%, the proposed CAE-Random Forest framework consistently outperformed them across all benchmarks. Furthermore, the model exhibited stable convergence, low reconstruction error, and efficient memory usage of about 380–403 MB, ensuring reliable and scalable performance. Overall, these results demonstrate that the framework offers a robust, high-accuracy, and resource-efficient solution for automatic fault detection in digital VLSI circuits. It can also be effectively extended to more complex architectures for improved diagnostic reliability.
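The stuck-at model underlying this work can be shown on a toy netlist: a fault forces one internal net to a constant, and a fault is detectable if some test pattern makes the faulty circuit's output differ from the good one. This sketch illustrates only the fault model, not the paper's CAE/Random Forest pipeline.

```python
from itertools import product

def circuit(a, b, c, stuck=None):
    """A toy combinational netlist: n1 = a AND b, out = n1 XOR c.
    `stuck` optionally forces the internal net n1 to a constant
    (the stuck-at fault model used in the paper's test data)."""
    n1 = a & b
    if stuck is not None:
        net, value = stuck
        if net == "n1":
            n1 = value
    return n1 ^ c

def detects(fault):
    """A fault is detectable if some input pattern flips the output."""
    return any(circuit(a, b, c) != circuit(a, b, c, stuck=fault)
               for a, b, c in product((0, 1), repeat=3))

print(detects(("n1", 0)))  # True: a=b=1 exposes n1 stuck-at-0
print(detects(("n1", 1)))  # True: a=0 exposes n1 stuck-at-1
```

The framework in the paper learns to recognize such output discrepancies from high-dimensional test-pattern data instead of enumerating patterns explicitly.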
Citations: 0
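The stuck-at fault model this framework targets can be illustrated without any learning machinery: simulate a small combinational circuit with and without an injected fault, and collect the test patterns whose outputs expose the difference. The sketch below uses the ISCAS'85 c17 benchmark (six NAND gates); the helper names (`c17`, `detecting_patterns`) are illustrative, not from the paper, and this shows only the underlying fault model, not the CAE/Random-Forest pipeline itself.

```python
from itertools import product

def nand(a, b):
    return 1 - (a & b)

def c17(n1, n2, n3, n6, n7, fault=None):
    """Evaluate the ISCAS'85 c17 circuit (outputs n22, n23).

    `fault` is an optional (net_name, stuck_value) pair injecting a
    stuck-at fault on that net.
    """
    nets = {"n1": n1, "n2": n2, "n3": n3, "n6": n6, "n7": n7}
    def sa(name, value):
        # Override the computed value if this net carries the stuck-at fault.
        if fault and fault[0] == name:
            return fault[1]
        return value
    for k in list(nets):                      # faults on primary inputs
        nets[k] = sa(k, nets[k])
    nets["n10"] = sa("n10", nand(nets["n1"], nets["n3"]))
    nets["n11"] = sa("n11", nand(nets["n3"], nets["n6"]))
    nets["n16"] = sa("n16", nand(nets["n2"], nets["n11"]))
    nets["n19"] = sa("n19", nand(nets["n11"], nets["n7"]))
    nets["n22"] = sa("n22", nand(nets["n10"], nets["n16"]))
    nets["n23"] = sa("n23", nand(nets["n16"], nets["n19"]))
    return nets["n22"], nets["n23"]

def detecting_patterns(fault):
    """All input patterns whose outputs differ from the fault-free circuit."""
    return [p for p in product((0, 1), repeat=5)
            if c17(*p) != c17(*p, fault=fault)]

# A stuck-at-0 fault on internal net n16 is exposed by many of the 32 patterns:
print(len(detecting_patterns(("n16", 0))))
```

A fault-detection data set for a learning model is then simply the mapping from (pattern, circuit response) pairs to fault labels generated by such an injector.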
An efficient open-source design and implementation framework for non-quantized CNNs on FPGAs
IF 2.5 Tier 3, Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-03-01 Epub Date : 2025-12-02 DOI: 10.1016/j.vlsi.2025.102625
Angelos Athanasiadis , Nikolaos Tampouratzis , Ioannis Papaefstathiou
The growing demand for real-time processing in artificial intelligence applications, particularly those involving Convolutional Neural Networks (CNNs), has highlighted the need for efficient computational solutions. Conventional processors and graphical processing units (GPUs) very often fall short in balancing performance, power consumption, and latency, especially in embedded systems and edge computing platforms. Field-Programmable Gate Arrays (FPGAs) offer a promising alternative, combining high performance with energy efficiency and reconfigurability. This paper presents a design and implementation framework that maps CNNs seamlessly onto FPGAs while maintaining full precision in all neural network parameters, thus addressing a niche: that of non-quantized NNs. The presented framework extends Darknet, which is very widely used for the design of CNNs, and allows the designer, by effectively using a Darknet NN description, to efficiently implement CNNs in a heterogeneous system comprising CPUs and FPGAs. Our framework is evaluated on the implementation of a number of different CNNs and as part of a real-world application utilizing UAVs; in all cases it outperforms the CPU and GPU systems in terms of performance and/or power consumption. When compared with the FPGA frameworks that support quantization, our solution offers similar performance and/or energy efficiency without any degradation in NN accuracy.
{"title":"An efficient open-source design and implementation framework for non-quantized CNNs on FPGAs","authors":"Angelos Athanasiadis ,&nbsp;Nikolaos Tampouratzis ,&nbsp;Ioannis Papaefstathiou","doi":"10.1016/j.vlsi.2025.102625","DOIUrl":"10.1016/j.vlsi.2025.102625","url":null,"abstract":"<div><div>The growing demand for real-time processing in artificial intelligence applications, particularly those involving Convolutional Neural Networks (CNNs), has highlighted the need for efficient computational solutions. Conventional processors and graphical processing units (GPUs), very often, fall short in balancing performance, power consumption, and latency, especially in embedded systems and edge computing platforms. Field-Programmable Gate Arrays (FPGAs) offer a promising alternative, combining high performance with energy efficiency and reconfigurability. This paper presents a design and implementation framework for implementing CNNs seamlessly on FPGAs that maintains full precision in all neural network parameters thus addressing a niche, that of non-quantized NNs. The presented framework extends Darknet, which is very widely used for the design of CNNs, and allows the designer, by effectively using a Darknet NN description, to efficiently implement CNNs in a heterogeneous system comprising CPUs and FPGAs. Our framework is evaluated on the implementation of a number of different CNNs and as part of a real-world application utilizing UAVs; in all cases it outperforms the CPU and GPU systems in terms of performance and/or power consumption. When compared with the FPGA frameworks that support quantization, our solution offers similar performance and/or energy efficiency without any degradation in NN accuracy.</div></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":"107 ","pages":"Article 102625"},"PeriodicalIF":2.5,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145684628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
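The core operation such a framework maps onto FPGA logic is the convolution itself; keeping the network non-quantized means every multiply-accumulate runs at full precision rather than on narrowed integer values. A minimal pure-Python sketch of a valid-padding 2D convolution (the function name `conv2d` is illustrative, not taken from the framework):

```python
def conv2d(image, kernel):
    """Naive full-precision 2D convolution with valid padding.

    `image` and `kernel` are 2D lists of numbers; every product and
    accumulation is kept at full precision, which is exactly what a
    non-quantized FPGA mapping must preserve.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0.0
            for di in range(kh):          # slide the kernel window
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            out[i][j] = acc
    return out

# 3x3 image, 2x2 identity-diagonal kernel -> 2x2 output
print(conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]]))
```

On an FPGA these four nested loops become an array of parallel multiply-accumulate units; quantized frameworks shrink each multiplier's operand width, whereas a non-quantized flow keeps the full-width datapath shown here.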
A differential in-memory computing 12T SRAM macro with enhanced flexibility and reliability for XNOR-network
IF 2.5 Tier 3, Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-03-01 Epub Date : 2025-11-20 DOI: 10.1016/j.vlsi.2025.102604
Dekai Sun , Zhang Zhang , Wenyan Liu , Hongbin Yang , Yi Lu , Yonghong Zeng , Biao Zhang , Lianjie Lu
Implementing an artificial intelligence algorithm requires extensive computation, and that computation entails heavy data movement, which consumes considerable energy and time. In-memory computing is a promising paradigm to ease this limitation. XNOR-Network is an effective acceleration technique and has been widely applied in in-memory computing SRAM macros. Current in-memory computing SRAM macros for XNOR-Network face challenges in flexibility and reliability. To overcome these challenges, this paper proposes a differential in-memory computing 12T SRAM macro for XNOR-Network. The proposed SRAM macro eliminates the issue of memory information flipping that occurs during XNOR-and-accumulate operations. Moreover, it is capable of supporting XNOR-and-accumulate operations of varying sizes. Additionally, the XNOR-and-accumulate result can be read out quickly by the sense amplifier for its sign or read out by the Flash ADC for its multi-bit quantized value. The proposed architecture achieves an energy efficiency of 98.6 TOPS/W and a recognition rate of 97.06% on the MNIST data set.
{"title":"A differential in-memory computing 12T SRAM macro with enhanced flexibility and reliability for XNOR-network","authors":"Dekai Sun ,&nbsp;Zhang Zhang ,&nbsp;Wenyan Liu ,&nbsp;Hongbin Yang ,&nbsp;Yi Lu ,&nbsp;Yonghong Zeng ,&nbsp;Biao Zhang ,&nbsp;Lianjie Lu","doi":"10.1016/j.vlsi.2025.102604","DOIUrl":"10.1016/j.vlsi.2025.102604","url":null,"abstract":"<div><div>Implementing an artificial intelligence algorithm requires a lot of calculation, but the calculation process needs a lot of data migration, which consumes a lot of energy and time. In-memory computing is a promising paradigm to ease this limitation. XNOR-Network is an effective acceleration technique and has been widely applied in in-memory computing SRAM macro. Current in-memory computing SRAM macro for XNOR-Network has challenges in flexibility and reliability. To overcome these challenges, this paper proposes a differential in-memory computing 12T SRAM macro for XNOR-Network. The proposed SRAM macro eliminates the issue of memory information flipping that occurs during XNOR-and-accumulate operations. Moreover, it is capable of supporting XNOR-and-accumulate operations of varying sizes. Additionally, the XNOR-and-accumulate result can be read out quickly by the sense amplifier for its sign or read out by the Flash ADC for its multi-bit quantized value. The proposed architecture has an energy efficiency of 98.6 TOPS/W and a recognition rate of 97.06% on the MNIST data set.</div></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":"107 ","pages":"Article 102604"},"PeriodicalIF":2.5,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145571986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
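The XNOR-and-accumulate operation at the heart of such macros replaces a +/-1 dot product with bitwise logic: encoding +1 as bit 1 and -1 as bit 0, the dot product of two n-bit words equals 2*popcount(XNOR(a, w)) - n. A small sketch of that arithmetic, assuming this standard encoding (the function name is illustrative, and this models only the computation, not the 12T bitcell circuitry):

```python
import random

def xnor_accumulate(a_bits, w_bits, n):
    """Binarized dot product via XNOR-and-popcount.

    Bit 1 encodes +1 and bit 0 encodes -1, so the +/-1 dot product of
    the two n-bit words equals 2 * popcount(XNOR(a, w)) - n.
    """
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)   # XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n

# Cross-check against the +/-1 dot product it replaces.
n = 12
a, w = random.getrandbits(n), random.getrandbits(n)
reference = sum((1 if (a >> i) & 1 else -1) * (1 if (w >> i) & 1 else -1)
                for i in range(n))
assert xnor_accumulate(a, w, n) == reference
```

The sign of this result is what a sense amplifier can resolve in one shot, while a Flash ADC can quantize the accumulated magnitude to multiple bits.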
A comparative study on formal verification techniques to verify large integer multiplier circuits
IF 2.5 Tier 3, Engineering & Technology Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Pub Date : 2026-03-01 Epub Date : 2025-11-20 DOI: 10.1016/j.vlsi.2025.102606
Jitendra Kumar , Asutosh Srivastava , Masahiro Fujita
Arithmetic circuits are the fundamental building blocks of circuitry, with applications including digital signal processing, cryptography processors, and multimedia. Integer multiplier circuits with high operand bit widths dominate the extensive circuitry area of new-generation technologies. Traditionally, various multiplication algorithms are available to generate multiplier circuits considering area, delay, and power. Custom optimization is performed to reduce the circuit size, which increases the probability of logical bugs in the design. Over the past thirty years, prominent formal verification techniques such as Satisfiability (SAT) checking, Binary Decision Diagrams (BDDs), and Symbolic Computer Algebra (SCA) have made substantial progress in analyzing the correctness of circuits. In this paper, we study the best state-of-the-art techniques from each method available in the academic domain and perform a comparative analysis to verify integer multiplier circuits with different architectures after logic optimization. Although the complexity of BDDs grows exponentially with the input size of the circuit, and BDDs can be constructed only up to 18 bits, the method is robust in verifying a variety of multiplier structures. Algebraic backward rewriting based on Symbolic Computer Algebra (SCA) facilitates the formal verification of high-bit-width multiplier circuits. Conventional approaches that leverage hierarchical structural information are constrained to algebraic-friendly multipliers, wherein adder sub-circuits are preserved in their canonical form, an assumption often invalidated after logic synthesis and optimization. In contrast, advanced algebraic techniques that operate directly on flattened netlists demonstrate scalability and robustness in verifying large multiplier designs. Formal analysis with straightforward SAT techniques does not work well for comparing two structurally dissimilar circuits, which is often the case after applying logic optimization. If the degree of similarity is not excessively low, SAT-Sweeping can effectively reduce structural dissimilarity, and SAT techniques can verify multipliers up to 512 bits. However, the verification of complex circuits, characterized by their non-algebraic-friendly nature, near-zero similarity to reference circuits, and larger input sizes, remains an open challenge.
{"title":"A comparative study on formal verification techniques to verify large integer multiplier circuits","authors":"Jitendra Kumar ,&nbsp;Asutosh Srivastava ,&nbsp;Masahiro Fujita","doi":"10.1016/j.vlsi.2025.102606","DOIUrl":"10.1016/j.vlsi.2025.102606","url":null,"abstract":"<div><div>Arithmetic circuits are the fundamental building blocks of circuitry with applications including digital signal processing, cryptography processors, and multimedia. Integer multiplier circuits with high bit width of the operands dominate the extensive circuitry area of new-generation technologies. Traditionally, various multiplication algorithms are available to generate multiplier circuits considering area, delay, and power. Custom optimization is performed to reduce the circuit size, which increases the probability of logical bugs in the design. Over the past thirty years, prominent formal verification techniques such as Satisfiability (SAT) checking, Binary Decision Diagram (BDD), and Symbolic Computer Algebra (SCA) have made substantial progress in analyzing the correctness of the circuits. In this paper, we study the best state-of-the-art techniques from each method available in the academic domain and perform a comparative analysis to verify integer multiplier circuits with different architectures after logic optimization. Although the complexity of BDDs grows exponentially with the input size of the circuit, and BDDs can be constructed only up to 18 bits, the method is robust in verifying a variety of multiplier structures. Algebraic backward rewriting based on Symbolic Computer Algebra (SCA) facilitates the formal verification of high-bit-width multiplier circuits. Conventional approaches that leverage hierarchical structural information are constrained to algebraic-friendly multipliers, wherein adder sub-circuits are preserved in their canonical form, an assumption often invalidated post logic synthesis and optimization. In contrast, advanced algebraic techniques that operate directly on flattened net-lists demonstrate scalability and robustness in verifying large multiplier designs. Formal analysis with straightforward SAT techniques does not work well for comparing two structurally dissimilar circuits, which is often the case after applying logic optimization. If the degree of similarity is not excessively low, SAT-Sweeping can effectively reduce structural dissimilarity, and SAT techniques can verify multipliers up to 512 bits. However, the verification of complex circuits, characterized by their non-algebraic-friendly nature, near-zero similarity to reference circuits, and larger input sizes, remains an open challenge.</div></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":"107 ","pages":"Article 102606"},"PeriodicalIF":2.5,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
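The simplest equivalence check behind the SAT/BDD discussion is a miter: feed the same inputs to two implementations and declare them equivalent only if no input pair distinguishes them. The brute-force sketch below, a toy under the assumption that Python's built-in product serves as the golden reference, checks a shift-and-add multiplier exhaustively for small widths; its 4**n cost is exactly why SAT, BDD, and SCA methods exist for realistic bit widths. Function names are illustrative.

```python
from itertools import product

def shift_add_multiply(a, b, n):
    """Classic shift-and-add multiplication of two n-bit operands,
    truncated to the 2n-bit product width of a hardware multiplier."""
    acc = 0
    for i in range(n):
        if (b >> i) & 1:          # add a shifted copy for each set bit of b
            acc += a << i
    return acc & ((1 << 2 * n) - 1)

def miter_equivalent(n):
    """Exhaustive miter check: equivalent iff no input pair produces
    differing outputs. Cost grows as 4**n, which is why formal methods
    replace enumeration beyond toy widths."""
    mask = (1 << 2 * n) - 1
    return all(shift_add_multiply(a, b, n) == (a * b) & mask
               for a, b in product(range(1 << n), repeat=2))

print(miter_equivalent(4))  # prints True; exhaustive over all 256 input pairs
```

A SAT-based flow encodes the same miter as a CNF formula and asks whether any assignment makes the outputs differ, avoiding the explicit enumeration shown here.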