IEEE Open Journal of Circuits and Systems: Latest Articles

A Lightweight Hybrid Random Number Generator With Dynamic Entropy Injection
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-08-01 DOI: 10.1109/OJCAS.2025.3582975
Sonia Akter;Shelby Williams;Prosen Kirtonia;Magdy Bayoumi;Kasem Khalil
This paper presents a lightweight hybrid random number generator (HRNG), implemented and evaluated on a field-programmable gate array (FPGA). The proposed design enhances security and randomness by combining jitter and metastability in a feedforward topology, which achieves near-perfect Shannon entropy. Moreover, it is validated using three distinct entropy metrics, guaranteeing statistically robust random numbers for security-sensitive applications. In addition to the entropy evaluations, the design is rigorously analyzed using multiple industry-standard randomness test suites. Beyond the FPGA implementation, this work presents performance metrics, including area utilization, power consumption, maximum frequency, and energy usage per random bit, obtained by synthesis across three different technology nodes in Synopsys Design Compiler (SDC). All of the results from the FPGA and SDC implementations demonstrate significant improvements. These results confirm the design's scalability to advanced technology nodes and its suitability for applications that require secure and reliable random number generation, such as resource-efficient Internet of Things (IoT) devices.
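As a rough illustration of the Shannon entropy metric named in the abstract above, the sketch below estimates the per-bit entropy of a 0/1 sequence from the empirical frequency of ones. It is not the authors' evaluation code; the function name `shannon_entropy_bits` and the sample sequences are assumptions made for this example.

```python
import math


def shannon_entropy_bits(bits):
    """Empirical first-order Shannon entropy (bits per symbol) of a 0/1 sequence."""
    n = len(bits)
    if n == 0:
        raise ValueError("empty sequence")
    p1 = sum(bits) / n  # fraction of ones
    entropy = 0.0
    for p in (p1, 1.0 - p1):
        if p > 0.0:  # skip zero-probability symbols (0 * log 0 is taken as 0)
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical sample data: one balanced and one biased sequence.
balanced = [0, 1] * 500
biased = [1] * 900 + [0] * 100
print(shannon_entropy_bits(balanced))
print(shannon_entropy_bits(biased))
```

A value close to 1 bit per bit is what "near-perfect" refers to. Note that this first-order estimate only measures bias: the alternating `balanced` sequence scores the maximum even though it is fully predictable, which is one reason a design is also checked with additional entropy metrics and randomness test suites.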
Volume: 6 Pages: 257-269 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11106931
Citations: 0
A New Ultralow-Voltage Retention SRAM Cell Enhancing Noise Immunity
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-07-31 DOI: 10.1109/OJCAS.2025.3594022
Katsutoshi Ito;Yusaku Shiotsu;Satoshi Sugahara
A new ultralow-voltage retention (ULVR) SRAM cell is proposed that substantially enhances the noise margin (NM) of the ULVR mode at ultralow voltages $(V_{\mathrm{UL}})$. This 8T cell is configured with new-type Schmitt-trigger (ST) inverters that can nearly maximize the hysteresis width of the voltage transfer characteristics (VTC). The design methodology of the cell is developed with careful consideration of the process variation of the constituent transistors, and the optimally designed cell can ensure sufficient NMs that satisfy the $6\sigma$ failure probability for all the operating modes. In particular, for the ULVR mode at $V_{\mathrm{UL}} = 0.2$ V, the proposed 8T cell can exhibit much stronger noise immunity than various previously proposed low-voltage cells. In addition, the proposed 8T cell can achieve stable data retention even at $V_{\mathrm{UL}} = 0.16$ V with sufficient noise immunity satisfying the $6\sigma$ failure probability. An 8 kB ULVR-SRAM macro configured with the proposed 8T-cell array is also developed. Using the ULVR mode, the macro can reduce the standby power by ~93% compared with the standby mode of a conventional 6T-SRAM macro.
Volume: 6 Pages: 370-382 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11106369
Citations: 0
Complex Synchronization Dynamics of Electronic Oscillators–Part I: A Time-Domain Approach via Phase-Amplitude Reduced Models
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-07-25 DOI: 10.1109/OJCAS.2025.3592773
Konstantinos Metaxas;Paul P. Sotiriadis;Yannis Kominis
This work introduces a rigorous time-domain approach for studying the complex synchronization dynamics of periodically forced electronic oscillators, based on the well-developed theories of Phase-Amplitude reduction via the Koopman operator and dynamics of circle maps. The paper is structured in two parts. Part I presents the theoretical foundation and the numerical application of the theory. Under suitable forcing, the reduced equations simplify to a one-dimensional phase model—represented by a circle map—whose bifurcations are determined by the Phase Response Curves. This map efficiently captures the oscillator’s dynamics and enables accurate computation of resonance regions in the forcing parameter space. The influence of global isochron geometry on the map validates their critical role in phase locking, extending previous results in the theory of electronic oscillators. For more general forcing scenarios, the full Phase-Amplitude reduction effectively describes the synchronization dynamics. The developed time-domain approach demonstrates that the same limit cycle oscillator can produce periodic output with tunable spectral characteristics, operating as a frequency divider, or function as a chaotic or quasiperiodic signal generator, depending on the driving signal. As an illustrative example, the synchronization dynamics of differential LC oscillators is studied in detail. Part II is dedicated to confirming the validity, generality, and robustness of the introduced approach, which is first presented as a detailed step-by-step methodology, suitable for direct application to any oscillator. The Colpitts and ring oscillators are analyzed theoretically, and their resonance diagrams are numerically computed, following the approach established in Part I. Simulations of realistically implemented models in the Cadence IC Suite show that both synchronized and chaotic/quasiperiodic states are accurately predicted by the reduced circle map. 
Notably, despite the use of simplified analytical models, the theoretical framework effectively captures the qualitative behavior observed in simulation. The consistency between the theoretical and simulation results confirms both the robustness and general applicability of the proposed approach.
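To make the circle-map picture in the abstract above concrete, the sketch below iterates the textbook sine circle map and estimates its rotation number. The sine map is a stand-in chosen for illustration: the paper derives its map from the oscillator's Phase Response Curves, which are not reproduced here. The names `rotation_number`, `omega` (normalized drive frequency), and `k` (coupling strength) are assumptions of this example.

```python
import math


def rotation_number(omega, k, n_iter=2000, n_skip=500, theta0=0.1):
    """Estimate the rotation number of the sine circle map
    theta -> theta + omega - (k / (2*pi)) * sin(2*pi*theta),
    iterating the lift (no mod 1) so that net phase advance accumulates."""

    def lift(theta):
        return theta + omega - (k / (2.0 * math.pi)) * math.sin(2.0 * math.pi * theta)

    theta = theta0
    for _ in range(n_skip):  # discard the transient
        theta = lift(theta)
    start = theta
    for _ in range(n_iter):
        theta = lift(theta)
    return (theta - start) / n_iter  # average phase advance per iteration


# Sweep the drive frequency at fixed coupling (values chosen arbitrarily).
for omega in (0.30, 0.40, 0.50, 0.60):
    print(omega, round(rotation_number(omega, 1.0), 4))
```

A rotation number that sits at a rational value p/q while the drive parameters vary indicates p:q phase locking, which is the regime in which a forced oscillator can act as a frequency divider. Irrational values correspond to quasiperiodic output.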
Volume: 6 Pages: 329-342 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096569
Citations: 0
Complex Synchronization Dynamics of Electronic Oscillators–Part II: Simulations and Validation of Phase-Amplitude Reduced Models
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-07-25 DOI: 10.1109/OJCAS.2025.3592750
Konstantinos Metaxas;Nikolaos P. Eleftheriou;Yannis Kominis;Paul P. Sotiriadis
This work introduces a rigorous time-domain approach for studying the complex synchronization dynamics of periodically forced electronic oscillators, based on the well-developed theories of Phase-Amplitude reduction via the Koopman operator and dynamics of circle maps. The paper is structured in two parts. Part I presents the theoretical foundation and the numerical application of the theory. Under suitable forcing, the reduced equations simplify to a one-dimensional phase model—represented by a circle map—whose bifurcations are determined by the Phase Response Curves. This map efficiently captures the oscillator’s dynamics and enables accurate computation of resonance regions in the forcing parameter space. The influence of global isochron geometry on the map validates their critical role in phase locking, extending previous results in the theory of electronic oscillators. For more general forcing scenarios, the full Phase-Amplitude reduction effectively describes the synchronization dynamics. The developed time-domain approach demonstrates that the same limit cycle oscillator can produce periodic output with tunable spectral characteristics, operating as a frequency divider, or function as a chaotic or quasiperiodic signal generator, depending on the driving signal. As an illustrative example, the synchronization dynamics of differential LC oscillators is studied in detail. Part II is dedicated to confirming the validity, generality, and robustness of the introduced approach, which is first presented as a detailed step-by-step methodology, suitable for direct application to any oscillator. The Colpitts and ring oscillators are analyzed theoretically, and their resonance diagrams are numerically computed, following the approach established in Part I. Simulations of realistically implemented models in the Cadence IC Suite show that both synchronized and chaotic/quasiperiodic states are accurately predicted by the reduced circle map. 
Notably, despite the use of simplified analytical models, the theoretical framework effectively captures the qualitative behavior observed in simulation. The consistency between the theoretical and simulation results confirms both the robustness and general applicability of the proposed approach.
Volume: 6 Pages: 343-355 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096566
Citations: 0
Highly-Efficient Hardware Architecture for ML-KEM PQC Standard
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-07-22 DOI: 10.1109/OJCAS.2025.3591136
Haesung Jung;Quang Dang Truong;Hanho Lee
The advent of quantum computers, with their immense computational potential, poses significant threats to traditional cryptographic systems. In response, NIST announced the quantum-resistant Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM) standard in 2024. This paper presents an efficient hardware architecture for the ML-KEM scheme, capable of supporting all algorithms and flexibly adapting to different security levels. The proposed design achieves a balance between high performance and low hardware resource consumption, making it suitable for deployment across various FPGA platforms. Key innovations include the Unified Polynomial Arithmetic Module (UniPAM), capable of handling all polynomial arithmetic operations, and an optimized hash module for the SHA-3 variants integral to ML-KEM. Additionally, the design introduces an efficient timing diagram and a conflict-free memory management strategy, enabling seamless parallelism and reducing execution time while minimizing hardware resource consumption. Furthermore, the implementation incorporates several methods to effectively mitigate side-channel attacks, a common concern in hardware-based cryptosystem deployments. The proposed architecture is validated through implementation on an Artix-7 FPGA and in Synopsys 14 nm ASIC technology. Compared to state-of-the-art designs, our approach demonstrates superior performance while maintaining comparable hardware resource efficiency. Specifically, the hardware implementation on the Xilinx Artix-7 utilizes 12k LUTs, 6.9k FFs, 4 DSPs, and 9 BRAMs at a clock frequency of 220 MHz.
Volume: 6 Pages: 356-369 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11088254
Citations: 0
L-Sort: On-Chip Spike Sorting With Efficient Median-of-Median Detection and Localization-Based Clustering
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-07-08 DOI: 10.1109/OJCAS.2025.3584317
Yuntao Han;Yihan Pan;Xiongfei Jiang;Cristian Sestito;Shady Agwa;Themis Prodromakis;Shiwei Wang
Spike sorting is a critical process for decoding large-scale neural activity from extracellular recordings. Advances in neural probes allow more neurons to be recorded as channel counts increase, which gives rise to higher data volumes and challenges current on-chip spike sorters. This paper introduces L-Sort, a novel on-chip spike sorting solution featuring median-of-median spike detection and localization-based clustering. By combining the median-of-median approximation and the proposed incremental median calculation scheme, our detection module achieves a reduction in memory consumption. Moreover, the localization-based clustering utilizes geometric features instead of morphological features, thus eliminating the memory-consuming buffer that holds the spike waveform during feature extraction. Evaluation using Neuropixels datasets demonstrates that L-Sort achieves competitive sorting accuracy with reduced hardware resource consumption. Implementations on FPGA and ASIC (180 nm technology) demonstrate significant improvements in area and power efficiency compared to state-of-the-art designs while maintaining comparable accuracy. If normalized to 22 nm technology, our design can achieve roughly $\times 10$ area and power efficiency with similar accuracy, compared with the state-of-the-art design evaluated with the same dataset. Therefore, L-Sort is a promising solution for real-time, high-channel-count neural processing in implantable devices.
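The median-of-median idea from the abstract above can be sketched in software as follows. This is an illustrative approximation only, not the L-Sort hardware: the chunk size, the threshold multiplier `k`, and the 0.6745 noise-scaling constant (a common convention in spike detection) are assumptions of this example, and the paper's incremental median scheme is not reproduced.

```python
import statistics


def median_of_medians(samples, chunk_size=5):
    """Approximate the median as the median of per-chunk medians,
    so only one small chunk needs sorting at a time."""
    if not samples:
        raise ValueError("empty input")
    chunk_medians = [
        statistics.median(samples[i:i + chunk_size])
        for i in range(0, len(samples), chunk_size)
    ]
    return statistics.median(chunk_medians)


def detect_spikes(signal, k=5.0, chunk_size=5):
    """Return indices whose absolute amplitude exceeds k times a
    median-based noise estimate (assumed convention: median(|x|) / 0.6745)."""
    abs_signal = [abs(x) for x in signal]
    sigma = median_of_medians(abs_signal, chunk_size) / 0.6745
    threshold = k * sigma
    return [i for i, x in enumerate(abs_signal) if x > threshold]


# Hypothetical trace: low-amplitude background with one large deflection.
trace = [1.0, -1.0, 2.0, -2.0, 1.0] * 4
trace[7] = -40.0
print(detect_spikes(trace))
```

Because the estimate is built from medians, the single large deflection barely shifts the threshold, which is why median-based noise estimates are preferred over the standard deviation when spikes are present in the data.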
Volume: 6 Pages: 205-216 PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11072521
Citations: 0
BAG3++: An Extensible Generator Framework for Automated Layout-Aware AMS Design
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-06-26 DOI: 10.1109/OJCAS.2024.3502641
Felicia Guo;Bob Zhou;Ayan Biswas;Paul Kwon;Zhaokai Liu;Ken Ho;Vladimir Stojanović;Borivoje Nikolić
We present BAG3++, an extensible analog/mixed-signal (AMS) design framework for layout-aware design. BAG3++ realizes a unified design environment that merges schematic, layout, and verification views into a single development interface. We further introduce new automated design features that enable rapid automation and optimization across a range of performance specifications, processes, and applications. We demonstrate the practical use of these features through (a) a bit-reconfigurable successive-approximation-register (SAR) analog-to-digital converter (ADC) implemented in the open-source SkyWater 130 nm process and (b) an ultra-high-speed output driver optimized in two modern processes. BAG3++ interfaces with both commercial and open-source design frameworks, and its extensibility is further illustrated through the integration of an open-source simulator.
IEEE Open Journal of Circuits and Systems, vol. 6, pp. 181-191. Published 2025-06-26. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11052889
Citations: 0
Revolutionize 3D-Chip Design With Open3DFlow, an Open-Source AI-Enhanced Solution
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-06-26 DOI: 10.1109/OJCAS.2024.3518754
Yifei Zhu;Zhenxuan Luan;Dawei Feng;Weiwei Chen;Lei Ren;Zhangxi Tan
The escalating demand for high-performance and energy-efficient electronics has made 3D integrated circuits (3D ICs) a promising solution. However, the lack of specialized electronic design automation (EDA) software and standardized design flows for 3D chiplets has been a major obstacle. To bridge the gap, we introduce Open3DFlow, an open-source design platform for 3D ICs. It is a seven-step workflow that incorporates essential ASIC back-end processes while supporting multi-physics analysis, such as through-silicon via (TSV) modeling, thermal analysis, and signal integrity (SI) evaluation. To illustrate the full functionality of Open3DFlow, we use it to implement a 3D RISC-V CPU design with a vertically stacked L2 cache on a separate die. We harden both the CPU logic die and the 3D-cache die in a GlobalFoundries 0.18 µm (GF180) process with open-source PDK support. We enable face-to-face (F2F) coupling of the top and bottom dies by constructing a bonding layer based on the original technology file. Open3DFlow's open-source nature allows seamless integration of custom AI optimization algorithms. As a showcase, we leverage large language models (LLMs) to assist with bonding pad placement. In addition, we apply LLMs to back-end Tcl script generation to improve design productivity. We expect Open3DFlow to open up a new paradigm for future 3D IC innovations.
IEEE Open Journal of Circuits and Systems, vol. 6, pp. 169-180. Published 2025-06-26. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11052893
Citations: 0
Automated Fixed-Point Precision Optimization for FPGA Synthesis
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-06-18 DOI: 10.1109/OJCAS.2025.3580744
Inès Winandy;Arnaud Dion;Florent Manni;Pierre-Loïc Garoche;Dorra Ben Khalifa;Matthieu Martel
Precision tuning of fixed-point arithmetic is a powerful technique for optimizing hardware designs on FPGAs, where computing resources and memory are often severely constrained. While fixed-point arithmetic offers significant performance and area advantages over floating-point implementations, deriving an appropriate fixed-point representation remains a challenging task. In particular, developers must carefully select the number of bits assigned to the integer and fractional parts of each variable to balance accuracy and resource consumption. In this article, we introduce an original precision tuning technique for synthesizing fixed-point programs from floating-point code, specifically targeting FPGA platforms. The distinguishing feature of our technique lies in its formal approach to error analysis: it systematically propagates numerical errors through computations to infer variable-specific fixed-point formats that guarantee user-specified accuracy bounds. Unlike heuristic or ad hoc methods, our technique provides formal guarantees on the final accuracy of the generated code, ensuring safe deployment on hardware platforms. To enable hardware-friendly implementations, the resulting fixed-point programs use the ap_fixed data types provided by High-Level Synthesis (HLS) tools, allowing fine-grained control over the precision of each variable. Our method has been implemented within the POPiX 2.0 framework, which automatically generates optimized fixed-point code ready for synthesis. Experimental results on a set of embedded benchmarks show that our fixed-point codes use predominantly fewer machine cycles than floating-point codes when compiled on an FPGA with the state-of-the-art HLS compiler by AMD. Also, our generated fixed-point codes reduce hardware resource usage, such as LUTs, flip-flops, and DSP blocks, with typical reductions ranging from 67% to 83% compared to double-precision floating-point codes, depending on the application.
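To make the integer/fractional bit trade-off concrete, here is a minimal Python sketch that picks a signed fixed-point format for a single variable from its value range and a target quantization error. It is not the POPiX 2.0 method: it performs no error propagation across computations and gives no formal guarantee for a whole program. The function names `fixed_point_format` and `quantize` are hypothetical, and the returned pair is only meant to mirror the (total width, integer width) convention of `ap_fixed<W, I>`, where the integer width includes the sign bit.

```python
import math


def fixed_point_format(lo, hi, max_error):
    """Pick (total_bits, int_bits) for one signed fixed-point variable.

    Assumption: round-to-nearest quantization, whose error is at most
    half of one least-significant bit, i.e. 2 ** -(frac_bits + 1).
    """
    # Smallest fractional width with 2 ** -(frac_bits + 1) <= max_error.
    frac_bits = max(0, math.ceil(-math.log2(max_error) - 1))
    # Smallest integer width (sign bit included) that covers [lo, hi].
    int_bits = 1
    while not (-2 ** (int_bits - 1) <= lo
               and hi <= 2 ** (int_bits - 1) - 2 ** -frac_bits):
        int_bits += 1
    return int_bits + frac_bits, int_bits


def quantize(x, total_bits, int_bits):
    """Round x to the nearest representable value, saturating on overflow."""
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    q = round(x * scale)
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    q = max(q_min, min(q_max, q))  # saturate instead of wrapping around
    return q / scale


# A variable known to stay in [-3, 3] with a 1e-3 error budget.
width, int_width = fixed_point_format(-3.0, 3.0, 1e-3)
approx = quantize(1.2345, width, int_width)
```

With these inputs the sketch selects a 12-bit word with 3 integer bits, which would correspond to a declaration along the lines of `ap_fixed<12, 3>`. A formal approach such as the one described in the abstract would additionally bound how the per-variable errors accumulate through the computation.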
IEEE Open Journal of Circuits and Systems, vol. 6, pp. 192-204. Published 2025-06-18. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11039693
Citations: 0
End-to-End Neural Video Compression: A Review
IF 2.4 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2025-04-10 DOI: 10.1109/OJCAS.2025.3559774
Jiovana S. Gomes;Mateus Grellert;Fábio L. L. Ramos;Sergio Bampi
The pervasive presence of video content has spurred the development of advanced technologies to manage, process, and deliver high-quality content efficiently. Video compression is crucial in providing high-quality video services under limited network and storage capacities, traditionally achieved through hybrid codecs. However, as these frameworks reach a performance bottleneck, with compression gains becoming harder to achieve through conventional methods, Deep Neural Networks (DNNs) offer a promising alternative. By leveraging their nonlinear representation capacity, DNNs can enhance compression efficiency and visual quality. Neural Video Coding (NVC) has recently received significant attention, with Neural Image Coding models surpassing traditional codecs in compression ratios. Therefore, this survey explores the state of the art in NVC, examining recent works, frameworks, and the potential of this innovative approach to revolutionize video compression. We find that NVC models have come a long way since the first proposals and are currently on par in compression efficiency with the latest hybrid codec, VVC. Still, many improvements are required to enable the practical usage of NVC, such as hardware-friendly development to enable faster inference and execution on mobile and energy-constrained devices.
IEEE Open Journal of Circuits and Systems, vol. 6, pp. 120-134. Published 2025-04-10. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10962175
Citations: 0