IEEE Journal on Emerging and Selected Topics in Circuits and Systems: Latest Articles, Page 9

Low-Loss and Compact Millimeter-Wave Silicon-Based Filters: Overview, New Developments in Silicon-on-Insulator Technology, and Future Trends

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-12-21 DOI: 10.1109/JETCAS.2023.3345476

Robert Nericua;Ke Wang;He Zhu;Roberto Gómez-García;Xi Zhu

This paper presents an overview of Silicon-based millimeter-wave (mm-wave) passive devices for bandpass and bandstop filtering applications, while also reporting originally-conceived filter developments and future trends. First of all, the state-of-the-art on mm-wave low-loss bandpass filters (BPFs) is covered, and new BPF designs are shown. The engineered BPFs employ a center-tapped ring architecture with shunt-connected capacitors to realize a standard 2nd-order baseline BPF design, which is subsequently scaled to 30-GHz and 60-GHz operational frequencies. To increase the selectivity as well as the stopband rejection levels of this baseline BPF, the in-series cascade connection of the baseline BPF units is used for a higher-order BPF realization. For experimental-validation purposes, a total of four mm-wave BPFs based on these design strategies are implemented, fabricated in 45-nm Silicon-on-Insulator (SOI) complementary-metal-oxide-semiconductor-(CMOS) technology, and tested. Afterward, a review of Silicon-based-integrated bandstop filters (BSFs) operating in the mm-wave region is provided, which includes both reflective-type and reflectionless/absorptive filter realizations for RF-interference-suppression in highly-congested electromagnetic (EM) environments. Finally, future research trends in the Silicon-based-integrated filter area are discussed. They are expected to play a key role in the development of modern radio-frequency (RF) front-ends for emerging beyond 5G and EM-sensing scenarios.


Volume 14, Issue 1, pp. 30-40

Citations: 0

A 340-GHz THz Amplifier-Frequency-Multiplier Chain With 360° Phase-Shifting Range and its Phase Characterization

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-12-21 DOI: 10.1109/JETCAS.2023.3345358

Chun Wang;Pin-Chun Chiu;Chun-Lin Ko;Sheng-Hsiang Tseng;Chun-Hsing Li

A 340-GHz compact terahertz (THz) amplifier-frequency-multiplier chain (AMC) offering a full 360° phase-shifting range for phased-array applications is proposed in this paper. The AMC comprises an 85-GHz phase-shifter-embedded ($\Delta\varphi$-embedded) power amplifier (PA) and a high-output-power frequency quadrupler (FQ). The PA is equipped with multifunctional impedance matching networks (M-IMNs) that can simultaneously provide balun, impedance transformation, and phase-shifting functions. Analytic expressions have been derived to provide design guidelines for the M-IMNs. With the integrated M-IMNs, the proposed PA can concurrently deliver high output power and a phase shift exceeding 90° in a compact chip area. The proposed FQ can achieve optimal impedance matching at second and fourth harmonic frequencies, leading to an output power enhancement of 2.6 dB. Furthermore, the output phase of the PA is quadrupled by the FQ, resulting in the output signal of the AMC with a full 360° phase-shifting capability. A measurement setup for characterizing the phase of a THz signal is also presented. Implemented in a 40-nm CMOS technology without ultra-thick metal layers available, the proposed THz AMC achieves a peak output power of −3.5 dBm at 368 GHz with a conversion gain of 1.8 dB and a 3-dB bandwidth from 340 to 376 GHz. The output phase can continuously vary over 360° within the 324 to 346 GHz frequency range. The phase noise of the output signal at 346 GHz is −105 dBc/Hz at a 10-MHz offset frequency. The proposed 340-GHz AMC consumes 215.1 mW from a 0.9-V supply.
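The arithmetic behind the quadrupler claim can be checked with a short sketch. It uses only figures quoted in the abstract; the helper name `dbm_to_mw` and the DC-to-RF efficiency figure are illustrative additions, not quantities reported by the authors.

```python
# Back-of-the-envelope check of the AMC numbers quoted in the abstract.
# Helper and variable names are hypothetical; only the input figures come from the text.

def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

MULTIPLICATION_FACTOR = 4          # frequency quadrupler (FQ)

pa_frequency_ghz = 85.0            # PA operating frequency
pa_phase_range_deg = 90.0          # abstract: PA phase shift exceeds 90 degrees

# A frequency multiplier scales both frequency and phase by the same factor.
out_frequency_ghz = MULTIPLICATION_FACTOR * pa_frequency_ghz
out_phase_range_deg = MULTIPLICATION_FACTOR * pa_phase_range_deg

# Peak output power (reported at 368 GHz) against the total DC consumption.
peak_output_mw = dbm_to_mw(-3.5)
dc_power_mw = 215.1
dc_to_rf_efficiency_percent = 100.0 * peak_output_mw / dc_power_mw

print(f"output frequency: {out_frequency_ghz} GHz")
print(f"output phase range: {out_phase_range_deg} degrees")
print(f"peak output power: {peak_output_mw:.3f} mW")
print(f"DC-to-RF efficiency (derived, not reported): {dc_to_rf_efficiency_percent:.2f} %")
```

The phase multiplication is why a PA phase range just above 90° is sufficient for full 360° coverage at the output.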


Volume 14, Issue 1, pp. 52-66

Citations: 0

A 11.3–16.6-GHz VCO With Constructive Switched Magnetic Coupling in 65-nm CMOS

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-12-18 DOI: 10.1109/JETCAS.2023.3344510

Yuetong Lyu;Changwenquan Song;Pei Qin;Liang Wu

Conventional transformer-based magnetic tuning has demonstrated dual-band or even multi-band operation for voltage-controlled oscillators (VCOs). However, the destructive magnetic coupling employed introduces implicit loss to the transformer, thus degrading its quality factor (Q), and achieves continuous frequency coverage resulting in inferior performance. To address this issue, this paper proposes a constructive switched magnetic coupling (CSMC) technique, realizing dual-band operation with a Q improvement in one band owing to the in-phase coupling and the explicit switch. For validation, a transformer employing the CSMC technique is designed and deployed in a dual-band VCO design. Fabricated in a 65-nm CMOS process, the VCO achieves a measured frequency tuning range of 37.8%, from 11.3 to 16.6 GHz, while consuming 2.5 mW from a 0.65-V supply. Within the entire frequency coverage, the measured phase noise ranges from −129.6 to −123.7 dBc/Hz at 10-MHz offset, resulting in a FoM of 186-192.1 dBc/Hz. The core area of the chip is $0.43\times 0.25$ mm$^2$ excluding pads.
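The tuning-range and figure-of-merit numbers can be related to each other with the conventional VCO definitions. The sketch below is an illustration under assumptions: the abstract does not state which FoM formula was used, nor which oscillation frequency and power level pair with which phase-noise value, so the pairing in the example is hypothetical.

```python
import math

def tuning_range_percent(f_low_ghz: float, f_high_ghz: float) -> float:
    """Fractional tuning range relative to the center frequency, in percent."""
    center = (f_low_ghz + f_high_ghz) / 2.0
    return 100.0 * (f_high_ghz - f_low_ghz) / center

def vco_fom_dbc_hz(phase_noise_dbc_hz: float, f_osc_hz: float,
                   f_offset_hz: float, p_dc_mw: float) -> float:
    """Conventional VCO figure of merit (assumed definition, normalized to 1 mW)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(p_dc_mw / 1.0))

# Band edges from the abstract; rounded endpoints give a value close to the quoted 37.8%.
print(f"tuning range: {tuning_range_percent(11.3, 16.6):.1f} %")

# Hypothetical pairing: best phase noise at the lowest frequency with 2.5 mW DC power.
fom = vco_fom_dbc_hz(-129.6, 11.3e9, 10e6, 2.5)
print(f"FoM for the assumed operating point: {fom:.1f} dBc/Hz")
```

The FoM is often quoted by magnitude, which is consistent with the positive values given in the abstract.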


Volume 14, Issue 1, pp. 133-141

Citations: 0

Compact Transverse-Resonance Low-Pass Filter With Wide Stop-Band Rejection Implemented in Gallium Arsenide Technology

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-12-07 DOI: 10.1109/JETCAS.2023.3340957

Sudipta Chakraborty;Gayatri Neeharika Sreepada;Michael Heimlich;Anand K. Verma

This work reports three designs of transverse resonance (TR)-based high-performance compact 5-pole Butterworth low-pass filters (TR-LPFs) with a cut-off frequency ($f_{c}$) of 10.5 GHz in $0.15~\mu\text{m}$ Gallium Arsenide (GaAs) pHEMT technology, with a chip size of 0.82 mm $\times$ 0.87 mm. Two fabricated TR-LPFs have 20 dB, 30 dB, 40 dB, and 50 dB attenuation levels with rejection bandwidths of (54 GHz, 54 GHz), (32 GHz, 52 GHz), (31 GHz, 50 GHz), and (18.5 GHz, 27 GHz), respectively, and insertion losses of 0.5 dB and 0.6 dB. The TR-LPF is a microstrip-based design, so unlike lumped-element designs, it can be designed and fabricated in GaAs and other technologies, even at millimeter-wave frequencies. Such a high-performance LPF using microstrip on a GaAs chip has not been reported in the open literature.
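For reference, the ideal 5-pole Butterworth response that such a filter targets can be sketched from the textbook formulas. This is the idealized lumped low-pass prototype, not the authors' transverse-resonance microstrip realization; the function names are hypothetical.

```python
import math

def butterworth_g_values(order: int) -> list[float]:
    """Normalized low-pass prototype element values g_1..g_n (source and load g = 1)."""
    return [2.0 * math.sin((2 * k - 1) * math.pi / (2 * order))
            for k in range(1, order + 1)]

def butterworth_attenuation_db(f_ghz: float, fc_ghz: float, order: int) -> float:
    """Ideal Butterworth attenuation in dB, with fc as the 3-dB cut-off frequency."""
    return 10.0 * math.log10(1.0 + (f_ghz / fc_ghz) ** (2 * order))

ORDER = 5
FC_GHZ = 10.5  # cut-off frequency quoted in the abstract

print([round(g, 3) for g in butterworth_g_values(ORDER)])
for f in (10.5, 21.0, 31.5):
    print(f"{f} GHz: {butterworth_attenuation_db(f, FC_GHZ, ORDER):.1f} dB")
```

The ideal response rolls off monotonically, whereas a distributed realization has spurious passbands, which is why the measured rejection bandwidths matter.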


Citations: 0

Building Time-Surfaces by Exploiting the Complex Volatility of an ECRAM Memristor

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-16 DOI: 10.1109/JETCAS.2023.3330832

Marco Rasetto;Qingzhou Wan;Himanshu Akolkar;Feng Xiong;Bertram Shi;Ryad Benosman

Memristors have emerged as a promising technology for efficient neuromorphic architectures owing to their ability to act as programmable synapses, combining processing and memory into a single device. Although they are most commonly used for static encoding of synaptic weights, recent work has begun to investigate the use of their dynamical properties, such as Short Term Plasticity (STP), to integrate events over time in event-based architectures. However, we are still far from completely understanding the range of possible behaviors and how they might be exploited in neuromorphic computation. This work focuses on a newly developed Li$_{\text{x}}$WO$_{\text{3}}$-based three-terminal memristor that exhibits tunable STP and a conductance response modeled by a double exponential decay. We derive a stochastic model of the device from experimental data and investigate how device stochasticity, STP, and the double exponential decay affect accuracy in a hierarchy of time-surfaces (HOTS) architecture. We found that the device’s stochasticity does not affect accuracy, that STP can reduce the effect of salt-and-pepper noise in signals from event-based sensors, and that the double exponential decay improves accuracy by integrating temporal information over multiple time scales. Our approach can be generalized to study other memristive devices to build a better understanding of how control over temporal dynamics can enable neuromorphic engineers to fine-tune devices and architectures to fit their problems at hand.
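A minimal sketch of a time surface built from a double exponential decay is given below. The amplitudes and time constants are placeholder values, not the device parameters fitted in the paper, and the stochastic and STP components of the authors' model are omitted.

```python
import math

def double_exp_decay(dt: float, a_fast: float = 0.6, tau_fast: float = 0.01,
                     a_slow: float = 0.4, tau_slow: float = 0.1) -> float:
    """Sum of a fast and a slow exponential; all four parameters are hypothetical."""
    return a_fast * math.exp(-dt / tau_fast) + a_slow * math.exp(-dt / tau_slow)

def time_surface(last_event_times, t_now: float):
    """Map each pixel's most recent event time to a decayed value.

    Pixels that have never fired are marked with None and contribute 0.0.
    """
    return [[double_exp_decay(t_now - t) if t is not None else 0.0 for t in row]
            for row in last_event_times]

# 2x2 patch: event timestamps in seconds, None where no event has occurred yet.
patch = [[0.095, None],
         [0.050, 0.100]]
for row in time_surface(patch, t_now=0.100):
    print([round(v, 3) for v in row])
```

The fast term dominates for very recent events, while the slow term keeps older events visible, which is one way to read the abstract's remark about integrating information over multiple time scales.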


Volume 13, Issue 4, pp. 877-888. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10320285

Citations: 0

Low-Area and Low-Power VLSI Architectures for Long Short-Term Memory Networks

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-06 DOI: 10.1109/JETCAS.2023.3330428

Mohammed A. Alhartomi;Mohd Tasleem Khan;Saeed Alzahrani;Ahmed Alzahmi;Rafi Ahamed Shaik;Jinti Hazarika;Ruwaybih Alsulami;Abdulaziz Alotaibi;Meshal Al-Harthi

Long short-term memory (LSTM) networks are extensively used in various sequential learning tasks, including speech recognition. Their significance in real-world applications has prompted the demand for cost-effective and power-efficient designs. This paper introduces LSTM architectures based on distributed arithmetic (DA), utilizing circulant and block-circulant matrix-vector multiplications (MVMs) for network compression. The quantized weights-oriented approach for training circulant and block-circulant matrices is considered. By formulating fixed-point circulant/block-circulant MVMs, we explore the impact of kernel size on accuracy. Our DA-based approach employs shared full and partial methods of add-store/store-add followed by a select unit to realize an MVM. It is then coupled with a multi-partial strategy to reduce complexity for larger kernel sizes. Further complexity reduction is achieved by optimizing decoders of multiple select units. Pipelining in add-store enhances speed at the expense of a few pipelined registers. The results of the field-programmable gate array showcase the superiority of our proposed architectures based on the partial store-add method, delivering reductions of 98.71% in DSP slices, 33.59% in slice look-up tables, 13.43% in flip-flops, and 29.76% in power compared to the state-of-the-art.
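The compression offered by a circulant weight matrix can be illustrated with a small sketch: an n-by-n circulant matrix is fully defined by n values, so the matrix-vector product can be computed from that single vector. The example covers only the circulant structure; it does not model the distributed-arithmetic add-store/store-add datapath described in the abstract, and the function names are hypothetical.

```python
def circulant_matvec(first_col, x):
    """Multiply the circulant matrix defined by its first column with vector x.

    Entry (i, j) of the matrix equals first_col[(i - j) mod n], so only n weights are stored.
    """
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

def dense_circulant(first_col):
    """Expand the same matrix explicitly (n*n entries), for comparison only."""
    n = len(first_col)
    return [[first_col[(i - j) % n] for j in range(n)] for i in range(n)]

weights = [1, 2, 3]   # 3 stored weights instead of 9
x = [1, 0, -1]

dense = dense_circulant(weights)
reference = [sum(w * v for w, v in zip(row, x)) for row in dense]

print(dense)
print(circulant_matvec(weights, x), reference)
```

A block-circulant matrix applies the same idea per block, trading some of the compression for extra modeling freedom.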


Volume 13, Issue 4, pp. 1000-1014

Citations: 0

To Spike or Not to Spike: A Digital Hardware Perspective on Deep Learning Acceleration

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-06 DOI: 10.1109/JETCAS.2023.3330432

Fabrizio Ottati;Chang Gao;Qinyu Chen;Giovanni Brignone;Mario R. Casu;Jason K. Eshraghian;Luciano Lavagno

As deep learning models scale, they become increasingly competitive in domains ranging from computer vision to natural language processing; however, this happens at the expense of efficiency since they require increasingly more memory and computing power. The power efficiency of the biological brain outperforms any large-scale deep learning (DL) model; thus, neuromorphic computing tries to mimic brain operations, such as spike-based information processing, to improve the efficiency of DL models. Despite the benefits of the brain, such as efficient information transmission, dense neuronal interconnects, and the co-location of computation and memory, the available biological substrate has severely constrained the evolution of biological brains. Electronic hardware does not have the same constraints; therefore, while modeling spiking neural networks (SNNs) might uncover one piece of the puzzle, the design of efficient hardware backends for SNNs needs further investigation, potentially taking inspiration from the available work done on the artificial neural networks (ANNs) side. As such, when is it wise to look at the brain while designing new hardware, and when should it be ignored? To answer this question, we quantitatively compare the digital hardware acceleration techniques and platforms of ANNs and SNNs. As a result, we provide the following insights: (i) ANNs currently process static data more efficiently, (ii) applications targeting data produced by neuromorphic sensors, such as event-based cameras and silicon cochleas, need more investigation since the behavior of these sensors might naturally fit the SNN paradigm, and (iii) hybrid approaches combining SNNs and ANNs might lead to the best solutions and should be investigated further at the hardware level, accounting for both efficiency and loss optimization.
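To make "spike-based information processing" concrete, the sketch below simulates a leaky integrate-and-fire (LIF) neuron, one common SNN neuron abstraction. The abstract does not name a specific neuron model, so the choice of LIF, the function name, and every parameter value here are assumptions for illustration only.

```python
def lif_spike_count(input_current: float, steps: int = 1000, dt: float = 1e-3,
                    tau: float = 20e-3, v_threshold: float = 1.0,
                    v_reset: float = 0.0) -> int:
    """Count output spikes of a LIF neuron driven by a constant input (forward Euler)."""
    v = 0.0
    spikes = 0
    for _ in range(steps):
        # Leak toward the input level; the membrane integrates the difference.
        v += dt / tau * (-v + input_current)
        if v >= v_threshold:
            spikes += 1      # emit a binary event instead of an analog activation
            v = v_reset
    return spikes

for current in (0.5, 2.0, 4.0):
    print(f"input {current}: {lif_spike_count(current)} spikes")
```

A sub-threshold input produces no events at all, which is the source of the sparsity that SNN hardware tries to exploit, while stronger inputs are encoded as higher spike rates.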


Volume 13, Issue 4, pp. 1015-1025

Citations: 0

BioNN: Bio-Mimetic Neural Networks on Hardware Using Nonlinear Multi-Timescale Mixed-Feedback Control for Neuromodulatory Bursting Rhythms

IF 4.6 | Zone 2, Engineering and Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-03 DOI: 10.1109/JETCAS.2023.3330084

Kangni Liu;Shahin Hashemkhani;Jonathan Rubin;Rajkumar Kubendran

Biological neurons exhibit rich and complex nonlinear dynamics, which are computationally expensive and area/power hungry to implement in hardware. This paper presents a mathematical analysis and hardware realization of neural networks using a nonlinear neuron model that utilizes two excitable systems operating at different timescales. The neuron consists of a mixed-feedback system operating at multiple timescales to exhibit a variety of modalities that resemble the biophysical mechanisms found in neurophysiology. The single neuron dynamics emerge from four voltage-controlled current sources and feature spiking and bursting output modes that can be controlled using tunable parameters. The bifurcation structures of the neuron, modeled as a 4D dynamical system, illustrate the roles of sources acting on different timescales in shaping neural dynamics. A comprehensive understanding of the system’s dynamic behavior is obtained by studying the state space variables and performing bifurcation analysis on the different parameters. The model is implemented on a 1 mm × 2 mm prototype chip in a 180-nm CMOS process. Each neural network consists of 1 isolated test neuron and 5 fully connected neurons using 20 synapses. By carefully selecting bias voltages according to the I-V characterization curves, the neurons are shown to exhibit spike, burst, and burst excitable behavior. Multiple small-scale neural networks with inhibitory or excitatory synapses were verified to achieve coupled rhythms with neuron bursts in-phase or out-of-phase. To demonstrate an application, the generated burst waveforms from the 4-neuron network were used to form a Central Pattern Generator (CPG) for locomotion control of the four legs of the Petoi, a quadruped robot, enabling the robot to jump successfully.

Towards Scalable Digital Modeling of Networks of Biorealistic Silicon Neurons

IF 4.6 Zone 2 (Engineering Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-02 DOI: 10.1109/JETCAS.2023.3330069

Swagat Bhattacharyya;Praveen Raj Ayyappan;Jennifer O. Hasler

The study of biorealistic neuron circuits has been limited by the efficiency of digital implementations. Efficient digital approaches generally use integrate-and-fire (I&F) variants, losing neural properties that are important for network computation. In contrast, accurate neuron ODEs tend to rely on computationally intensive operations, making the overhead prohibitive for large spiking neural network applications. This effort presents efficient digital approximations for coupled Hodgkin-Huxley (HH) neurons derived from transistor-channel neural modeling. Neuron models are implemented in C using floating-point and 32-bit fixed-point arithmetic, and small networks are simulated using a fixed-step ODE solver. Our approach enables large network simulation of HH-like neurons, facilitating scalable digital modeling while also providing a direct path towards a framework for analog computation.
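To make the floating-point versus fixed-point comparison concrete, here is a minimal sketch of a fixed-step (forward Euler) solver run in both number formats. It is not the authors' model or code: it integrates a single leaky state rather than an HH-like neuron, it is written in Python rather than C for brevity, and the Q16.16 format and all function names are assumptions chosen for illustration.

```python
FRAC_BITS = 16            # assumed Q16.16 format: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS

def to_fixed(x):
    """Convert a float to Q16.16, rounding to the nearest representable value."""
    return int(round(x * ONE))

def to_float(q):
    """Convert a Q16.16 integer back to a float."""
    return q / ONE

def fixed_mul(a, b):
    """Multiply two Q16.16 values; the right shift restores the scale.

    In C the product would typically be widened to 64 bits before shifting.
    """
    return (a * b) >> FRAC_BITS

def euler_float(v0, i_in, dt_over_tau, n_steps):
    """Forward Euler for dv/dt = (i_in - v) / tau in floating point."""
    v = v0
    for _ in range(n_steps):
        v += dt_over_tau * (i_in - v)
    return v

def euler_fixed(v0, i_in, dt_over_tau, n_steps):
    """The same fixed-step update carried out entirely in Q16.16 integers."""
    v = to_fixed(v0)
    i_q = to_fixed(i_in)
    k = to_fixed(dt_over_tau)
    for _ in range(n_steps):
        v += fixed_mul(k, i_q - v)
    return to_float(v)

if __name__ == "__main__":
    vf = euler_float(0.0, 1.0, 1 / 64, 1000)
    vq = euler_fixed(0.0, 1.0, 1 / 64, 1000)
    print(f"float: {vf:.6f}  fixed: {vq:.6f}  error: {vf - vq:.2e}")
```

With these settings the fixed-point state settles slightly below the floating-point result, because the truncating shift rounds small updates down to zero once the state is close to its target. Effects of this kind are what a fixed-point neuron implementation has to budget for when choosing word length and step size.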

Design and Analysis of a V-Band CMOS Sextuple SILVCO Using Transformer and Cascade-Series Coupling With a Frequency-Tracking Loop

IF 4.6 Zone 2 (Engineering Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Pub Date : 2023-11-01 DOI: 10.1109/JETCAS.2023.3329430

Wei-Cheng Chen;Hong-Yeh Chang

A low-phase-noise local oscillator (LO) is a crucial component in communication systems. However, designing the LO becomes significantly more challenging as the operating frequency rises. This paper focuses on the design and analysis of a $V$-band CMOS sextuple sub-harmonically injection-locked voltage-controlled oscillator (SILVCO) with a frequency-tracking loop (FTL). To further enhance the locking range and efficiently generate high-order harmonic components, a cascade-series coupling injector is proposed for use in the SILVCO. The design methodology of the proposed circuit is thoroughly presented, accompanied by analysis and calculated results. The SILVCO with FTL is implemented using a 90-nm CMOS process. With a sub-harmonic number of 6 and a dc power consumption of 23 mW, the measured output frequency ranges from 50.8 to 53.4 GHz, achieving a differential output power close to 0 dBm. The measured phase noise at a 1 MHz offset and the rms jitter integrated from 1 kHz to 10 MHz are lower than −109.4 dBc/Hz and 43 fs, respectively.
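A short calculation puts the reported numbers in context. The sketch below derives the injected reference range implied by a sub-harmonic number of 6, the fractional output tuning range, and the textbook 20·log10(N) phase-noise increase of ideal frequency multiplication. The function names are illustrative, and the 20·log10(N) figure is the general ideal-multiplier relation, not a result reported by the paper.

```python
import math

def injection_range_ghz(f_out_ghz, n):
    """Reference range implied by an output range and sub-harmonic number n."""
    return tuple(f / n for f in f_out_ghz)

def fractional_tuning_percent(f_out_ghz):
    """Tuning range as a percentage of the center frequency."""
    lo, hi = f_out_ghz
    return 100.0 * (hi - lo) / ((hi + lo) / 2.0)

def multiplication_penalty_db(n):
    """Ideal phase-noise increase when a reference is multiplied by n."""
    return 20.0 * math.log10(n)

if __name__ == "__main__":
    f_out = (50.8, 53.4)   # measured output range in GHz, from the abstract
    n = 6                  # sub-harmonic number, from the abstract
    lo, hi = injection_range_ghz(f_out, n)
    print(f"injected reference: {lo:.3f} to {hi:.3f} GHz")
    print(f"fractional tuning range: {fractional_tuning_percent(f_out):.2f} %")
    print(f"ideal multiplication penalty: {multiplication_penalty_db(n):.2f} dB")
```

Under these assumptions the injected signal sits near 8.5 to 8.9 GHz, so in the ideal locked case the reference source would need phase noise roughly 15.6 dB below the target at the output.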
