Latest articles from IEEE Transactions on Circuits and Systems II: Express Briefs

IEEE Circuits and Systems Society Information
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-26 DOI: 10.1109/TCSII.2024.3462215
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 71, no. 10, p. C3. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10695787
Citations: 0
IEEE Transactions on Circuits and Systems--II: Express Briefs Publication Information
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-26 DOI: 10.1109/TCSII.2024.3462211
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 71, no. 10, p. C2. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10695795
Citations: 0
A Calibration-Free 9.3-ENOB 1-GS/s Pipelined ADC With PVT-Insensitive Nested Ring Amplifiers
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-24 DOI: 10.1109/TCSII.2024.3466902
Chao-Yen Hsu;Tai-Cheng Lee
This brief presents a nested ring amplifier with dynamic-cascode-bias and gain-boosting techniques. The proposed amplifier achieves a gain of 90 dB while preserving the high-slew capability. The amplifier is employed in an MDAC for a calibration-free 11-bit 1-GS/s single-channel pipelined ADC. Furthermore, the proposed biasing circuits are utilized to alleviate PVT sensitivity. Fabricated in a 28-nm CMOS technology, the ADC achieves a 61.72-dB SFDR and 53.52-dB SNDR at a Nyquist input, while consuming 14.7 mW from a 1-V supply and yielding Schreier and Walden figure-of-merit (FoM) values of 159 dB and 37.9 fJ/conv.-step, respectively.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 28-32.
Citations: 0
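The two figure-of-merit values quoted in the ADC abstract above can be cross-checked from the other numbers it gives. The sketch below assumes the commonly used definitions (ENOB = (SNDR - 1.76)/6.02, Walden FoM = P / (2^ENOB · fs), Schreier FoM = SNDR + 10·log10(BW / P) with BW = fs/2); the abstract does not state which definitions the authors used, and the function names are illustrative.

```python
import math

# Values taken from the abstract.
POWER_W = 14.7e-3   # power consumption from the 1-V supply
SNDR_DB = 53.52     # SNDR at a Nyquist input
FS_HZ = 1e9         # sampling rate, 1 GS/s


def enob(sndr_db):
    """Effective number of bits from SNDR (assumed standard formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w, sndr_db, fs_hz):
    """Walden FoM in fJ per conversion step (assumed definition)."""
    return power_w / (2 ** enob(sndr_db) * fs_hz) * 1e15


def schreier_fom_db(power_w, sndr_db, fs_hz):
    """Schreier FoM in dB, taking the bandwidth as fs/2 (assumed definition)."""
    return sndr_db + 10 * math.log10((fs_hz / 2) / power_w)


walden = walden_fom_fj(POWER_W, SNDR_DB, FS_HZ)
schreier = schreier_fom_db(POWER_W, SNDR_DB, FS_HZ)
print(f"ENOB at Nyquist: {enob(SNDR_DB):.2f} bits")
print(f"Walden FoM: {walden:.1f} fJ/conv.-step")
print(f"Schreier FoM: {schreier:.1f} dB")
```

Under these assumed definitions the results land close to the 37.9 fJ/conv.-step and 159 dB reported in the abstract.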
Experimental Evaluation of Silicon Nitride Memristors as Coupling Elements for Chimera States in Chaotic Oscillator Networks
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-24 DOI: 10.1109/TCSII.2024.3466963
Karolos-Alexandros Tsakalos;Vasileios Ntinas;Nikolaos Vasileiadis;Astero Provata;Panagiotis Dimitrakis;Georgios Ch. Sirakoulis
Chimera states have attracted significant research interest due to their potential in modeling brain network functionality. Memristive nano-crossbars, known for their energy efficiency, massive parallelism, and synaptic-like properties, serve as a promising coupling medium in brain-inspired applications. The operation of these devices is strongly dictated by the non-linear mechanisms of memristor devices when studying synchronization phenomena. Expanding upon our previous work, which explored sneak-path currents in chimera states, this study investigates the impact of fabricated Silicon Nitride (SiN) devices on the dynamics of Chua circuit (CC) networks. We conducted experimental evaluations to confirm the ability of SiN devices to retain their resistance state, thereby ensuring consistency in the crossbar array, a critical factor in maintaining chimera states during experiments. We employed an exponential memristor model to further investigate the non-linear dynamics within the CC network. Our results not only confirm the formation of various synchronization structures, such as chimera states and full chaotic synchronization, but also reveal the intriguing formation of phase-lag structures. These structures, induced by the SiN-fitted model, exhibit distinctive characteristics marked by subtle and non-linear coupling behaviors, particularly evident at near-zero voltages. After analyzing our results, we present a comprehensive phase-parametric regime map, obtained by varying the coupling strength bifurcation parameter. This map provides valuable insights into the mechanisms governing the dynamics of CC networks equipped with SiN-based memristor nanodevices, which have proven capable of capturing the complex dynamics of chimera states.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 33-37.
Citations: 0
A 24V-to-1V Triple Series-Capacitor Buck Converter With Low-Voltage Power Switches
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-23 DOI: 10.1109/TCSII.2024.3465832
Weidong Xue;Xinwei Yu;Yisen Zhang;Junyan Ren
This brief introduces a 24V-to-1V hybrid converter with low-voltage power switches. By incorporating three flying capacitors in series with a two-phase switched-inductor, the proposed converter effectively functions as a 4.8V-to-1V converter. Thus, a high-voltage design is realized with low-voltage transistors known for their excellent figure of merit. As a result, the hybrid converter achieves an effective duty cycle five times higher and a better conversion efficiency compared with a double step-down converter (DSD) and a double series-capacitor converter (DSCBC) under identical power switch sizes and working conditions. The circuit is designed using a 0.15-$\mu$m Bipolar-CMOS-DMOS (BCD) process to achieve a conversion from 24 V to 1 V. It can support a load current range of 200 mA to 6 A, with each sub-converter operating at a frequency of 1 MHz. Post-layout simulations show a peak power efficiency of 90.07% under a 2-A current load, with an efficiency of 85% even at a load of 6 A.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 308-312.
Citations: 0
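The arithmetic behind the converter abstract above can be made concrete with a short worked example. It assumes the ideal lossless buck relation D = V_out / V_in and compares the effective 4.8-V stage against a plain single-stage 24-V buck; this is not the DSD/DSCBC comparison the authors report. The helper names are hypothetical, and the loss figures simply follow from the quoted efficiencies.

```python
V_IN = 24.0      # input voltage from the abstract
V_OUT = 1.0      # output voltage from the abstract
DIVISION = 5     # 24 V / 4.8 V, the effective input division implied by the abstract


def ideal_buck_duty(v_in, v_out):
    """Duty cycle of an ideal (lossless) buck stage; an assumed textbook relation."""
    return v_out / v_in


def power_loss_w(v_out, i_load, efficiency):
    """Dissipated power implied by a given efficiency at a given load."""
    p_out = v_out * i_load
    return p_out / efficiency - p_out


v_eff = V_IN / DIVISION                      # effective input seen by the buck stage
duty_direct = ideal_buck_duty(V_IN, V_OUT)   # plain 24-V buck
duty_eff = ideal_buck_duty(v_eff, V_OUT)     # 4.8-V effective stage

loss_2a = power_loss_w(V_OUT, 2.0, 0.9007)   # peak-efficiency point
loss_6a = power_loss_w(V_OUT, 6.0, 0.85)     # full-load point

print(f"Effective input: {v_eff:.1f} V")
print(f"Duty cycle: {duty_direct:.3f} (direct) vs {duty_eff:.3f} (effective)")
print(f"Implied loss: {loss_2a:.3f} W at 2 A, {loss_6a:.3f} W at 6 A")
```

The wider duty cycle (about 21% instead of about 4%) is what relaxes the on-time requirement at a 1-MHz switching frequency.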
An M-Metric Readout Circuit for MLC Phase-Change Memory With a Comparator-Based Push-Pull Bit-Line Driver
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-23 DOI: 10.1109/TCSII.2024.3465888
Ji-Wook Kwon;Dong-Hwan Jin;Min-Jae Seo;Seung-Tak Ryu
This brief introduces a multi-level phase-change memory (PCM) readout circuit that realizes a true M-metric readout scheme that inherently has a wide dynamic input range. In order to overcome the limited readout speed of a basic M-metric scheme that draws a small current through a PCM cell over a large bit-line capacitance and senses the voltage, we propose an opamp-less M-metric readout circuit that drives the bit-line in a successive approximation manner with a comparator-based push-pull driver (CPPD). The bit-line driving speed of the proposed readout circuit is comparable with that of a conventional voltage driver, but the power consumption is greatly reduced owing to the absence of a power-hungry opamp. The prototype design achieves a full 6-bit linearity and 245-µW power consumption at a 270-ns readout speed.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 71, no. 11, pp. 4658-4662.
Citations: 0
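The "successive approximation" idea mentioned in the PCM readout abstract above can be illustrated with a generic binary search driven by a single comparator decision per step. This is a textbook sketch, not the authors' comparator-based push-pull driver: the normalized target value, the `sar_search` name, and the ideal comparator are all assumptions. The last lines estimate the energy per readout as power times readout time, which assumes the quoted 245 µW is the average power over the 270-ns read.

```python
def sar_search(target, n_bits=6, full_scale=1.0):
    """Generic successive-approximation search.

    Each iteration tries setting one bit (MSB first), compares the resulting
    level with the target, and keeps the bit only if the level does not
    exceed the target. Returns the final n_bits-wide code.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        level = trial * full_scale / (1 << n_bits)
        if target >= level:  # idealized comparator decision
            code = trial
    return code


# Example: resolve a normalized level to a 6-bit code in six comparisons.
print(sar_search(0.5))    # mid-scale input
print(sar_search(0.99))   # near full scale

# Energy per readout implied by the abstract's numbers (assumed average power).
energy_pj = 245e-6 * 270e-9 * 1e12
print(f"Energy per readout: about {energy_pj:.1f} pJ")
```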
Efficient Number Theoretic Transform Architecture for CRYSTALS-Kyber
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-20 DOI: 10.1109/TCSII.2024.3465273
Khalid Javeed;David Gregg
The Number Theoretic Transform (NTT) is a central primitive to compute polynomial multiplication in a finite ring for both post-quantum cryptography (PQC) and fully homomorphic encryption (FHE) schemes. This brief presents a novel, efficient NTT hardware architecture suitable for CRYSTALS-Kyber, one of the NIST PQC standards. It is based on a novel unified butterfly unit (UBU) developed by combining interleaved multiplication, radix-4, and resource-sharing strategies. This unit computes all butterfly operations for any generic prime modulus value and is re-configurable to any modulus length. In the proposed NTT architecture, multiple UBUs are deployed, demonstrating an area-time tradeoff. UBU and NTT architectures are synthesized and implemented over the Xilinx Artix-7 FPGA platform and results are shown for different performance evaluation metrics. The implementation results show our lightweight and high-speed designs achieve up to $5.6\times$ and $7\times$ improvements in resource consumption and efficiency, respectively. To the authors’ knowledge, it is the first generic NTT architecture based on interleaved multiplication approaches.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 263-267.
Citations: 0
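To show what "polynomial multiplication via the NTT" means in the Kyber abstract above, here is a minimal software sketch. It uses the Kyber modulus q = 3329 and the fact that 17 is a primitive 256th root of unity modulo q, but it is deliberately simplified: it is a recursive radix-2 transform computing a cyclic product (mod x^n - 1) on short vectors, whereas Kyber itself works modulo x^256 + 1 with an incomplete NTT, and the brief describes a radix-4 hardware butterfly. Function names are illustrative, and `pow(x, -1, q)` requires Python 3.8 or later.

```python
Q = 3329    # Kyber prime modulus
ZETA = 17   # primitive 256th root of unity modulo Q


def ntt(a, omega, q=Q):
    """Recursive radix-2 NTT; len(a) must be a power of two and omega a
    primitive len(a)-th root of unity modulo q."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q                    # twiddle multiplication
        out[k] = (even[k] + t) % q            # butterfly: sum
        out[k + n // 2] = (even[k] - t) % q   # butterfly: difference
        w = w * omega % q
    return out


def intt(a_hat, omega, q=Q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1."""
    n = len(a_hat)
    a = ntt(a_hat, pow(omega, -1, q), q)
    n_inv = pow(n, -1, q)
    return [x * n_inv % q for x in a]


def cyclic_mul(f, g, q=Q):
    """Multiply f and g modulo (x^n - 1, q) using pointwise products in the NTT domain."""
    n = len(f)
    omega = pow(ZETA, 256 // n, q)   # primitive n-th root, n a power of two <= 256
    f_hat, g_hat = ntt(f, omega, q), ntt(g, omega, q)
    return intt([x * y % q for x, y in zip(f_hat, g_hat)], omega, q)


# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(cyclic_mul([1, 2, 0, 0, 0, 0, 0, 0], [3, 4, 0, 0, 0, 0, 0, 0]))
```

The inner loop body is the butterfly operation; a hardware design such as the one described instantiates several such units in parallel to trade area against latency.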
Min-Pooling Cost Aggregation for Semi-Global Matching of Stereo Vision Processor
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-18 DOI: 10.1109/TCSII.2024.3463200
Wenyue Zhang;Pingcheng Dong;Lei Chen;Zhengyu Ma;Fengwei An
Semi-global matching (SGM) is a low-cost method suitable for hardware implementation, but it suffers from significant memory consumption. This brief presents a stereo-vision processor that leverages a min-pooling cost aggregation method for SGM. The min-pooling method addresses this issue by eliminating redundant values and employing an up-sampling technique to restore the original size without requiring clock domain crossing. As a result, this method effectively reduces memory usage by almost half, leading to a significant improvement in large-scale depth measurement. The experimental results demonstrate that the min-pooling method enhances the continuity of disparity maps, particularly in areas with less texture, by capturing more global information and reducing noise and discontinuities. Evaluations on the Middlebury and KITTI datasets show an average accuracy of 12.19% and 5.3%, respectively, indicating a more pronounced impact on the Middlebury dataset. Resource utilization analysis reveals a 1.6-fold increase in LUT usage and a 1.5-fold increase in register usage with min-pooling, while memory usage is reduced by 41.2% compared to the method without min-pooling.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 258-262.
Citations: 0
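The general idea of min-pooling followed by up-sampling, as mentioned in the SGM abstract above, can be sketched on a one-dimensional list of matching costs. The abstract does not say along which axis the pooling is applied, what the pooling factor is, or which up-sampling method is used, so the factor of 2, the nearest-neighbour up-sampling, and the function names below are all assumptions chosen only to illustrate how the stored data shrinks by about half.

```python
def min_pool(costs, k=2):
    """Keep the minimum of each group of k consecutive costs (assumed pooling)."""
    return [min(costs[i:i + k]) for i in range(0, len(costs), k)]


def upsample(pooled, k=2, length=None):
    """Nearest-neighbour up-sampling back to the original length (assumed method)."""
    out = [v for v in pooled for _ in range(k)]
    return out if length is None else out[:length]


costs = [5, 3, 8, 1, 4, 4]                 # hypothetical aggregated costs
pooled = min_pool(costs)                   # half as many values to store
restored = upsample(pooled, length=len(costs))

print(pooled)     # values kept in memory
print(restored)   # values restored for the next processing step
```

With a factor of 2, the pooled list holds half as many entries as the input, which is in line with the "almost half" memory saving the abstract reports.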
A 10.23-bit ENOB 1 kS/s Differential VCO-Based ADC With Resistive Input Stage in Low-Temperature Poly-Silicon TFT Technology
IF 4.4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-17 DOI: 10.1109/tcsii.2024.3462819
Yuqing Lou, Hanbo Zhang, Jun Li, Chen Lin, Leilai Shao, Xiaojun Guo, Yongfu Li, Guoxing Wang, Fakhrul Rokhani, Jian Zhao
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 47, no. 1 (as listed); pages not given.
Citations: 0
An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications
IF 4 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-17 DOI: 10.1109/TCSII.2024.3462560
Rujian Cao;Zhongyu Zhao;Ka-Fai Un;Wei-Han Yu;Rui P. Martins;Pui-In Mak
Dataflow management provides limited performance improvement to the transformer model due to its lesser weight reuse than the convolution neural network. The cosFormer reduced computational complexity while achieving comparable performance to the vanilla transformer for natural language processing tasks. However, the unstructured sparsity in the cosFormer makes it a challenge to be implemented efficiently. This brief proposes a parallel unstructured sparsity handling (PUSH) scheme to compute sparse-dense matrix multiplication (SDMM) efficiently. It transforms unstructured sparsity into structured sparsity and reduces the total memory access by balancing the memory accesses of the sparse and dense matrices in the SDMM. We also employ unstructured weight pruning cooperating with PUSH to further increase the structured sparsity of the model. Through verification on an FPGA platform, the proposed accelerator achieves a throughput of 2.82 TOPS and an energy efficiency of 144.8 GOPS/W for the HotpotQA dataset with long sequences.
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 71, no. 11, pp. 4688-4692.
Citations: 0
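For readers unfamiliar with sparse-dense matrix multiplication (SDMM), the operation named in the accelerator abstract above, the sketch below shows a plain software version using the common compressed sparse row (CSR) layout. It is not the PUSH scheme and makes no attempt to balance memory accesses; the CSR choice and the function names are assumptions for illustration only.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR: (values, column indices, row pointers)."""
    values, cols, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr


def sdmm(csr, dense_b):
    """Multiply a CSR sparse matrix by a dense matrix, skipping zero entries."""
    values, cols, row_ptr = csr
    n_cols = len(dense_b[0])
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0] * n_cols
        # Only the stored non-zeros of row r touch rows of the dense matrix.
        for idx in range(row_ptr[r], row_ptr[r + 1]):
            v = values[idx]
            b_row = dense_b[cols[idx]]
            for j in range(n_cols):
                acc[j] += v * b_row[j]
        out.append(acc)
    return out


sparse_a = [[0, 2, 0],
            [1, 0, 3]]
dense_b = [[1, 2],
           [3, 4],
           [5, 6]]
result = sdmm(to_csr(sparse_a), dense_b)
print(result)
```

Because the non-zero positions are irregular, which rows of the dense matrix are fetched varies from row to row; this irregular access pattern is the kind of inefficiency that converting unstructured sparsity into structured sparsity is meant to address.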