An Efficient Brain-Inspired Accelerator Using a High-Accuracy Conversion Algorithm for Spiking Deformable CNN
Haonan Zhang; Siyu Zhang; Wendong Mao; Zhongfeng Wang
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 1, pp. 288-292, published 2024-10-28
DOI: 10.1109/TCSII.2024.3487266
https://ieeexplore.ieee.org/document/10737095/
Citations: 0
Abstract
Spiking Neural Networks (SNNs), inspired by the brain, have shown promising potential for low-power deployment on resource-constrained devices. An SNN can be obtained in two ways: training from scratch or conversion from an existing Artificial Neural Network (ANN). However, directly training an SNN often leads to suboptimal accuracy, so methods based on converting an existing ANN have become the preferred choice for achieving high accuracy. To enhance the feature-capturing capability of converted SNNs, various operations, such as transposed convolution and deformable convolution, have been introduced, which pose multiple challenges for conversion algorithms and hardware designs. In this brief, we propose a universal SNN conversion method for deformable convolution to enhance the modeling capability of receptive fields for spatial information. The proposed conversion algorithm not only maintains high accuracy but also makes the converted deformable convolutions highly hardware-efficient. Building upon the deformable SNN, we develop a low-complexity processing element and computing array, enabling flexible execution of complex and heterogeneous operations within deformable SNNs without requiring any multipliers. In addition, the overall architecture, with an energy-efficient dataflow, is designed for our deformable SNN model and implemented in the TSMC 28-nm HPC+ technology node. Experiments show that the proposed conversion algorithm suffers negligible accuracy degradation on the challenging object detection task. The accelerator achieves at least $1.2\times$ higher energy efficiency than previous designs while maintaining 47.9% mAP.
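To illustrate why spike-based computation can run without multipliers, here is a minimal sketch of a generic integrate-and-fire layer under rate coding. It is not the conversion algorithm or the processing element from the brief: the function name `if_neuron_layer`, the threshold value, and the reset-by-subtraction rule are all assumptions chosen for illustration. Because each input spike is 0 or 1, the weighted sum reduces to adding the weights of the inputs that fired.

```python
def if_neuron_layer(spike_trains, weights, threshold=1.0):
    """Simulate a layer of integrate-and-fire neurons (illustrative only).

    spike_trains: list over time steps; each entry is a list of 0/1 input spikes.
    weights:      weights[j][i] connects input i to output neuron j.
    Returns the number of output spikes emitted by each neuron.
    """
    n_out = len(weights)
    membrane = [0.0] * n_out   # membrane potential per output neuron
    counts = [0] * n_out       # output spike count per neuron

    for spikes in spike_trains:
        for j in range(n_out):
            # Binary spikes: accumulate the weight only where a spike
            # occurred. No multiplication is needed.
            for i, s in enumerate(spikes):
                if s:
                    membrane[j] += weights[j][i]
            # Fire and reset by subtraction (an assumed reset rule).
            if membrane[j] >= threshold:
                membrane[j] -= threshold
                counts[j] += 1
    return counts


# Hypothetical inputs: input 0 spikes every step (rate 1.0),
# input 1 spikes on even steps (rate 0.5), over T = 8 steps.
T = 8
trains = [[1, 1 if t % 2 == 0 else 0] for t in range(T)]
weights = [[0.5, 0.25],    # ANN pre-activation: 0.5*1.0 + 0.25*0.5 = 0.625
           [-0.5, 0.25]]   # ANN pre-activation is negative, ReLU gives 0
counts = if_neuron_layer(trains, weights)
rates = [c / T for c in counts]
print(counts, rates)
```

In this toy case the firing rates match the ReLU outputs of the corresponding ANN neurons, which is the basic intuition behind rate-based ANN-to-SNN conversion. Handling deformable convolution, as the brief does, involves more than this sketch covers.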
About the journal:
TCAS II publishes brief papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
Circuits: Analog, Digital and Mixed Signal Circuits and Systems
Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
Software for Analog-and-Logic Circuits and Systems
Control aspects of Circuits and Systems.