Spike-HAR++: an energy-efficient and lightweight parallel spiking transformer for event-based human action recognition.

Frontiers in Computational Neuroscience (IF 2.1; Zone 4, Medicine; JCR Q2, Mathematical & Computational Biology). Pub Date: 2024-11-26. eCollection Date: 2024-01-01. DOI: 10.3389/fncom.2024.1508297

Xinxu Lin, Mingxuan Liu, Hong Chen

Frontiers in Computational Neuroscience, vol. 18, article 1508297. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11628275/pdf/

Abstract

Event-based cameras are well suited to human action recognition (HAR): they provide movement perception with high dynamic range, high temporal resolution, high power efficiency, and low latency. Spiking neural networks (SNNs) are naturally suited to the asynchronous, sparse data from event cameras because of their spike-based, event-driven paradigm, and they consume less power than artificial neural networks. In this paper, we propose two end-to-end SNNs, Spike-HAR and Spike-HAR++, that introduce the spiking transformer into event-based HAR. Spike-HAR includes two novel blocks: a spike attention branch, which enables the model to focus on regions with high spike rates and so reduces the impact of noise and improves accuracy, and a parallel spike transformer block with a simplified spiking self-attention mechanism, which increases computational efficiency. To better extract crucial information from high-level features, we modify the architecture of the spike attention branch in Spike-HAR and extend it to a higher dimension, yielding Spike-HAR++, which further enhances classification performance. Comprehensive experiments on four HAR datasets (SL-Animals-DVS, N-LSA64, DVS128 Gesture, and DailyAction-DVS) demonstrate the superior performance of the proposed models. Additionally, Spike-HAR and Spike-HAR++ require only 0.03 and 0.06 mJ, respectively, to process a sequence of event frames, with model sizes of only 0.7 M and 1.8 M. This efficiency positions them as a promising new SNN baseline for the HAR community. Code is available at Spike-HAR++.
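The abstract's claim that the spike-based, event-driven paradigm lowers power consumption can be illustrated with a generic leaky integrate-and-fire (LIF) neuron. The sketch below is not the Spike-HAR architecture: the function names `lif_run` and `count_accumulates`, the parameters `tau`, `v_threshold`, `v_reset`, and `fan_out`, and the input values are all assumptions chosen for illustration. It shows only that downstream work is triggered by emitted spikes rather than by every time step.

```python
def lif_run(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one generic LIF neuron over a sequence of input currents.

    Illustrative only: parameter names and defaults are assumptions,
    not values taken from the paper.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: the membrane potential moves toward the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)  # emit a spike
            v = v_reset       # hard reset after firing
        else:
            spikes.append(0)  # stay silent this step
    return spikes


def count_accumulates(spikes, fan_out):
    """Count accumulate operations triggered downstream.

    In an event-driven layer only emitted spikes cause work, so the count
    scales with the number of spikes, not with the number of time steps.
    """
    return sum(spikes) * fan_out


if __name__ == "__main__":
    currents = [0.0, 0.0, 3.0, 0.0, 3.0, 3.0]  # hypothetical input currents
    spikes = lif_run(currents)
    event_driven_ops = count_accumulates(spikes, fan_out=4)
    dense_ops = len(currents) * 4  # a dense layer would do work at every step
    print(spikes, event_driven_ops, dense_ops)
```

With sparse inputs, the neuron is silent for most steps, so the event-driven operation count stays below the dense count. The energy figures quoted in the abstract come from the authors' own measurements and are not reproduced by this sketch.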

Source journal: Frontiers in Computational Neuroscience (Mathematical & Computational Biology; Neurosciences)

CiteScore: 5.30

Self-citation rate: 3.10%

Articles published: 166

Review time: 6-12 weeks

About the journal: Frontiers in Computational Neuroscience is a first-tier electronic journal devoted to promoting theoretical modeling of brain function and fostering interdisciplinary interactions between theoretical and experimental neuroscience. Progress in understanding the amazing capabilities of the brain is still limited, and we believe that it will only come with deep theoretical thinking and mutually stimulating cooperation between different disciplines and approaches. We therefore invite original contributions on a wide range of topics that present the fruits of such cooperation, or provide stimuli for future alliances. We aim to provide an interactive forum for cutting-edge theoretical studies of the nervous system, and for promulgating the best theoretical research to the broader neuroscience community. Models of all styles and at all levels are welcome, from biophysically motivated realistic simulations of neurons and synapses to high-level abstract models of inference and decision making. While the journal is primarily focused on theoretically based and driven research, we welcome experimental studies that validate and test theoretical conclusions.