Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms
Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian
arXiv:2408.09764, published 19 August 2024
Abstract
Human Action Recognition (HAR) is a pivotal research domain in both computer vision and artificial intelligence, with RGB cameras dominating as the preferred tool for investigation and innovation in the field. In real-world applications, however, RGB cameras face numerous challenges, including difficult lighting conditions, fast motion, and privacy concerns. Consequently, bio-inspired event cameras have garnered increasing attention owing to advantages such as low energy consumption and high dynamic range. Nevertheless, most existing event-based HAR datasets are low resolution ($346 \times 260$). In this paper, we propose a large-scale, high-definition ($1280 \times 800$) human action recognition dataset captured with the CeleX-V event camera, termed CeleX-HAR. It covers 150 commonly occurring action categories and comprises a total of 124,625 video sequences. Factors such as multiple viewpoints, illumination, action speed, and occlusion were considered when recording the data. To build a more comprehensive benchmark, we report results for over 20 mainstream HAR models as baselines for future comparison. In addition, we propose a novel Mamba-based vision backbone for event-stream HAR, termed EVMamba, which combines multi-directional scanning over the spatial plane with a novel voxel temporal scanning mechanism. By encoding and mining the spatio-temporal information of event streams, EVMamba achieves favorable results across multiple datasets. Both the dataset and source code will be released at \url{https://github.com/Event-AHU/CeleX-HAR}.
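The abstract only names EVMamba's two scanning mechanisms, so the PyTorch sketch below illustrates one plausible reading of them: binning raw events into a temporal voxel grid (a scan over time) and flattening the resulting spatial grid into token sequences along several directions. All function names, shapes, and the binning scheme are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch (not the authors' code) of the two scanning ideas the
# abstract describes. Shapes and names (num_bins, H, W, etc.) are assumptions.
import torch

def voxel_temporal_scan(events: torch.Tensor, num_bins: int = 8,
                        H: int = 800, W: int = 1280) -> torch.Tensor:
    """Accumulate raw events (N, 4) with columns (t, x, y, polarity) into a
    (num_bins, H, W) voxel grid, i.e. a sequence scanned over time."""
    t, x, y, p = events.unbind(-1)
    t_norm = (t - t.min()) / (t.max() - t.min() + 1e-9)   # normalize timestamps
    b = (t_norm * (num_bins - 1)).long()                  # temporal bin index
    grid = torch.zeros(num_bins, H, W)
    # Signed accumulation: polarity {0, 1} mapped to {-1, +1}.
    grid.index_put_((b, y.long(), x.long()), 2 * p - 1, accumulate=True)
    return grid  # each bin is one temporal "token" for the backbone

def spatial_multi_directional_scan(feat: torch.Tensor) -> list[torch.Tensor]:
    """Flatten a (C, H, W) feature map into 1-D token sequences along four
    scan directions, as a Mamba-style backbone might consume them."""
    row_major = feat.flatten(1)                  # left-to-right, top-to-bottom
    col_major = feat.transpose(1, 2).flatten(1)  # top-to-bottom, left-to-right
    return [row_major,            # (C, H*W)
            row_major.flip(-1),   # reversed row-major
            col_major,            # column-major
            col_major.flip(-1)]   # reversed column-major

# Toy usage: 1000 random events on the CeleX-V's 1280 x 800 sensor.
ev = torch.rand(1000, 4) * torch.tensor([1.0, 1279, 799, 1])
ev[:, 3] = ev[:, 3].round()                     # polarity in {0, 1}
voxels = voxel_temporal_scan(ev)                # (8, 800, 1280)
scans = spatial_multi_directional_scan(voxels)  # four (8, 800*1280) sequences
```

In an actual Mamba backbone, each flattened sequence would be fed through a selective state-space block and the direction-wise outputs merged; this sketch only covers the tokenization step that the scanning mechanisms provide.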