Novel vision-LiDAR fusion framework for human action recognition based on dynamic lateral connection

Fei Yan, Guangyao Jin, Zheng Mu, Shouxing Zhang, Yinghao Cai, Tao Lu, Yan Zhuang

IET Cyber-Systems and Robotics, vol. 6, no. 4, 2024. DOI: 10.1049/csy2.70005 (https://onlinelibrary.wiley.com/doi/10.1049/csy2.70005)
Abstract
Over the past decades, substantial progress has been made in human action recognition. However, most existing studies and datasets utilise still images or videos as the primary modality, and image-based approaches are easily affected by adverse environmental conditions. In this paper, the authors propose combining RGB images with point clouds from LiDAR sensors for human action recognition. A dynamic lateral convolutional network (DLCN) is proposed to fuse the multi-modal features: within the DLCN, the RGB features and the geometric information from the point clouds interact closely, providing complementary cues for action recognition. Experimental results on the JRDB-Act dataset demonstrate that the proposed DLCN outperforms state-of-the-art approaches to human action recognition. The authors also show the potential of the DLCN in various complex scenarios, which is highly valuable for real-world applications.
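The abstract gives no implementation details, but the core idea (a two-stream network in which LiDAR point-cloud features are injected into the RGB stream through learned lateral connections) can be sketched. The PyTorch snippet below is a minimal, hypothetical illustration only: the encoder architectures, the gated fusion rule, the feature width `feat_dim`, and the class count are all assumptions, not the authors' DLCN.

```python
# A minimal sketch of two-stream vision-LiDAR fusion with a lateral connection.
# Illustrative only: encoders, gating, and dimensions are assumptions,
# not the DLCN described in the paper.
import torch
import torch.nn as nn


class LateralConnection(nn.Module):
    """Injects point-cloud features into the RGB stream via a learned gate."""

    def __init__(self, pc_dim: int, rgb_dim: int):
        super().__init__()
        self.project = nn.Linear(pc_dim, rgb_dim)  # align channel widths
        self.gate = nn.Sequential(nn.Linear(rgb_dim, rgb_dim), nn.Sigmoid())

    def forward(self, rgb_feat: torch.Tensor, pc_feat: torch.Tensor) -> torch.Tensor:
        pc_proj = self.project(pc_feat)   # (B, rgb_dim)
        g = self.gate(rgb_feat)           # input-dependent ("dynamic") gate
        return rgb_feat + g * pc_proj     # gated residual fusion


class TwoStreamFusion(nn.Module):
    def __init__(self, num_classes: int = 10, feat_dim: int = 256):
        super().__init__()
        # RGB stream: a tiny CNN stand-in for a real image/video backbone.
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Point-cloud stream: PointNet-style shared MLP over xyz coordinates.
        self.pc_encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.lateral = LateralConnection(feat_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, rgb: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        rgb_feat = self.rgb_encoder(rgb)                      # (B, feat_dim)
        pc_feat = self.pc_encoder(points).max(dim=1).values   # max-pool over points
        fused = self.lateral(rgb_feat, pc_feat)
        return self.classifier(fused)


# Example: a batch of 2 RGB frames (3x224x224) and 1024-point clouds (xyz).
model = TwoStreamFusion()
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 1024, 3))
print(logits.shape)  # torch.Size([2, 10])
```

The sigmoid gate makes the amount of point-cloud information mixed into the RGB stream depend on the current input, which is one plausible reading of a "dynamic" lateral connection; the actual mechanism in the paper may differ.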