Human Activity Recognition Utilizing Ensemble of Transfer-Learned Attention Networks and a Low-Cost Convolutional Neural Architecture

Azmain Yakin Srizon, S. Hasan, Md. Farukuzzaman Faruk, Abu Sayeed, Md. Ali Hossain
DOI: 10.1109/ICCIT57492.2022.10055456
Published in: 2022 25th International Conference on Computer and Information Technology (ICCIT), 2022-12-17
Citations: 1

Abstract

Throughout the last decades, human activity recognition has been considered one of the most complex tasks in computer vision. Many earlier works proposed machine learning models that recognize human actions from sensor-based or video-based data, both of which are costly to acquire. Recent advances in convolutional neural networks (CNNs) have opened the possibility of accurate human activity recognition from still images. Although several deep learning-based approaches have addressed the problem, the high diversity of human actions has kept them from performing well across all action classes under consideration. Some researchers have argued that an ensemble of different models may work better in this regard. However, because the images in this domain are mostly captured by security cameras, deep models often fail to extract valuable features, resulting in misclassifications. To resolve these issues, this study considers three transfer-learned models, i.e., DenseNet201, Xception, and EfficientNetB6, and applies a multichannel attention module to extract more distinguishable features. Moreover, a custom-made low-cost CNN is proposed that works on small images, extracting features that often get lost in deep computation. Finally, the fusion of features extracted by the attention-based transfer-learned models and the low-cost CNN is used for the final prediction. We validated the proposed ensemble model on the Stanford 40 Actions, BU-101, and Willow datasets, where it achieved 97.48%, 98.29%, and 94.19% overall accuracy respectively, outperforming previous results by notable margins.
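The pipeline the abstract describes is a late fusion: each backbone's feature maps pass through a channel-attention module, are pooled into a feature vector, and the vectors from all branches (three deep backbones plus the low-cost CNN) are concatenated before a shared classification head. The shape of that data flow can be sketched in NumPy. Note this is a minimal illustration, not the paper's implementation: all weights are random placeholders, the branch widths are invented, and the SE-style squeeze-and-excite gating is an assumed form of the "multichannel attention module".

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(feat, reduction=4):
    """SE-style channel attention: squeeze (global average over H, W),
    excite (two small dense layers), then reweight channels.
    Weights are random stand-ins, not trained parameters."""
    c = feat.shape[-1]
    squeezed = feat.mean(axis=(0, 1))             # (C,)
    w1 = rng.standard_normal((c, c // reduction))
    w2 = rng.standard_normal((c // reduction, c))
    hidden = np.maximum(squeezed @ w1, 0)         # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gate in (0, 1)
    return feat * gate                            # broadcast over H and W

def extract_features(image, channels, out_dim):
    """Stand-in for one branch (e.g. a transfer-learned DenseNet201):
    lift RGB to feature maps, apply attention, pool, and project."""
    lift = rng.standard_normal((3, channels))     # random 1x1 "conv"
    feat = np.maximum(image @ lift, 0)            # (H, W, C) feature maps
    pooled = channel_attention(feat).mean(axis=(0, 1))  # global average pool
    proj = rng.standard_normal((channels, out_dim))
    return pooled @ proj

# A 64x64 RGB "image" stands in for a real input frame.
image = rng.standard_normal((64, 64, 3))

# Three attention-augmented backbones plus the narrower low-cost CNN branch
# (256/256/256/64 are illustrative widths, not the paper's).
branches = [extract_features(image, 32, d) for d in (256, 256, 256, 64)]
fused = np.concatenate(branches)                  # late fusion by concatenation

# Softmax head over e.g. the 40 Stanford action classes.
logits = fused @ rng.standard_normal((fused.size, 40))
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(fused.shape, probs.shape)
```

The design point the sketch captures is that fusion happens at the feature level, so the shallow branch can contribute fine-grained cues the deep backbones discard, while the shared head learns how to weight all branches jointly.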