Motion Embedded Images: An Approach to Capture Spatial and Temporal Features for Action Recognition

IF 3.3; CAS Zone 4 (Computer Science); JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS); Informatica; Pub Date: 2023-08-29; DOI: 10.31449/inf.v47i3.4755
Tri Le, Nham Huynh-Duc, Chung Thai Nguyen, Minh-Triet Tran

Abstract

The demand for human activity recognition (HAR) from videos has surged across real-life applications, including video surveillance, healthcare, and elderly care. The explosion of short-form videos on social media platforms has further intensified interest in this domain. This research focuses on the problem of HAR in general short videos. Unlike still images, video clips carry both spatial and temporal information, which makes it challenging to extract the complementary information on appearance from still frames and on motion between frames. This research makes a two-fold contribution. First, we investigate the use of motion-embedded images in a variant of the two-stream Convolutional Neural Network architecture, in which one stream captures motion from combined batches of frames, while the other stream uses a standard image classification ConvNet to classify static appearance. Second, we create a novel dataset of Southeast Asian sports short videos that includes videos both with and without effects, a modern factor lacking in all currently available datasets used for benchmarking models. The proposed model is trained and evaluated on two benchmarks: UCF-101 and SEAGS-V1. The results show that the proposed model yields competitive performance compared with prior attempts to address the same problem.
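
The abstract does not say how a batch of frames is combined into a motion-embedded image or how the two streams' outputs are merged, so the following is only a minimal sketch under stated assumptions. The names `motion_embedded_image` and `fuse_scores` are hypothetical, the mean absolute frame difference is one possible encoding rather than the authors' method, and the weighted average stands in for whatever fusion the paper actually uses.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (illustrative type only).
Frame = List[List[float]]


def motion_embedded_image(frames: Sequence[Frame]) -> Frame:
    """Collapse a batch of frames into one image (assumed encoding).

    Each output pixel is the mean absolute difference between consecutive
    frames, so static regions stay at 0 and moving regions get larger values.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure motion")
    height, width = len(frames[0]), len(frames[0][0])
    out = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                out[y][x] += abs(curr[y][x] - prev[y][x])
    steps = len(frames) - 1
    return [[value / steps for value in row] for row in out]


def fuse_scores(appearance: Sequence[float], motion: Sequence[float],
                motion_weight: float = 0.5) -> List[float]:
    """Late fusion (assumed): weighted average of per-class scores."""
    if len(appearance) != len(motion):
        raise ValueError("both streams must score the same classes")
    return [(1.0 - motion_weight) * a + motion_weight * m
            for a, m in zip(appearance, motion)]
```

In this sketch the appearance stream would score a single still frame and the motion stream would score the output of `motion_embedded_image`; both classifiers are left out because the abstract does not describe them.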