Kinematic matrix: One-shot human action recognition using kinematic data structure

IF 7.5 | Zone 2, Computer Science | JCR Q1, Automation & Control Systems | Engineering Applications of Artificial Intelligence | Pub Date: 2024-10-31 | DOI: 10.1016/j.engappai.2024.109569
Mohammad Hassan Ranjbar, Ali Abdi, Ju Hong Park
Citations: 0

Abstract

One-shot action recognition, which refers to recognizing human-performed actions using only a single training example, holds significant promise in advancing video analysis, particularly in domains requiring rapid adaptation to new actions. However, existing algorithms for one-shot action recognition face multiple challenges, including high computational complexity, limited accuracy, and difficulties in generalization to unseen actions. To address these issues, we propose a novel kinematic-based skeleton representation that effectively reduces computational demands while enhancing recognition performance. This representation leverages skeleton locations, velocities, and accelerations to formulate the one-shot action recognition task as a metric learning problem, where a model projects kinematic data into an embedding space. In this space, actions are distinguished based on Euclidean distances, facilitating efficient nearest-neighbour searches among activity reference samples. Our approach not only reduces computational complexity but also achieves higher accuracy and better generalization compared to existing methods. Specifically, our model achieved a validation accuracy of 78.5%, outperforming state-of-the-art methods by 8.66% under comparable training conditions. These findings underscore the potential of our method for practical applications in real-time action recognition systems.
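
The abstract does not say how the kinematic matrix is laid out or which network produces the embedding, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a skeleton sequence is an array of shape `(frames, joints, coordinates)`, estimates velocities and accelerations by finite differences, and classifies a query by Euclidean nearest-neighbour search against one reference sample per action. The function names (`kinematic_features`, `embed`, `classify_one_shot`) are hypothetical, and `embed` is a simple time-averaging stand-in for the learned metric model, not the authors' architecture.

```python
import numpy as np


def kinematic_features(positions: np.ndarray, dt: float = 1.0) -> np.ndarray:
    """Stack joint positions, velocities, and accelerations.

    positions: array of shape (T, J, C) with T >= 2 frames, J joints, C coordinates.
    Returns an array of shape (T, J, 3 * C).
    Assumption: derivatives are approximated with finite differences over time.
    """
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    return np.concatenate([positions, velocity, acceleration], axis=-1)


def embed(features: np.ndarray) -> np.ndarray:
    """Placeholder embedding: average over time, then flatten.

    Assumption: in a metric-learning setup this would be a trained model;
    time-averaging is used here only so the sketch runs on its own.
    """
    return features.mean(axis=0).ravel()


def classify_one_shot(query: np.ndarray, references: dict) -> str:
    """Return the label of the reference whose embedding is closest to the query."""
    q = embed(kinematic_features(query))
    distances = {
        label: np.linalg.norm(q - embed(kinematic_features(ref)))
        for label, ref in references.items()
    }
    return min(distances, key=distances.get)


# Toy data: one reference sample per action (the "one shot").
T, J, C = 20, 2, 3
t = np.arange(T, dtype=float).reshape(T, 1, 1)
still = np.zeros((T, J, C))                   # no motion
move = 0.1 * t * np.ones((T, J, C))           # constant-velocity motion
query = move + 0.01                           # same motion, slightly offset

references = {"still": still, "move": move}
predicted = classify_one_shot(query, references)
```

Because each new action needs only one stored reference embedding, adding an action in this scheme means adding an entry to `references` rather than retraining a classifier head.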
Source journal: Engineering Applications of Artificial Intelligence (Engineering & Technology, Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.