MCNet: A unified multi-center graph convolutional network based on skeletal behavior recognition

Impact factor: 6.2; Zone 2 (Engineering and Technology); JCR Q1, Engineering, Multidisciplinary; Alexandria Engineering Journal; Pub Date: 2025-02-12; DOI: 10.1016/j.aej.2025.01.118
Haiping Zhang, Xinhao Zhang, Dongjing Wang, Fuxing Zhou, Junfeng Yan
Alexandria Engineering Journal, vol. 120, pp. 116-127. Journal Article. https://www.sciencedirect.com/science/article/pii/S1110016825001462
Citations: 0

Abstract

Skeletal data is stable and computationally efficient, which makes it a popular choice for video action recognition. Existing skeleton-based behavior recognition methods built on graph convolutional networks (GCNs) have made progress, but their fixed graph structure and the lack of interaction between objects in the dataset leave traditional models inflexible when recognizing highly similar actions, which degrades final performance. To address these issues, we propose MCNet, a unified multi-center graph convolutional network for skeletal behavior recognition. Actions with a large movement amplitude shift the center of the human body. To recognize such actions, we propose a multi-center training approach in which three centers are defined when constructing the topology graph, and a Multi-Center Data Selector (MCDS) differentiates and selects among these centers, making the recognition task more adaptable. Because some action categories are easily confused with one another, we also propose a multi-modal training scheme that uses a large-scale language model as a knowledge engine to provide textual descriptions of global actions under the different centers, which helps distinguish similar actions and further improves recognition. Finally, an attention module aggregates the features of a multi-scale adjacency matrix along the channel dimension.

To verify the effectiveness of the proposed network, we conducted a series of ablation experiments and model analyses on three datasets and compared the model with other state-of-the-art models, including CTR-GCN, Info-GCN, and STF. The results show that the proposed model reaches state-of-the-art (SOTA) performance. On the NTU RGB+D 60 dataset, MCNet outperforms the CTR-GCN baseline by 0.6% on X-Sub and 0.3% on X-View. On the NTU RGB+D 120 dataset, the gain is larger, with improvements of up to 0.8% on the X-Sub and X-Set benchmarks.
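
The abstract does not say which joints serve as the three centers or how the topology graph is built from them. As a rough illustration only, the sketch below shows one way a skeleton graph could be partitioned relative to a chosen center, using the identity / inward / outward split that is common in skeleton GCN code. The 7-joint toy skeleton, the three center joints, and the function names `hop_distances` and `center_partitions` are all assumptions made for this example, not details taken from MCNet.

```python
from collections import deque

# Toy 7-joint skeleton (hypothetical; NOT the NTU RGB+D joint layout).
# 0 = hip, 1 = spine, 2 = chest, 3 = head, 4 = left hand, 5 = right hand, 6 = foot
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (0, 6)]
NUM_JOINTS = 7


def hop_distances(center, edges, num_joints):
    """Breadth-first search: number of bones between `center` and each joint."""
    neighbors = {j: [] for j in range(num_joints)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = {center: 0}
    queue = deque([center])
    while queue:
        j = queue.popleft()
        for k in neighbors[j]:
            if k not in dist:
                dist[k] = dist[j] + 1
                queue.append(k)
    return [dist[j] for j in range(num_joints)]


def center_partitions(center, edges, num_joints):
    """Return (identity, inward, outward) adjacency matrices relative to `center`.

    Rows index the joint that receives information, columns the joint it reads from.
    """
    dist = hop_distances(center, edges, num_joints)

    def zeros():
        return [[0] * num_joints for _ in range(num_joints)]

    identity, inward, outward = zeros(), zeros(), zeros()
    for j in range(num_joints):
        identity[j][j] = 1
    for a, b in edges:
        # In a tree, the two ends of a bone always differ by exactly one hop.
        near, far = (a, b) if dist[a] < dist[b] else (b, a)
        inward[far][near] = 1   # farther joint reads from the joint nearer the center
        outward[near][far] = 1  # nearer joint reads from the joint farther away
    return identity, inward, outward


# Three candidate centers, chosen here purely for illustration.
CENTERS = {"hip": 0, "spine": 1, "chest": 2}
graphs = {name: center_partitions(c, EDGES, NUM_JOINTS) for name, c in CENTERS.items()}
```

Changing the center flips the direction of every bone that lies between the old and new center, so each center yields a different pair of inward and outward matrices over the same skeleton. A selector such as the MCDS described above would then have to choose among such graphs per sample; how it does so is not stated in the abstract and is not modeled here.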
Source journal: Alexandria Engineering Journal (Engineering: General Engineering)
CiteScore: 11.20
Self-citation rate: 4.40%
Articles published: 1015
Review time: 43 days
About the journal: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering