Wenhua Li, Enzeng Dong, Jigang Tong, Sen Yang, Zufeng Zhang, Wenyu Li
2023 IEEE International Conference on Mechatronics and Automation (ICMA), published 2023-08-06. DOI: 10.1109/ICMA57826.2023.10215860 (https://doi.org/10.1109/ICMA57826.2023.10215860)
Multilevel Decomposition Time Aggregation Graph Convolution Networks for Skeleton-Based Action Recognition
Skeleton-based human action recognition has become an active research topic because skeletal data is robust to problems that arise in complex environments, such as viewpoint changes and background interference. This robustness lets recognition methods focus on more specific and relevant features. We propose the multilevel decomposition time aggregation graph convolution network (MDT-GCN), which uses a multilevel graph convolution kernel to capture higher-order spatial dependencies between joints. This is achieved by decomposing the human topology graph into smaller graphs, each with its own graph convolution kernel. To further improve performance, we employ a two-stream framework and a channel topology refinement strategy. Experiments on the NTU RGB+D 60 and NTU RGB+D 120 datasets show that MDT-GCN outperforms previous algorithms and significantly improves action recognition accuracy.
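
The abstract does not say how the human topology graph is decomposed, so the following is only an illustrative sketch and not the authors' implementation. It assumes one common choice: split the skeleton graph by hop distance (level 0 = the joint itself, level 1 = direct neighbours, level 2 = joints two bones away), row-normalise each subgraph, and give each level its own kernel, summing A_k · X · W_k over levels. The function names, the 5-joint toy skeleton, and the kernel values are all hypothetical.

```python
from collections import deque


def hop_distances(num_joints, edges):
    """Shortest-path hop distance between every pair of joints (BFS)."""
    neighbors = [[] for _ in range(num_joints)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = [[-1] * num_joints for _ in range(num_joints)]
    for src in range(num_joints):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] < 0:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def decompose(num_joints, edges, max_hop):
    """Assumed decomposition: one row-normalised adjacency per hop level."""
    dist = hop_distances(num_joints, edges)
    subgraphs = []
    for k in range(max_hop + 1):
        adj = [[1.0 if dist[i][j] == k else 0.0 for j in range(num_joints)]
               for i in range(num_joints)]
        for row in adj:
            total = sum(row)
            if total > 0:
                row[:] = [value / total for value in row]
        subgraphs.append(adj)
    return subgraphs


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def multilevel_graph_conv(features, subgraphs, kernels):
    """Sum over levels of A_k @ X @ W_k, one kernel W_k per subgraph."""
    out = None
    for adj, weight in zip(subgraphs, kernels):
        term = matmul(matmul(adj, features), weight)
        if out is None:
            out = term
        else:
            out = [[o + t for o, t in zip(row_o, row_t)]
                   for row_o, row_t in zip(out, term)]
    return out


# Hypothetical 5-joint chain: hand - elbow - shoulder - neck - head.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
subgraphs = decompose(5, EDGES, max_hop=2)

# One 2-channel feature vector per joint, and one 2x2 kernel per level.
features = [[float(j), 1.0] for j in range(5)]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],   # level 0: self connection
    [[0.5, 0.0], [0.0, 0.5]],   # level 1: direct neighbours
    [[0.25, 0.0], [0.0, 0.25]], # level 2: two bones away
]
output = multilevel_graph_conv(features, subgraphs, kernels)
for joint, row in enumerate(output):
    print(joint, row)
```

In a trained network the kernels would be learned rather than fixed, and the layer would run per frame before any temporal aggregation. The sketch only shows how separate kernels per subgraph let joints that are not directly connected contribute to each other's features.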