{"title":"基于局部和非局部时间特征的快速动作识别","authors":"Zhiang Dong","doi":"10.1109/ICISCAE52414.2021.9590691","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a mixed time-asymmetric (MTA) CNN which uses time-asymmetric convolution to extract non-local temporal feature and uses normal convolution to extract local temporal features. With the fusion of local and non-local temporal feature, our MTA CNN can achieve better action recognition accuracy while keeping the network lightweight and fast. Specially, temporal feature fusion method is designed to replace the common global average pooling in our MTA CNN so as to obtain higher-dimensional feature vector and retain more information. Extensive experimental results demonstrate that our methods can achieve comparable results on Kinetics-400 and UCF101 among leading methods with less parameters and more faster recognition speed.","PeriodicalId":121049,"journal":{"name":"2021 IEEE 4th International Conference on Information Systems and Computer Aided Education (ICISCAE)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Fast Action Recognition Based on Local and Nonlocal Temporal Feature\",\"authors\":\"Zhiang Dong\",\"doi\":\"10.1109/ICISCAE52414.2021.9590691\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a mixed time-asymmetric (MTA) CNN which uses time-asymmetric convolution to extract non-local temporal feature and uses normal convolution to extract local temporal features. With the fusion of local and non-local temporal feature, our MTA CNN can achieve better action recognition accuracy while keeping the network lightweight and fast. Specially, temporal feature fusion method is designed to replace the common global average pooling in our MTA CNN so as to obtain higher-dimensional feature vector and retain more information. Extensive experimental results demonstrate that our methods can achieve comparable results on Kinetics-400 and UCF101 among leading methods with less parameters and more faster recognition speed.\",\"PeriodicalId\":121049,\"journal\":{\"name\":\"2021 IEEE 4th International Conference on Information Systems and Computer Aided Education (ICISCAE)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 4th International Conference on Information Systems and Computer Aided Education (ICISCAE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICISCAE52414.2021.9590691\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 4th International Conference on Information Systems and Computer Aided Education (ICISCAE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICISCAE52414.2021.9590691","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Fast Action Recognition Based on Local and Nonlocal Temporal Feature
In this paper, we propose a mixed time-asymmetric (MTA) CNN that uses time-asymmetric convolution to extract non-local temporal features and normal convolution to extract local temporal features. By fusing local and non-local temporal features, our MTA CNN achieves better action recognition accuracy while keeping the network lightweight and fast. In particular, a temporal feature fusion method is designed to replace the common global average pooling in our MTA CNN, yielding a higher-dimensional feature vector that retains more information. Extensive experimental results demonstrate that our method achieves results on Kinetics-400 and UCF101 comparable to leading methods, with fewer parameters and faster recognition speed.
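Since the abstract only sketches the architecture, the PyTorch snippet below is an illustrative interpretation rather than the authors' implementation: the specific kernel sizes, the dilated convolution used for the "non-local" branch, and the concatenation-based fusion head are all assumptions made for demonstration. It shows the two ideas named in the abstract, mixing a local and a non-local temporal branch, and replacing global average pooling with a fusion step that keeps the temporal axis so the output vector is higher-dimensional.

```python
# Illustrative sketch only: layer shapes and the dilation-based "non-local"
# branch are assumptions, not the paper's exact MTA design.
import torch
import torch.nn as nn


class MixedTemporalBlock(nn.Module):
    """Fuses a local temporal branch (small kernel) with a non-local branch
    (temporally dilated, asymmetric kernel) over clip features of shape
    (batch, channels, time, height, width)."""

    def __init__(self, channels: int):
        super().__init__()
        # Local branch: ordinary 3x1x1 temporal convolution.
        self.local = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                               padding=(1, 0, 0))
        # Non-local branch: dilated temporal convolution reaches a wider
        # temporal context with the same number of parameters (assumed).
        self.non_local = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                                   padding=(3, 0, 0), dilation=(3, 1, 1))
        # 1x1x1 convolution mixes the two branches back to `channels`.
        self.fuse = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([self.local(x), self.non_local(x)], dim=1))


class TemporalFeatureFusion(nn.Module):
    """Stand-in for the paper's fusion head: pool only spatially, then
    concatenate per-frame features into a higher-dimensional vector
    instead of averaging over time as global average pooling would."""

    def __init__(self):
        super().__init__()
        self.spatial_pool = nn.AdaptiveAvgPool3d((None, 1, 1))  # keep time axis

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = self.spatial_pool(x)        # (B, C, T, 1, 1)
        return pooled.flatten(start_dim=1)   # (B, C * T) instead of (B, C)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 8, 14, 14)    # toy clip features
    block = MixedTemporalBlock(64)
    head = TemporalFeatureFusion()
    print(head(block(feats)).shape)          # torch.Size([2, 512])
```

With this fusion head, the classifier sees a 512-dimensional vector (channels times frames) rather than the 64-dimensional vector that plain global average pooling would produce, which is the "retain more information" effect the abstract describes.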