Skeleton Based Action Quality Assessment of Figure Skating Videos
Huiyong Li, Qing Lei, Hongbo Zhang, Jixiang Du
2021 11th International Conference on Information Technology in Medicine and Education (ITME), pp. 196-200, published 2021-11-01
DOI: 10.1109/ITME53901.2021.00048 (https://doi.org/10.1109/ITME53901.2021.00048)
Action quality assessment (AQA) aims to automatically evaluate how well human actions are performed in video. Compared with action recognition, AQA focuses more on subtle spatial and temporal differences throughout the execution of an action. However, most existing AQA methods extract features directly from RGB videos with 3D ConvNets, which mixes the features with irrelevant scene information. To overcome this problem, we propose a deep pose feature learning method for AQA that captures detailed and meaningful representations of skeleton information in order to discover subtle motion differences. We first apply a pose estimation method to obtain human skeleton data from RGB videos. A spatio-temporal graph convolutional network (ST-GCN) is then employed to capture the dynamic changes in the skeleton data and obtain representative pose features. Finally, a regressor composed of three fully connected layers reduces the dimension of the pose features and predicts the final score. Extensive experiments on the MIT figure skating dataset show that the proposed method outperforms current state-of-the-art methods.
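The pipeline described in the abstract (skeleton data, graph-based feature extraction, three-layer regressor) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The 5-joint skeleton, its edges, the layer sizes (16, 8, 4, 1), and the random untrained weights are all assumptions made for illustration. The full ST-GCN also applies temporal convolutions, which are replaced here by simple averaging over frames, and the random coordinates stand in for the output of a pose estimator.

```python
import math
import random

# Hypothetical 5-joint toy skeleton; joint 1 acts as the torso. Edges are assumptions.
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)] for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    return [[sum(xi * wj for xi, wj in zip(row, col)) for col in zip(*w)] for row in x]


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def random_matrix(rows, cols, rng):
    """Untrained weights drawn uniformly; a stand-in for learned parameters."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def pose_features(frames, adj, w_gcn):
    """Spatial graph convolution per frame, then average over joints and time.

    Averaging over frames is a simplification that replaces ST-GCN's
    temporal convolutions.
    """
    total = [0.0] * len(w_gcn[0])
    for joints in frames:  # joints: NUM_JOINTS x 2 coordinates
        h = relu(matmul(matmul(adj, joints), w_gcn))
        for row in h:
            for k, v in enumerate(row):
                total[k] += v
    count = len(frames) * len(adj)
    return [v / count for v in total]


def regress_score(features, weights):
    """Three fully connected layers: ReLU after the first two, linear output."""
    h = [features]
    for idx, w in enumerate(weights):
        h = matmul(h, w)
        if idx < len(weights) - 1:
            h = relu(h)
    return h[0][0]


rng = random.Random(0)
adj = normalized_adjacency(NUM_JOINTS, EDGES)
w_gcn = random_matrix(2, 16, rng)
fc_weights = [random_matrix(16, 8, rng), random_matrix(8, 4, rng), random_matrix(4, 1, rng)]

# Fake "video": 10 frames of 2D joint coordinates standing in for pose-estimation output.
frames = [[[rng.random(), rng.random()] for _ in range(NUM_JOINTS)] for _ in range(10)]
features = pose_features(frames, adj, w_gcn)
score = regress_score(features, fc_weights)
print(len(features), score)
```

Because the weights are random, the printed score is meaningless. The sketch only shows how the feature dimension shrinks from 16 to a single scalar through the three fully connected layers, and how the skeleton's adjacency matrix lets each joint's features mix with those of its neighbours.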