Action Recognition in Video Using Human Keypoint Detection

2020 15th International Conference on Computer Science & Education (ICCSE). Pub Date: 2020-08-01. DOI: 10.1109/ICCSE49874.2020.9201857

Luona Song, Xin Guo, Yiqi Fan


Citations: 1


Action Recognition in Video Using Human Keypoint Detection

With the spread of the internet and the growing number of video devices, recognizing and segmenting actions in video has become a research focus with high application value. Unlike images, video carries more complex information and adds time as a new dimension. This paper proposes a model for video action recognition and segmentation built on human keypoint detection. The main contributions are as follows: 1) drawing on speech signal processing methods, the paper designs a three-step analysis framework for video actions, in which the first step obtains data from the human body keypoint frames, the second applies the action segmentation model, and the third visualizes the model results; 2) the dynamic time warping algorithm is used and improved with respect to computational cost and constraint conditions; 3) a distance function is designed to measure the similarity between time series, in which four kinds of features are introduced and the final distance is the weighted sum of the four; 4) a non-maximum suppression method is designed to filter overlapping segments and obtain the final results. Experiments verify the validity of the proposed model and illustrate the importance of the proposed features.
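The abstract does not give implementation details, so the following is only a minimal sketch of how contributions 2) to 4) might fit together, under stated assumptions. The four feature kinds are not named in the abstract, so each frame is represented here as four placeholder feature vectors. The band constraint shown is the standard Sakoe-Chiba window, a common way to cut dynamic time warping cost and constrain the warping path; it is not claimed to be the authors' specific improvement. The overlap filter is a generic one-dimensional, IoU-based non-maximum suppression. All function and parameter names are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted sum of per-feature Euclidean distances between two frames.

    a and b are sequences of feature vectors, one vector per feature kind.
    The feature kinds themselves are placeholders (assumption).
    """
    return sum(w * math.dist(fa, fb) for w, fa, fb in zip(weights, a, b))


def dtw_banded(x, y, dist, window):
    """DTW cost between sequences x and y inside a Sakoe-Chiba band.

    Only cells with |i - j| <= window are filled, which lowers the cost
    from O(n * m) toward O(n * window) and constrains the warping path.
    """
    n, m = len(x), len(y)
    # The band must be at least |n - m| wide, or the end cell is unreachable.
    window = max(window, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in x only
                                 cost[i][j - 1],      # step in y only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def temporal_nms(segments, iou_threshold):
    """Greedy 1-D non-maximum suppression over candidate segments.

    segments: list of (start, end, score) tuples; a higher score is better.
    A segment is dropped if its temporal IoU with an already kept segment
    exceeds iou_threshold.
    """
    kept = []
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        start, end, _ = seg
        overlaps = False
        for k_start, k_end, _ in kept:
            inter = max(0, min(end, k_end) - max(start, k_start))
            union = (end - start) + (k_end - k_start) - inter
            if union > 0 and inter / union > iou_threshold:
                overlaps = True
                break
        if not overlaps:
            kept.append(seg)
    return kept
```

In a pipeline of this shape, `weighted_distance` (with four chosen weights bound in advance) would be passed as `dist` to `dtw_banded` to score candidate windows against a template action, and `temporal_nms` would then prune overlapping candidates. If lower DTW cost means a better match, the cost would need to be negated or otherwise converted into a score first.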
