基于拉普拉斯金字塔深度运动图像的多尺度人体动作识别方法

Proceedings of the 2nd ACM International Conference on Multimedia in Asia Pub Date : 2021-03-07 DOI:10.1145/3444685.3446284

Chang Li, Qian Huang, Xing Li, Qianhan Wu

{"title":"基于拉普拉斯金字塔深度运动图像的多尺度人体动作识别方法","authors":"Chang Li, Qian Huang, Xing Li, Qianhan Wu","doi":"10.1145/3444685.3446284","DOIUrl":null,"url":null,"abstract":"Human action recognition is an active research area in computer vision. Aiming at the lack of spatial muti-scale information for human action recognition, we present a novel framework to recognize human actions from depth video sequences using multi-scale Laplacian pyramid depth motion images (LP-DMI). Each depth frame is projected onto three orthogonal Cartesian planes. Under three views, we generate depth motion images (DMI) and construct Laplacian pyramids as structured multi-scale feature maps which enhances multi-scale dynamic information of motions and reduces redundant static information in human bodies. We further extract the multi-granularity descriptor called LP-DMI-HOG to provide more discriminative features. Finally, we utilize extreme learning machine (ELM) for action classification. Through extensive experiments on the public MSRAction3D datasets, we prove that our method outperforms state-of-the-art benchmarks.","PeriodicalId":119278,"journal":{"name":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","volume":"80 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A multi-scale human action recognition method based on Laplacian pyramid depth motion images\",\"authors\":\"Chang Li, Qian Huang, Xing Li, Qianhan Wu\",\"doi\":\"10.1145/3444685.3446284\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human action recognition is an active research area in computer vision. Aiming at the lack of spatial muti-scale information for human action recognition, we present a novel framework to recognize human actions from depth video sequences using multi-scale Laplacian pyramid depth motion images (LP-DMI). Each depth frame is projected onto three orthogonal Cartesian planes. Under three views, we generate depth motion images (DMI) and construct Laplacian pyramids as structured multi-scale feature maps which enhances multi-scale dynamic information of motions and reduces redundant static information in human bodies. We further extract the multi-granularity descriptor called LP-DMI-HOG to provide more discriminative features. Finally, we utilize extreme learning machine (ELM) for action classification. Through extensive experiments on the public MSRAction3D datasets, we prove that our method outperforms state-of-the-art benchmarks.\",\"PeriodicalId\":119278,\"journal\":{\"name\":\"Proceedings of the 2nd ACM International Conference on Multimedia in Asia\",\"volume\":\"80 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd ACM International Conference on Multimedia in Asia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3444685.3446284\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3444685.3446284","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

人体动作识别是计算机视觉领域的一个活跃研究领域。针对人体动作识别缺乏空间多尺度信息的问题，提出了一种利用多尺度拉普拉斯金字塔深度运动图像(LP-DMI)从深度视频序列中识别人体动作的新框架。每个深度帧被投影到三个正交的笛卡尔平面上。在三种视图下，我们生成深度运动图像(DMI)，并将拉普拉斯金字塔构造为结构化的多尺度特征图，增强了运动的多尺度动态信息，减少了人体中冗余的静态信息。我们进一步提取了称为LP-DMI-HOG的多粒度描述符，以提供更多的判别特征。最后，我们利用极限学习机(ELM)进行动作分类。通过对公共MSRAction3D数据集的广泛实验，我们证明了我们的方法优于最先进的基准。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A multi-scale human action recognition method based on Laplacian pyramid depth motion images

Human action recognition is an active research area in computer vision. Aiming at the lack of spatial muti-scale information for human action recognition, we present a novel framework to recognize human actions from depth video sequences using multi-scale Laplacian pyramid depth motion images (LP-DMI). Each depth frame is projected onto three orthogonal Cartesian planes. Under three views, we generate depth motion images (DMI) and construct Laplacian pyramids as structured multi-scale feature maps which enhances multi-scale dynamic information of motions and reduces redundant static information in human bodies. We further extract the multi-granularity descriptor called LP-DMI-HOG to provide more discriminative features. Finally, we utilize extreme learning machine (ELM) for action classification. Through extensive experiments on the public MSRAction3D datasets, we prove that our method outperforms state-of-the-art benchmarks.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2nd ACM International Conference on Multimedia in Asia

自引率

0.00%

发文量

期刊最新文献

Storyboard relational model for group activity recognition Objective object segmentation visual quality evaluation based on pixel-level and region-level characteristics Multiplicative angular margin loss for text-based person search Distilling knowledge in causal inference for unbiased visual question answering A large-scale image retrieval system for everyday scenes