4DCov: A Nested Covariance Descriptor of Spatio-Temporal Features for Gesture Recognition in Depth Sequences

2014 2nd International Conference on 3D Vision. Publication date: 2014-12-08. DOI: 10.1109/3DV.2014.10

Pol Cirujeda, Xavier Binefa


Citations: 28

Abstract

In this paper we propose a novel covariance-based framework for the robust characterization and classification of human gestures in 3D depth sequences. The proposed 4DCov descriptor uses the notion of covariance to create compact representations of the complex interactions between variations of 3D features in the spatial and temporal domains, instead of using the absolute features themselves. Despite its compactness, this representation retains discriminative power for human gesture classification. Because it encodes feature variations across a scene, the descriptor is more robust to inter-subject and intra-class variations, periodic motions, and different gesture execution speeds than other keypoint-based or histogram-based descriptors. Furthermore, a sparse collaborative classification method is presented; it takes advantage of the descriptor lying on a specific manifold topology and of the observation that similar motions are geometrically clustered in the descriptor space. Classification accuracy is compared against state-of-the-art approaches on four public human gesture datasets acquired with 3D depth sensors, including complex gestures of different natures.
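The abstract does not spell out the descriptor's construction, so the following is only a minimal sketch of the general idea of a covariance descriptor, not the authors' 4DCov implementation. It assumes each sampled point carries a hypothetical 4-dimensional feature vector (x, y, z, t), summarizes the whole set by its sample covariance, and compares two descriptors with a log-Euclidean distance, a common choice for symmetric positive-definite matrices. The function names, the `eps` regularizer, the feature choice, and the metric are all assumptions made for illustration.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-6):
    """Summarize an (n_samples, d) feature array as a (d, d) covariance matrix.

    eps (an assumption of this sketch) adds a small ridge so the result is
    strictly positive definite and its matrix logarithm is defined.
    """
    features = np.asarray(features, dtype=float)
    cov = np.cov(features, rowvar=False)  # columns are feature dimensions
    return cov + eps * np.eye(cov.shape[0])


def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite matrix via eigh."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_distance(a, b):
    """Frobenius norm of the difference of the matrix logarithms."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")


# Synthetic stand-in data: 500 points with hypothetical (x, y, z, t) features.
rng = np.random.default_rng(0)
slow = rng.normal(size=(500, 4))
fast = slow[::2]                                # same samples, half as many
other = slow * np.array([1.0, 3.0, 0.5, 1.0])   # different spatial spread

c_slow = covariance_descriptor(slow)
c_fast = covariance_descriptor(fast)
c_other = covariance_descriptor(other)

d_same = log_euclidean_distance(c_slow, c_fast)
d_diff = log_euclidean_distance(c_slow, c_other)
print(f"descriptor shape: {c_slow.shape}")
print(f"subsampled vs. original: {d_same:.3f}")
print(f"rescaled vs. original:   {d_diff:.3f}")
```

The descriptor size depends only on the number of feature dimensions, not on the number of sampled points, which is one way to read the abstract's emphasis on compactness. In this toy setup, dropping half of the samples changes the descriptor far less than rescaling two of the feature axes does.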


