深度图动作识别的组稀疏性和几何约束字典学习

2013 IEEE International Conference on Computer Vision Pub Date : 2013-12-01 DOI:10.1109/ICCV.2013.227

Jiajia Luo, Wei Wang, H. Qi

{"title":"深度图动作识别的组稀疏性和几何约束字典学习","authors":"Jiajia Luo, Wei Wang, H. Qi","doi":"10.1109/ICCV.2013.227","DOIUrl":null,"url":null,"abstract":"Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions, may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. Especially, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state of the art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"25 1","pages":"1809-1816"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"215","resultStr":"{\"title\":\"Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps\",\"authors\":\"Jiajia Luo, Wei Wang, H. Qi\",\"doi\":\"10.1109/ICCV.2013.227\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions, may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. Especially, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state of the art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.\",\"PeriodicalId\":6351,\"journal\":{\"name\":\"2013 IEEE International Conference on Computer Vision\",\"volume\":\"25 1\",\"pages\":\"1809-1816\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"215\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE International Conference on Computer Vision\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCV.2013.227\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE International Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2013.227","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 215

摘要

基于商品深度传感器提供的深度信息进行人体动作识别是一项重要而又具有挑战性的任务。嘈杂的深度图、动作序列的不同长度以及执行动作的自由风格，可能会导致很大的类内变化。提出了一种基于稀疏编码和时间金字塔匹配(TPM)的基于深度的人体动作识别框架。特别提出了一种针对稀疏编码的判别分类字典学习算法。通过加入群稀疏性和几何约束，可以很好地利用属于同一类的子字典重构特征，并在计算系数中保持特征之间的几何关系。在深度相机捕获的两个基准数据集上对该方法进行了评估。实验结果表明，该算法多次取得了优于现有算法的性能。此外，所提出的字典学习方法也优于经典的字典学习方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps

Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions, may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. Especially, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state of the art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE International Conference on Computer Vision

自引率

0.00%

发文量

期刊最新文献

PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects A General Dense Image Matching Framework Combining Direct and Feature-Based Costs Latent Space Sparse Subspace Clustering Non-convex P-Norm Projection for Robust Sparsity Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition