DeepHand: Robust Hand Pose Estimation by Completing a Matrix Imputed with Deep Features

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2016-06-27. DOI: 10.1109/CVPR.2016.450

Ayan Sinha, Chiho Choi, K. Ramani

Citations: 151

Abstract

We propose DeepHand to estimate the 3D pose of a hand using depth data from commercial 3D sensors. We discriminatively train convolutional neural networks to output a low-dimensional activation feature given a depth map. This activation feature vector is representative of the global or local joint angle parameters of a hand pose. We efficiently identify 'spatial' nearest neighbors to the activation feature, from a database of features corresponding to synthetic depth maps, and store some 'temporal' neighbors from previous frames. Our matrix completion algorithm uses these 'spatio-temporal' activation features and the corresponding known pose parameter values to estimate the unknown pose parameters of the input feature vector. Our database of activation features supplements large viewpoint coverage and our hierarchical estimation of pose parameters is robust to occlusions. We show that our approach compares favorably to state-of-the-art methods while achieving real-time performance (≈ 32 FPS) on a standard computer.
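The retrieval and completion steps described in the abstract can be illustrated with a small sketch. The abstract does not specify the matrix completion algorithm, so the code below swaps in a simple closed-form low-rank completion based on a truncated SVD; it is a stand-in under that assumption, not the authors' method. The function names (`nearest_neighbors`, `complete_pose`), the `rank` parameter, and the brute-force Euclidean search are all hypothetical choices made for illustration.

```python
import numpy as np


def nearest_neighbors(query, db_features, k):
    """Return indices of the k database features closest to `query`.

    Brute-force Euclidean search; an assumption for illustration only.
    """
    dists = np.linalg.norm(db_features - query, axis=1)
    return np.argsort(dists)[:k]


def complete_pose(query, nbr_features, nbr_poses, rank=3):
    """Estimate the unknown pose parameters of `query`.

    Conceptually, the matrix being completed has one row per neighbor plus
    one row for the query; columns hold features followed by pose
    parameters, and only the query's pose block is missing. Assuming the
    matrix is approximately rank-`rank`, the query row must lie in the row
    space spanned by the neighbors' leading right singular vectors.
    """
    d = nbr_features.shape[1]
    stacked = np.hstack([nbr_features, nbr_poses])      # shape (k, d + p)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    basis = vt[:rank]                                   # shape (rank, d + p)
    # Fit the query's known feature entries in the low-rank basis ...
    coeffs, *_ = np.linalg.lstsq(basis[:, :d].T, query, rcond=None)
    # ... and read the missing pose entries off the same coefficients.
    return coeffs @ basis[:, d:]
```

In this sketch, 'temporal' neighbors from previous frames would simply be appended as extra rows of `nbr_features` and `nbr_poses` before calling `complete_pose`. When features and pose parameters share an exactly low-rank linear structure, the estimate recovers the true pose; on real activation features the result would only be approximate.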



