SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning

Yujin Chen, Zhigang Tu, Liuhao Ge, Dejun Zhang, Ruizhi Chen, Junsong Yuan
{"title":"SO-HandNet: Self-Organizing Network for 3D Hand Pose Estimation With Semi-Supervised Learning","authors":"Yujin Chen, Zhigang Tu, Liuhao Ge, Dejun Zhang, Ruizhi Chen, Junsong Yuan","doi":"10.1109/ICCV.2019.00706","DOIUrl":null,"url":null,"abstract":"3D hand pose estimation has made significant progress recently, where Convolutional Neural Networks (CNNs) play a critical role. However, most of the existing CNN-based hand pose estimation methods depend much on the training set, while labeling 3D hand pose on training data is laborious and time-consuming. Inspired by the point cloud autoencoder presented in self-organizing network (SO-Net), our proposed SO-HandNet aims at making use of the unannotated data to obtain accurate 3D hand pose estimation in a semi-supervised manner. We exploit hand feature encoder (HFE) to extract multi-level features from hand point cloud and then fuse them to regress 3D hand pose by a hand pose estimator (HPE). We design a hand feature decoder (HFD) to recover the input point cloud from the encoded feature. Since the HFE and the HFD can be trained without 3D hand pose annotation, the proposed method is able to make the best of unannotated data during the training phase. Experiments on four challenging benchmark datasets validate that our proposed SO-HandNet can achieve superior performance for 3D hand pose estimation via semi-supervised learning.","PeriodicalId":6728,"journal":{"name":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","volume":"6 1","pages":"6960-6969"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"67","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/CVF International Conference on Computer Vision (ICCV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2019.00706","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 67

Abstract

3D hand pose estimation has made significant progress recently, with Convolutional Neural Networks (CNNs) playing a critical role. However, most existing CNN-based hand pose estimation methods depend heavily on labeled training data, and annotating 3D hand poses is laborious and time-consuming. Inspired by the point cloud autoencoder presented in the Self-Organizing Network (SO-Net), our proposed SO-HandNet aims to make use of unannotated data to obtain accurate 3D hand pose estimates in a semi-supervised manner. We use a hand feature encoder (HFE) to extract multi-level features from the hand point cloud and then fuse them to regress the 3D hand pose with a hand pose estimator (HPE). We also design a hand feature decoder (HFD) to recover the input point cloud from the encoded feature. Since the HFE and the HFD can be trained without 3D hand pose annotations, the proposed method can make full use of unannotated data during the training phase. Experiments on four challenging benchmark datasets validate that SO-HandNet achieves superior performance for 3D hand pose estimation via semi-supervised learning.
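The encoder/decoder/estimator split described above suggests a training loop in which a pose-regression loss is applied only to labeled samples, while a point cloud reconstruction loss supervises both labeled and unlabeled samples through the shared encoder. The sketch below illustrates that idea in PyTorch; it is not the authors' implementation, and the module internals (a simple PointNet-style encoder, an MLP decoder, a Chamfer reconstruction loss), the joint count, and all hyperparameters are placeholder assumptions.

```python
# Minimal sketch of the semi-supervised training idea described in the abstract.
# NOT the authors' implementation: module internals and hyperparameters are assumptions;
# only the overall structure (HFE -> HPE for pose, HFE -> HFD for reconstruction,
# unlabeled data supervised by reconstruction alone) follows the paper's description.
import torch
import torch.nn as nn

NUM_JOINTS = 21          # assumed hand joint count
FEAT_DIM = 256           # assumed global feature size

class HFE(nn.Module):
    """Hand Feature Encoder: point cloud (B, N, 3) -> global feature (B, FEAT_DIM)."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, FEAT_DIM))
    def forward(self, pts):
        return self.mlp(pts).max(dim=1).values   # max-pool over points

class HPE(nn.Module):
    """Hand Pose Estimator: feature -> 3D joint coordinates (B, NUM_JOINTS, 3)."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(FEAT_DIM, 256), nn.ReLU(),
                                nn.Linear(256, NUM_JOINTS * 3))
    def forward(self, feat):
        return self.fc(feat).view(-1, NUM_JOINTS, 3)

class HFD(nn.Module):
    """Hand Feature Decoder: feature -> reconstructed point cloud (B, M, 3)."""
    def __init__(self, num_out=512):
        super().__init__()
        self.num_out = num_out
        self.fc = nn.Sequential(nn.Linear(FEAT_DIM, 512), nn.ReLU(),
                                nn.Linear(512, num_out * 3))
    def forward(self, feat):
        return self.fc(feat).view(-1, self.num_out, 3)

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (B, N, 3) and b (B, M, 3)."""
    d = torch.cdist(a, b)                        # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

hfe, hpe, hfd = HFE(), HPE(), HFD()
params = list(hfe.parameters()) + list(hpe.parameters()) + list(hfd.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

def train_step(labeled_pts, labeled_pose, unlabeled_pts, w_rec=0.1):
    """One semi-supervised step: pose loss on labeled data, reconstruction on both."""
    optimizer.zero_grad()
    # Labeled branch: encode, regress pose, and also reconstruct the input cloud.
    feat_l = hfe(labeled_pts)
    pose_loss = nn.functional.mse_loss(hpe(feat_l), labeled_pose)
    rec_l = chamfer_distance(hfd(feat_l), labeled_pts)
    # Unlabeled branch: only the encoder/decoder receive a training signal.
    feat_u = hfe(unlabeled_pts)
    rec_u = chamfer_distance(hfd(feat_u), unlabeled_pts)
    loss = pose_loss + w_rec * (rec_l + rec_u)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random tensors standing in for depth-derived hand point clouds.
loss = train_step(torch.rand(4, 1024, 3), torch.rand(4, NUM_JOINTS, 3),
                  torch.rand(4, 1024, 3))
```

In this reading, the encoder and decoder receive gradients from every sample, while the pose estimator is updated only when ground-truth joints are available, which is how unannotated point clouds can still improve the shared feature representation.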