Egocentric hand pose estimation and distance recovery in a single RGB image
Hui Liang, Junsong Yuan, D. Thalmann
2015 IEEE International Conference on Multimedia and Expo (ICME), published 2015-08-06
DOI: 10.1109/ICME.2015.7177448 (https://doi.org/10.1109/ICME.2015.7177448)
Citations: 10
Abstract
Articulated hand pose recovery in egocentric vision is useful for in-air interaction with wearable devices such as Google Glass. Despite the progress achieved with depth cameras, the task remains challenging with ordinary RGB cameras. In this paper we demonstrate that both the articulated hand pose and its distance from the camera can be recovered with a single RGB camera in egocentric view. We address this problem by modeling the distance as a hidden variable and using a Conditional Regression Forest to infer the pose and distance jointly. In particular, we find that pose estimation accuracy can be further improved by incorporating hand part semantics. Experimental results show that the proposed method performs well on both a synthesized dataset and several real-world color image sequences captured in different environments. In addition, our system runs in real time at more than 10 fps.
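
The abstract does not spell out how the hidden distance variable enters the inference. One common way to read "conditional" regression is that a pose is predicted separately for each candidate distance and the predictions are then averaged under a posterior over distance. The sketch below illustrates only that marginalization step. It is not the authors' implementation: the function name `infer_pose_and_distance`, the discretization into distance bins, and all numbers are assumptions made for illustration, and the per-bin pose predictions stand in for the outputs of trained forests.

```python
from typing import List, Sequence, Tuple


def infer_pose_and_distance(
    pose_per_bin: Sequence[Sequence[float]],
    distance_posterior: Sequence[float],
    distance_bins: Sequence[float],
) -> Tuple[List[float], float]:
    """Combine per-distance pose predictions under a distance posterior.

    pose_per_bin[k]       -- pose predicted assuming the hand is at distance_bins[k]
                             (hypothetical stand-in for a conditional regressor)
    distance_posterior[k] -- unnormalized weight for distance_bins[k] given the image
    """
    if not (len(pose_per_bin) == len(distance_posterior) == len(distance_bins)):
        raise ValueError("need one pose prediction and one weight per distance bin")
    total = sum(distance_posterior)
    if total <= 0:
        raise ValueError("distance posterior must have positive mass")
    weights = [w / total for w in distance_posterior]

    # Expected pose: sum over bins of p(d | image) * E[pose | image, d].
    dims = len(pose_per_bin[0])
    pose = [0.0] * dims
    for w, bin_pose in zip(weights, pose_per_bin):
        for i in range(dims):
            pose[i] += w * bin_pose[i]

    # Expected distance under the same posterior.
    distance = sum(w * d for w, d in zip(weights, distance_bins))
    return pose, distance


# Hypothetical example: two distance bins, a two-parameter pose.
pose, distance = infer_pose_and_distance(
    pose_per_bin=[[10.0, 20.0], [30.0, 40.0]],
    distance_posterior=[1.0, 3.0],
    distance_bins=[200.0, 400.0],
)
print(pose, distance)
```

Taking the expectation over distance is one design choice; picking the single most probable bin and using only its pose prediction would be an equally plausible alternative under the same assumptions.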