Challenges and solutions for vision-based hand gesture interpretation: A review

Impact Factor: 3.5; Zone 3 (Computer Science); JCR Q2 (Computer Science, Artificial Intelligence); Journal: Computer Vision and Image Understanding; Pub Date: 2024-11-01; Epub Date: 2024-07-25; DOI: 10.1016/j.cviu.2024.104095

Kun Gao, Haoyang Zhang, Xiaolong Liu, Xinyi Wang, Liang Xie, Bowen Ji, Ye Yan, Erwei Yin

Computer Vision and Image Understanding, Volume 248, Article 104095. Journal Article. https://www.sciencedirect.com/science/article/pii/S1077314224001760

Abstract

Hand gestures are among the most efficient and natural interfaces in current human–computer interaction (HCI) systems. Despite the great progress achieved in hand gesture-based HCI, perceiving or tracking hand pose from images remains challenging. Over the past decade, several challenges have been identified and explored, such as incomplete data, the need for large-scale annotated datasets, and 3D hand pose estimation from monocular RGB images; however, no survey yet provides a comprehensive collection and analysis of these challenges and their corresponding solutions. To this end, this paper addresses the general challenges of hand gesture interpretation techniques in HCI systems based on visual sensors and elaborates on the corresponding solutions in current state-of-the-art methods, providing a systematic reference for practical problems in hand gesture interpretation. Moreover, this paper gives informative cues about recent datasets to point out the inherent differences and connections among them, such as the annotation of objects and the number of hands, which are important for conducting research yet have been ignored by previous reviews. Looking back on recent developments, this paper also conjectures what future work will concentrate on, from the perspectives of both hand gesture interpretation and dataset construction.

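Source journal

Computer Vision and Image Understanding (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 7.80

Self-citation rate: 4.40%

Articles published: 112

Review time: 79 days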

Journal description: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.

Research areas include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems