Challenges and solutions for vision-based hand gesture interpretation: A review
Journal: Computer Vision and Image Understanding
Publication date: 2024-07-25
DOI: 10.1016/j.cviu.2024.104095
URL: https://www.sciencedirect.com/science/article/pii/S1077314224001760
Impact factor: 4.3 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Hand gestures are among the most efficient and natural interfaces in current human–computer interaction (HCI) systems. Despite the great progress achieved in hand gesture-based HCI, perceiving or tracking the hand pose from images remains challenging. Over the past decade, several challenges have been identified and explored, such as incomplete data, the need for large-scale annotated datasets, and 3D hand pose estimation from a monocular RGB image; however, surveys that comprehensively collect and analyze these challenges and their corresponding solutions are lacking. To this end, this paper focuses on the general challenges of hand gesture interpretation techniques in HCI systems based on visual sensors and elaborates on the corresponding solutions in the current state of the art, offering a systematic reference for practical problems in hand gesture interpretation. Moreover, this paper provides informative cues about recent datasets to point out the inherent differences and connections among them, such as the annotation of objects and the number of hands, which are important for conducting research yet have been ignored by previous reviews. Looking back on recent developments, this paper also conjectures what future work will concentrate on, from the perspectives of both hand gesture interpretation and dataset construction.
About the journal:
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.
Research Areas Include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems