{"title":"Development of a Lightweight Real-Time Application for Dynamic Hand Gesture Recognition","authors":"Oluwaleke Yusuf, MakiK . Habib","doi":"10.1109/ICMA57826.2023.10216066","DOIUrl":null,"url":null,"abstract":"Hand Gesture Recognition (HGR) is a form of perceptual computing with applications in human-machine interaction, virtual/augmented reality, and human behavior analysis. Within the HGR domain, several frameworks have been developed with different combinations of input modalities and network architectures with varying levels of efficacy. Such frameworks maximized performance at the expense of increased hardware and computational requirements. These drawbacks can be tackled by transforming the relatively complex dynamic hand gesture recognition task into a simpler image classification task. This paper presents a skeleton-based HGR framework that implements data-level fusion for encoding spatiotemporal information from dynamic gestures into static representational images. Said static images are subsequently processed by a custom, end-to-end trainable multi-stream CNN architecture for gesture classification. Our framework reduces the hardware and computational requirements of the HGR task while remaining competitive with the state-of-the-art on the CNR, FPHA, LMDHG, SHREC2017, and DHG142S benchmark datasets. We demonstrated the practical utility of our framework by creating a lightweight real-time application that makes use of skeleton data extracted from RGB video streams captured by a standard inbuilt PC webcam. 
The application operates successfully with minimal CPU and RAM footprint while achieving 93.46% classification accuracy with approximately 2s latency at 15 frames per second.","PeriodicalId":151364,"journal":{"name":"2023 IEEE International Conference on Mechatronics and Automation (ICMA)","volume":"109 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Mechatronics and Automation (ICMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMA57826.2023.10216066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Hand Gesture Recognition (HGR) is a form of perceptual computing with applications in human-machine interaction, virtual/augmented reality, and human behavior analysis. Within the HGR domain, several frameworks have been developed with different combinations of input modalities and network architectures, with varying levels of efficacy. Such frameworks maximized performance at the expense of increased hardware and computational requirements. These drawbacks can be tackled by transforming the relatively complex dynamic hand gesture recognition task into a simpler image classification task. This paper presents a skeleton-based HGR framework that implements data-level fusion for encoding spatiotemporal information from dynamic gestures into static representational images. These static images are subsequently processed by a custom, end-to-end trainable multi-stream CNN architecture for gesture classification. Our framework reduces the hardware and computational requirements of the HGR task while remaining competitive with the state of the art on the CNR, FPHA, LMDHG, SHREC2017, and DHG142S benchmark datasets. We demonstrated the practical utility of our framework by creating a lightweight real-time application that uses skeleton data extracted from RGB video streams captured by a standard built-in PC webcam. The application operates with minimal CPU and RAM footprint while achieving 93.46% classification accuracy with approximately 2 s latency at 15 frames per second.
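The core idea of data-level fusion described above is that a dynamic gesture, represented as a time series of skeleton joint coordinates, can be flattened into a single static image and handed to an ordinary image classifier. The sketch below illustrates one simple way such an encoding could work: resampling a variable-length sequence of hand-joint frames onto a fixed-size grid and normalizing it to a grayscale image. This is a generic, illustrative scheme under assumed shapes (T frames, 21 hand joints, 3D coordinates), not the paper's actual encoding, and the function name `encode_gesture_image` is hypothetical.

```python
import numpy as np

def encode_gesture_image(frames, out_size=(64, 64)):
    """Encode a skeleton sequence as one static image (illustrative only).

    frames: array of shape (T, J, 3) -- T frames, J joints, (x, y, z) coords.
    Returns a uint8 image of shape out_size: rows index resampled time,
    columns index resampled joint coordinates.
    """
    frames = np.asarray(frames, dtype=np.float64)
    T, J, C = frames.shape
    flat = frames.reshape(T, J * C)  # one row of joint coordinates per frame

    rows, cols = out_size
    # Resample the time axis to a fixed number of rows (linear interpolation),
    # so gestures of any duration map to the same image height.
    t_src = np.linspace(0.0, 1.0, T)
    t_dst = np.linspace(0.0, 1.0, rows)
    resampled = np.empty((rows, J * C))
    for k in range(J * C):
        resampled[:, k] = np.interp(t_dst, t_src, flat[:, k])

    # Resample the joint-coordinate axis to the target width the same way.
    f_src = np.linspace(0.0, 1.0, J * C)
    f_dst = np.linspace(0.0, 1.0, cols)
    image = np.empty((rows, cols))
    for r in range(rows):
        image[r] = np.interp(f_dst, f_src, resampled[r])

    # Min-max normalize so coordinate values map to the 0..255 gray range.
    image -= image.min()
    peak = image.max()
    if peak > 0:
        image /= peak
    return (image * 255).astype(np.uint8)

# Example: a synthetic 30-frame gesture with 21 hand joints.
rng = np.random.default_rng(0)
gesture = rng.random((30, 21, 3))
img = encode_gesture_image(gesture)
print(img.shape, img.dtype)  # (64, 64) uint8
```

Once gestures are encoded this way, classification reduces to standard image classification, which is what lets the hardware and computational requirements stay low: the temporal dimension is absorbed into the image at data level rather than handled by a recurrent or 3D-convolutional network at inference time.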