基于 MP + CNN + BiLSTM 模型的开源混合模型，用于识别智能手机上的手语

IF 1.6 Q2 ENGINEERING, MULTIDISCIPLINARY International Journal of System Assurance Engineering and Management Pub Date : 2024-05-30 DOI:10.1007/s13198-024-02376-x

Hayder M. A. Ghanimi, Sudhakar Sengan, Vijaya Bhaskar Sadu, Parvinder Kaur, Manju Kaushik, Roobaea Alroobaea, Abdullah M. Baqasah, Majed Alsafyani, Pankaj Dadheech

{"title":"基于 MP + CNN + BiLSTM 模型的开源混合模型，用于识别智能手机上的手语","authors":"Hayder M. A. Ghanimi, Sudhakar Sengan, Vijaya Bhaskar Sadu, Parvinder Kaur, Manju Kaushik, Roobaea Alroobaea, Abdullah M. Baqasah, Majed Alsafyani, Pankaj Dadheech","doi":"10.1007/s13198-024-02376-x","DOIUrl":null,"url":null,"abstract":"<p>The communication barriers experienced by deaf and hard-of-hearing individuals often lead to social isolation and limited access to essential services, underlining a critical need for effective and accessible solutions. Recognizing the unique challenges this community faces—such as the scarcity of sign language interpreters, particularly in remote areas, and the lack of real-time translation tools. This paper proposes the development of a smartphone-runnable sign language recognition model to address the communication problems faced by deaf and hard-of-hearing persons. This proposed model combines Mediapipe hand tracking with particle filtering (PF) to accurately detect and track hand movements, and a convolutional neural network (CNN) and bidirectional long short-term memory based gesture recognition model to model the temporal dynamics of Sign Language gestures. These models use a small number of layers and filters, depthwise separable convolutions, and dropout layers to minimize the computational costs and prevent overfitting, making them suitable for smartphone implementation. This article discusses the existing challenges handled by the deaf and hard-of-hearing community and explains how the proposed model could help overcome these challenges. A MediaPipe + PF model performs feature extraction from the image and data preprocessing. During training, with fewer activation functions and parameters, this proposed model performed better to other CNN with RNN variant models (CNN + LSTM, CNN + GRU) used in the experiments of convergence speed and learning efficiency.</p>","PeriodicalId":14463,"journal":{"name":"International Journal of System Assurance Engineering and Management","volume":"130 1","pages":""},"PeriodicalIF":1.6000,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An open-source MP + CNN + BiLSTM model-based hybrid model for recognizing sign language on smartphones\",\"authors\":\"Hayder M. A. Ghanimi, Sudhakar Sengan, Vijaya Bhaskar Sadu, Parvinder Kaur, Manju Kaushik, Roobaea Alroobaea, Abdullah M. Baqasah, Majed Alsafyani, Pankaj Dadheech\",\"doi\":\"10.1007/s13198-024-02376-x\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The communication barriers experienced by deaf and hard-of-hearing individuals often lead to social isolation and limited access to essential services, underlining a critical need for effective and accessible solutions. Recognizing the unique challenges this community faces—such as the scarcity of sign language interpreters, particularly in remote areas, and the lack of real-time translation tools. This paper proposes the development of a smartphone-runnable sign language recognition model to address the communication problems faced by deaf and hard-of-hearing persons. This proposed model combines Mediapipe hand tracking with particle filtering (PF) to accurately detect and track hand movements, and a convolutional neural network (CNN) and bidirectional long short-term memory based gesture recognition model to model the temporal dynamics of Sign Language gestures. These models use a small number of layers and filters, depthwise separable convolutions, and dropout layers to minimize the computational costs and prevent overfitting, making them suitable for smartphone implementation. This article discusses the existing challenges handled by the deaf and hard-of-hearing community and explains how the proposed model could help overcome these challenges. A MediaPipe + PF model performs feature extraction from the image and data preprocessing. During training, with fewer activation functions and parameters, this proposed model performed better to other CNN with RNN variant models (CNN + LSTM, CNN + GRU) used in the experiments of convergence speed and learning efficiency.</p>\",\"PeriodicalId\":14463,\"journal\":{\"name\":\"International Journal of System Assurance Engineering and Management\",\"volume\":\"130 1\",\"pages\":\"\"},\"PeriodicalIF\":1.6000,\"publicationDate\":\"2024-05-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of System Assurance Engineering and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s13198-024-02376-x\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of System Assurance Engineering and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s13198-024-02376-x","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}

引用次数: 0

摘要

聋人和重听者所经历的交流障碍往往导致他们与社会隔绝，获得基本服务的机会有限，这凸显了对有效和无障碍解决方案的迫切需求。认识到这一群体面临的独特挑战，如手语翻译人员稀缺（尤其是在偏远地区）和缺乏实时翻译工具。本文提出开发一种可在智能手机上运行的手语识别模型，以解决聋人和重听者面临的交流问题。该模型将 Mediapipe 手部跟踪与粒子滤波（PF）相结合，以准确检测和跟踪手部动作，并采用基于卷积神经网络（CNN）和双向长短期记忆的手势识别模型来模拟手语手势的时间动态。这些模型使用了少量的层和滤波器、深度可分离卷积和剔除层，以最大限度地降低计算成本并防止过度拟合，从而使其适用于智能手机的实施。本文讨论了聋人和重听者群体所面临的现有挑战，并解释了所提出的模型可如何帮助克服这些挑战。MediaPipe + PF 模型从图像和数据预处理中进行特征提取。在训练过程中，由于使用了较少的激活函数和参数，该模型在收敛速度和学习效率方面的表现优于实验中使用的其他带有 RNN 变体的 CNN 模型（CNN + LSTM、CNN + GRU）。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

An open-source MP + CNN + BiLSTM model-based hybrid model for recognizing sign language on smartphones

The communication barriers experienced by deaf and hard-of-hearing individuals often lead to social isolation and limited access to essential services, underlining a critical need for effective and accessible solutions. Recognizing the unique challenges this community faces—such as the scarcity of sign language interpreters, particularly in remote areas, and the lack of real-time translation tools. This paper proposes the development of a smartphone-runnable sign language recognition model to address the communication problems faced by deaf and hard-of-hearing persons. This proposed model combines Mediapipe hand tracking with particle filtering (PF) to accurately detect and track hand movements, and a convolutional neural network (CNN) and bidirectional long short-term memory based gesture recognition model to model the temporal dynamics of Sign Language gestures. These models use a small number of layers and filters, depthwise separable convolutions, and dropout layers to minimize the computational costs and prevent overfitting, making them suitable for smartphone implementation. This article discusses the existing challenges handled by the deaf and hard-of-hearing community and explains how the proposed model could help overcome these challenges. A MediaPipe + PF model performs feature extraction from the image and data preprocessing. During training, with fewer activation functions and parameters, this proposed model performed better to other CNN with RNN variant models (CNN + LSTM, CNN + GRU) used in the experiments of convergence speed and learning efficiency.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of System Assurance Engineering and Management ENGINEERING, MULTIDISCIPLINARY-

CiteScore

4.30

自引率

10.00%

发文量

252

期刊介绍： This Journal is established with a view to cater to increased awareness for high quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to the emergent complex engineering problems. Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurancelevels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.