Kaustubh Mani Tripathi, P. Kamat, S. Patil, Ruchi Jayaswal, Swati Ahirrao, K. Kotecha
Applied System Innovation, published 2023-03-02. DOI: 10.3390/asi6020035 (https://doi.org/10.3390/asi6020035)
Gesture-to-Text Translation Using SURF for Indian Sign Language
This paper develops a gesture-to-text translation system using state-of-the-art computer vision techniques. Existing research on sign language translation has not fully exploited skin masking, edge detection, and feature extraction. This study therefore uses speeded-up robust features (SURF) for feature extraction, because SURF is robust to variations such as rotation, perspective scaling, and occlusion. The proposed system uses a bag of visual words (BoVW) model for gesture-to-text conversion. The dataset contains 42,000 photographs of letters (A–Z) and digits (1–9), divided into 35 classes with 1200 images per class. Pre-processing consists of skin masking, in which images are converted from the RGB color space to the HSV color space, followed by Canny edge detection. The SURF descriptors are then clustered into a visual vocabulary using mini-batch K-means. The system is evaluated with several classifiers: naïve Bayes, logistic regression, K-nearest neighbors, support vector machine, and a convolutional neural network. All classifiers benefited from SURF, with accuracy ranging from 79% to 92%. Beyond presenting the system itself, the study highlights the value of skin masking, edge detection, and feature extraction in sign language translation. The proposed system aims to bridge the communication gap between individuals who cannot speak and those who do not understand Indian Sign Language (ISL).
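The bag-of-visual-words encoding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that local descriptors have already been extracted and that a vocabulary of cluster centroids is already available; in the paper these would come from SURF and mini-batch K-means, whereas here both are hypothetical 2-D toy values chosen for readability. The function names `nearest_word` and `bovw_histogram` are illustrative only.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word (centroid) closest to a descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda i: math.dist(descriptor, vocabulary[i]),
    )


def bovw_histogram(descriptors, vocabulary):
    """Encode one image as an L1-normalized histogram over visual words."""
    if not descriptors:
        # An image with no detected keypoints maps to an all-zero vector.
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]


# Hypothetical toy data: 3 visual words and 4 descriptors from one image.
# Real SURF descriptors are 64- or 128-dimensional, not 2-D.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9), (0.1, 0.9)]

histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

Each image becomes a fixed-length vector regardless of how many keypoints were detected, which is what allows classifiers such as naïve Bayes or a support vector machine to be trained on the result.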