An Effective Sign Language Learning with Object Detection Based ROI Segmentation
Sunmok Kim, Y. Ji, Ki-Baek Lee
2018 Second IEEE International Conference on Robotic Computing (IRC)
DOI: 10.1109/IRC.2018.00069
Citations: 21
Abstract
This paper proposes a novel sign language learning method that applies region of interest (ROI) segmentation to the input data as a preprocessing step, using an object detection network. As input, 2D image frames are sampled and concatenated into a single wide image. From this image, the ROI is segmented by detecting and extracting the hand regions, which carry the crucial information in sign language. Hand detection is implemented with the well-known object detection network You Only Look Once (YOLO), and sign language learning is implemented with a convolutional neural network (CNN). Twelve sign gestures are tested with a 2D camera. The results show that, compared to the same method without ROI segmentation, accuracy increases by 12 percentage points (from 86% to 98%) and training time is reduced by more than 50%. Moreover, thanks to the pretrained hand features, new sign gestures can be added to the learning set with little effort.
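The preprocessing pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes hand bounding boxes (e.g., produced by a YOLO detector) are already available for each sampled frame, and the function names, ROI size, and `(x1, y1, x2, y2)` box format are all assumptions for the sketch.

```python
import numpy as np

def crop_hand_roi(frame, box, size=(64, 64)):
    """Crop the detected hand area and resize it by nearest-neighbour sampling."""
    x1, y1, x2, y2 = box
    roi = frame[y1:y2, x1:x2]
    # Nearest-neighbour resize without external dependencies.
    rows = np.linspace(0, roi.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, roi.shape[1] - 1, size[1]).astype(int)
    return roi[rows][:, cols]

def make_wide_image(frames, boxes, size=(64, 64)):
    """Concatenate per-frame hand ROIs horizontally into one wide image,
    which would then be fed to the CNN classifier."""
    rois = [crop_hand_roi(f, b, size) for f, b in zip(frames, boxes)]
    return np.concatenate(rois, axis=1)

# Toy example: 4 sampled frames, each with a hypothetical hand box.
frames = [np.random.rand(120, 160) for _ in range(4)]
boxes = [(40, 30, 104, 94)] * 4  # (x1, y1, x2, y2)
wide = make_wide_image(frames, boxes)
print(wide.shape)  # (64, 256): four 64x64 ROIs side by side
```

Because only the hand regions survive the cropping, the downstream CNN sees a smaller, more informative input, which is consistent with the reported accuracy gain and shorter training time.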