Deep Learning Based Machine Vision: First Steps Towards a Hand Gesture Recognition Set Up for Collaborative Robots

Cristina Nuzzi, S. Pasinetti, M. Lancini, F. Docchio, G. Sansoni
DOI: 10.1109/METROI4.2018.8439044
Published in: 2018 Workshop on Metrology for Industry 4.0 and IoT
Publication date: 2018-04-16
Citations: 17

Abstract

In this paper, we present a smart hand gesture recognition experimental setup for collaborative robots that uses a Faster R-CNN object detector to locate the hands accurately in RGB images taken from a Kinect v2 camera. We used MATLAB to code the detector and a purposely designed function for the prediction phase, which is necessary for detecting static gestures as we have defined them. We performed a number of experiments with different datasets to evaluate the performance of the model in different situations: a basic hand gesture dataset with four gestures performed by combinations of both hands; a dataset where the actors wear skin-colored clothes while performing the gestures; a dataset where the actors wear light-blue gloves; and a dataset similar to the first one but with the camera placed close to the operator. The same tests were also conducted with the operator's face detected by the algorithm, in order to improve the prediction accuracy. Our experiments show that the best model accuracy and F1-score are achieved by the complete model without face detection. We tested the model in real time, achieving good performance that can enable real-time human-robot interaction, with an inference time of around 0.2 seconds.
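The abstract describes a two-stage pipeline: a Faster R-CNN detector returns per-hand bounding boxes with class labels, and a prediction function then combines the two single-hand poses into one static two-hand gesture. The paper's own code is in MATLAB and is not reproduced here; the following is a minimal Python sketch of how such a combination step could work, with hypothetical pose names (`open`, `fist`), a hypothetical gesture lookup table, and an assumed detection format — none of these come from the paper.

```python
# Hypothetical sketch of the "prediction phase" described in the abstract:
# the detector yields per-hand detections, and a rule maps the pair of
# single-hand poses (ordered left to right in the image) to one gesture.
# Labels, thresholds, and the lookup table are illustrative only.

def predict_gesture(detections, score_threshold=0.5):
    """Combine per-hand detections into a single two-hand gesture label.

    detections: list of dicts {"label": str, "score": float, "box": (x, y, w, h)}
    Returns the combined gesture name, or None if fewer than two
    confident hand detections are available.
    """
    hands = [d for d in detections if d["score"] >= score_threshold]
    if len(hands) < 2:
        return None
    # Order the two strongest detections left-to-right by the box's x coordinate.
    hands.sort(key=lambda d: d["score"], reverse=True)
    left, right = sorted(hands[:2], key=lambda d: d["box"][0])
    # Illustrative table: (left-hand pose, right-hand pose) -> gesture.
    gesture_table = {
        ("open", "open"): "stop",
        ("fist", "fist"): "go",
        ("open", "fist"): "slower",
        ("fist", "open"): "faster",
    }
    return gesture_table.get((left["label"], right["label"]))

# Example: left hand open, right hand closed -> "slower" in this table.
dets = [
    {"label": "fist", "score": 0.91, "box": (300, 120, 80, 80)},
    {"label": "open", "score": 0.88, "box": (100, 118, 80, 80)},
]
print(predict_gesture(dets))  # slower
```

Keeping the combination logic outside the detector, as the authors appear to do, means the network only has to learn a small vocabulary of single-hand poses while the gesture set can be extended by editing the lookup table.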