Implementation of hand-object pose estimation using SSD and YOLOV5 model for object grasping by SCARA robot

IF 4.2 · CAS Tier 2 (Computer Science) · JCR Q2 (Robotics) · Journal of Field Robotics · Pub Date: 2024-05-07 · DOI: 10.1002/rob.22358
Ramasamy Sivabalakrishnan, Angappamudaliar Palanisamy Senthil Kumar, Janaki Saminathan
Journal of Field Robotics, Volume 41, Issue 5, pp. 1558–1569.
Citations: 0

Abstract

Implementation of hand-object pose estimation using SSD and YOLOV5 model for object grasping by SCARA robot

Applying advanced deep learning methods to hand-object pose estimation is essential for grasping objects safely during human–robot collaborative tasks. Estimating the position and orientation of a hand-held object from a two-dimensional image remains a difficult problem under conditions such as occlusion, poor lighting, salient-region ambiguity, and image blur. The proposed method uses an enhanced MobileNetV3 with single-shot detection (SSD) and YOLOv5 to improve accuracy without compromising latency in detecting hand-object pose and orientation. To overcome the limitations of high computational cost, latency, and accuracy, Network Architecture Search and the NetAdapt algorithm are applied to MobileNetV3: they perform network search for parameter tuning and adaptive learning for multiscale feature extraction and anchor-box offset adjustment, driven by the automatic variance of weights at each layer. The squeeze-and-excitation block reduces the model's computation and latency. The hard-swish activation function and feature pyramid networks are used to prevent overfitting and to stabilize training. In a comparative analysis of the enhanced MobileNetV3 against YOLOv5, the obtained precision values are 92.8% and 89.7%, recall values 93.1% and 90.2%, and mAP values 93.3% and 89.2%, respectively. The proposed method enables more reliable robot grasping by estimating hand-object pose and orientation within tolerances of −1.9 to 2.15 mm along x, −1.55 to 2.21 mm along y, −0.833 to 1.51 mm along z, and −0.233° to 0.273° about the z-axis.
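The abstract names two MobileNetV3 building blocks: the hard-swish activation and the squeeze-and-excitation (SE) channel-gating block. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of those two standard operations, with the bottleneck weights (`w1`, `b1`, `w2`, `b2`) passed in as hypothetical parameters for clarity.

```python
import numpy as np

def hard_swish(x):
    """Hard-swish: x * ReLU6(x + 3) / 6, the piecewise-linear
    swish approximation used in MobileNetV3."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def squeeze_excite(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation over a (C, H, W) feature map:
    global-average-pool each channel ('squeeze'), pass the result
    through a small two-layer bottleneck ('excite'), and rescale
    each channel by the resulting gate in [0, 1]."""
    # squeeze: (C, H, W) -> (C,) per-channel statistics
    s = feature_map.mean(axis=(1, 2))
    # excite: ReLU bottleneck followed by a hard-sigmoid gate
    h = np.maximum(s @ w1 + b1, 0.0)
    gate = np.clip((h @ w2 + b2 + 3.0) / 6.0, 0.0, 1.0)
    # rescale: broadcast the per-channel gate over H and W
    return feature_map * gate[:, None, None]
```

The hard-sigmoid gate keeps the rescaling cheap (no exponentials), which is consistent with the abstract's claim that the SE block reduces computation and latency.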

Source journal: Journal of Field Robotics (Engineering & Technology — Robotics)
CiteScore: 15.00
Self-citation rate: 3.60%
Articles per year: 80
Review time: 6 months
Journal description: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.