A Robotic Arm Visual Grasp Detection Algorithm Combining 2D Images and 3D Point Clouds

Applied Mechanics and Materials. Pub Date: 2024-02-05. DOI: 10.4028/p-vndon1

Nan Mu Hui, Xiao Hui Wu, Xiao Wei Han, Bao Ju Wu


Abstract

Robotic grasp detection methods fall into two-dimensional (2D) and three-dimensional (3D) approaches. 2D grasp detection predicts the gripper pose directly on red-green-blue (RGB) images, which limits the grasp direction. 3D grasp detection predicts the gripper pose from 3D point clouds, which allows more flexible grasps, but the data volume of point clouds hampers real-time detection. To address this, this paper proposes a novel grasp detection algorithm that combines 2D images and 3D point clouds. First, a Single Shot MultiBox Detector (SSD) network generates 2D prediction boxes on RGB images; an improved prior box scaling makes the boxes bound the target object more accurately. Next, the 2D boxes are transformed into 3D frustums, and extraneous points are removed. Random Sample Consensus (RANSAC) segmentation and the Euclidean Clustering Segmentation Algorithm (ECSA) then isolate the target point cloud, and an Oriented Bounding Box (OBB) represents the spatial pose of the target. The processed point cloud is passed to an enhanced PointNet Grasp Pose Detection (PointNetGPD) algorithm. Whereas the original approach randomly samples a large number of grasp candidates, the enhanced PointNetGPD samples candidates selectively by incorporating pose constraints between the target and the gripper. An evaluation network then scores the generated grasp candidates, and the robotic arm is guided to perform the highest-scoring grasp. In experiments, the proposed algorithm achieved a high success rate in grasping multiple targets, along with a reduced grasping time. These results indicate higher grasp quality and better real-time performance than similar algorithms.
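The frustum filtering and plane removal steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no code, so the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the distance threshold, the iteration count, and the function names are all assumptions chosen for illustration. The sketch keeps only the points that project inside a 2D detection box, then uses RANSAC to find and discard the dominant plane (for example a tabletop). Euclidean clustering, OBB fitting, and grasp sampling are not covered.

```python
import random


def frustum_crop(points, box, fx, fy, cx, cy):
    """Keep camera-frame points (x, y, z) whose pinhole projection lies in the 2D box.

    box is (u_min, v_min, u_max, v_max) in pixels; intrinsics are assumed known.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # behind the camera, cannot be inside the frustum
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


def fit_plane(p, q, r):
    """Return (unit normal, d) with n.x + d = 0, or None if the points are collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def remove_dominant_plane(points, threshold=0.01, iterations=200, seed=0):
    """RANSAC: find the plane with the most inliers and return the remaining points."""
    if len(points) < 3:
        return list(points)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = set()
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = {
            i for i, (x, y, z) in enumerate(points)
            if abs(n[0] * x + n[1] * y + n[2] * z + d) < threshold
        }
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return [pt for i, pt in enumerate(points) if i not in best_inliers]
```

In a full pipeline, the points returned by `remove_dominant_plane` would then be clustered and fitted with an oriented bounding box; a point cloud library would normally replace these pure-Python loops for speed.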


