A Robotic Arm Visual Grasp Detection Algorithm Combining 2D Images and 3D Point Clouds
Authors: Nan Mu Hui, Xiao Hui Wu, Xiao Wei Han, Bao Ju Wu
Journal: Applied Mechanics and Materials
Publication date: 2024-02-05
Publication type: Journal Article
DOI: 10.4028/p-vndon1 (https://doi.org/10.4028/p-vndon1)
Citations: 0
Abstract
Robot grasp detection methods fall into two-dimensional (2D) and three-dimensional (3D) approaches. In 2D grasp detection, the gripper pose is predicted directly on red-green-blue (RGB) images, which limits the grasp direction. In contrast, 3D grasp detection predicts the gripper pose from 3D point clouds, allowing greater grasp flexibility. However, the data volume of 3D point clouds hampers real-time detection. To address this, the paper proposes a novel grasp detection algorithm that combines 2D images and 3D point clouds. First, a Single Shot MultiBox Detector (SSD) network generates 2D prediction boxes on RGB images; enhancements to the prior box scaling make these boxes bound the target object more accurately. Next, the 2D boxes are transformed into 3D frustums, and extraneous data points are removed. Random Sample Consensus (RANSAC) segmentation and the Euclidean Clustering Segmentation Algorithm (ECSA) isolate the target point cloud, and the spatial pose of the target is then represented by an Oriented Bounding Box (OBB). The processed point cloud enters an enhanced PointNet Grasp Pose Detection (PointNetGPD) algorithm. Whereas the original approach relies on extensive random sampling of grasp candidates, the enhanced PointNetGPD samples grasp candidates selectively by incorporating pose constraints between the target and the gripper. An assessment network then scores the generated grasp candidates, and the robotic arm is guided to perform the grasp with the highest score. In experiments, the proposed algorithm achieved a high success rate in grasping multiple targets, along with a reduced grasping time. These results indicate superior grasp quality and better real-time performance than similar algorithms.
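The frustum and clustering steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), points expressed in the camera frame, and a brute-force neighbour search instead of the k-d tree a real pipeline would likely use. The function names `crop_to_frustum` and `euclidean_clusters` are hypothetical, and the RANSAC plane-removal step that the paper applies before clustering is omitted for brevity.

```python
from collections import deque
from math import dist


def crop_to_frustum(points, box, fx, fy, cx, cy):
    """Keep the 3D points whose pinhole projection lies inside a 2D box.

    points: iterable of (x, y, z) in the camera frame, z pointing forward.
    box: (u_min, v_min, u_max, v_max) in pixels, e.g. a detector output.
    fx, fy, cx, cy: assumed pinhole intrinsics from camera calibration.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # behind the camera, so it cannot lie in the frustum
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


def euclidean_clusters(points, tolerance, min_size=1):
    """Group points connected by neighbour chains of length <= tolerance."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if dist(points[i], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size:
            clusters.append([points[k] for k in sorted(cluster)])
    # Largest cluster first, so clusters[0] is the most likely target.
    clusters.sort(key=len, reverse=True)
    return clusters


# Toy cloud (metres): three target points, one stray point that still
# projects into the box, one point outside it, one behind the camera.
cloud = [
    (0.00, 0.00, 1.00), (0.01, 0.00, 1.00), (0.00, 0.01, 1.01),
    (0.05, 0.05, 1.50),
    (0.80, 0.00, 1.00),
    (0.00, 0.00, -1.00),
]
in_frustum = crop_to_frustum(cloud, box=(300, 220, 340, 260),
                             fx=500, fy=500, cx=320, cy=240)
target = euclidean_clusters(in_frustum, tolerance=0.05)[0]
```

In this toy example the frustum crop discards the two points that cannot belong to the detected object, and clustering then separates the stray point at a different depth from the three target points. Cropping first is what keeps the later, more expensive 3D stages small, which is the motivation the abstract gives for combining 2D detection with point-cloud processing.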