{"title":"基于特征增强的葡萄串点云三维语义分割","authors":"Jiangtao Luo, Dongbo Zhang, Tao Yi","doi":"10.1109/ROBIO58561.2023.10354793","DOIUrl":null,"url":null,"abstract":"As a representative bunch-type fruit,the collision-free and undamaged harvesting of grapes is of great significance. To obtain accurate 3D spatial semantic information,this paper proposes a method for multi-feature enhanced semantic segmentation model based on Mask R-CNN and PointNet++. Firstly, a depth camera is used to obtain RGBD images. The RGB images are then inputted into the Mask-RCNN network for fast detection of grape bunches. The color and depth information are fused and transformed into point cloud data, followed by the estimation of normal vectors. Finally, the nine-dimensional point cloud,which include spatial location, color information, and normal vectors, are inputted into the improved PointNet++ network to achieve semantic segmentation of grape bunches, peduncles, and leaves. This process obtains the extraction of spatial semantic information from the surrounding area of the bunches. The experimental results show that by incorporating normal vector and color features, the overall accuracy of point cloud segmentation increases to 93.7%, with a mean accuracy of 81.8%. This represents a significant improvement of 12.1% and 13.5% compared to using only positional features. The results demonstrate that the model method presented in this paper can effectively provide precise 3D semantic information to the robot while ensuring both speed and accuracy. This lays the groundwork for subsequent collision-free and damage-free picking.","PeriodicalId":505134,"journal":{"name":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"63 1","pages":"1-6"},"PeriodicalIF":0.0000,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"3D Semantic Segmentation for Grape Bunch Point Cloud Based on Feature Enhancement\",\"authors\":\"Jiangtao Luo, Dongbo Zhang, Tao Yi\",\"doi\":\"10.1109/ROBIO58561.2023.10354793\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a representative bunch-type fruit,the collision-free and undamaged harvesting of grapes is of great significance. To obtain accurate 3D spatial semantic information,this paper proposes a method for multi-feature enhanced semantic segmentation model based on Mask R-CNN and PointNet++. Firstly, a depth camera is used to obtain RGBD images. The RGB images are then inputted into the Mask-RCNN network for fast detection of grape bunches. The color and depth information are fused and transformed into point cloud data, followed by the estimation of normal vectors. Finally, the nine-dimensional point cloud,which include spatial location, color information, and normal vectors, are inputted into the improved PointNet++ network to achieve semantic segmentation of grape bunches, peduncles, and leaves. This process obtains the extraction of spatial semantic information from the surrounding area of the bunches. The experimental results show that by incorporating normal vector and color features, the overall accuracy of point cloud segmentation increases to 93.7%, with a mean accuracy of 81.8%. This represents a significant improvement of 12.1% and 13.5% compared to using only positional features. 
The results demonstrate that the model method presented in this paper can effectively provide precise 3D semantic information to the robot while ensuring both speed and accuracy. This lays the groundwork for subsequent collision-free and damage-free picking.\",\"PeriodicalId\":505134,\"journal\":{\"name\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"volume\":\"63 1\",\"pages\":\"1-6\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROBIO58561.2023.10354793\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO58561.2023.10354793","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
3D Semantic Segmentation for Grape Bunch Point Cloud Based on Feature Enhancement
Grapes are a representative bunch-type fruit, and harvesting them without collision or damage is of great significance. To obtain accurate 3D spatial semantic information, this paper proposes a multi-feature enhanced semantic segmentation method based on Mask R-CNN and PointNet++. First, a depth camera is used to acquire RGB-D images. The RGB images are fed into the Mask R-CNN network for fast detection of grape bunches. The color and depth information are then fused and transformed into point cloud data, after which normal vectors are estimated. Finally, the nine-dimensional point cloud, which includes spatial location, color information, and normal vectors, is input into the improved PointNet++ network to achieve semantic segmentation of grape bunches, peduncles, and leaves, thereby extracting spatial semantic information from the area surrounding the bunches. The experimental results show that incorporating normal vector and color features raises the overall accuracy of point cloud segmentation to 93.7% and the mean accuracy to 81.8%, improvements of 12.1% and 13.5%, respectively, over using only positional features. The results demonstrate that the method presented in this paper can provide precise 3D semantic information to the robot while ensuring both speed and accuracy, laying the groundwork for subsequent collision-free and damage-free picking.
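To illustrate the feature-enhancement step described in the abstract, the following minimal sketch shows one plausible way to fuse color and depth into a point cloud, estimate normals, and assemble the nine-dimensional (xyz + RGB + normal) per-point features that a PointNet++-style network could consume. It is not the authors' implementation: it uses Open3D and NumPy, and the file names, camera intrinsics, depth scale, and normal-estimation parameters are assumptions for illustration only.

```python
import numpy as np
import open3d as o3d

# Load an aligned RGB-D pair captured by the depth camera
# (file names and camera intrinsics are placeholders).
color = o3d.io.read_image("color.png")
depth = o3d.io.read_image("depth.png")
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    color, depth,
    depth_scale=1000.0,              # assumes depth stored in millimetres
    depth_trunc=2.0,                 # ignore points beyond 2 m
    convert_rgb_to_intensity=False,  # keep the three color channels
)

# Back-project to a colored point cloud using the camera intrinsics.
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault
)
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)

# Estimate per-point normal vectors from local neighborhoods and
# orient them consistently toward the camera origin.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30)
)
pcd.orient_normals_towards_camera_location(np.zeros(3))

# Assemble the nine-dimensional per-point features: xyz, rgb, normal.
xyz = np.asarray(pcd.points)    # (N, 3) spatial location
rgb = np.asarray(pcd.colors)    # (N, 3) color in [0, 1]
nrm = np.asarray(pcd.normals)   # (N, 3) unit normal vectors
features = np.concatenate([xyz, rgb, nrm], axis=1)  # (N, 9)
print(features.shape)
```

In the paper's pipeline, only the region returned by the Mask R-CNN detection would be fused and back-projected, so the resulting nine-dimensional point cloud covers the bunch and its immediate surroundings before being segmented into bunches, peduncles, and leaves by the improved PointNet++.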