{"title":"RGB and LiDAR Fusion-based 3D Semantic Segmentation for Autonomous Driving","authors":"Jianguo Liu, Zhiling Jia, Gongbo Li, Fuwu Yan, Youhua Wu, Yunfei Sun","doi":"10.1088/1742-6596/2632/1/012034","DOIUrl":null,"url":null,"abstract":"Abstract Projection-based multimodal 3D semantic segmentation methods suffer from information loss during the point cloud projection process. This issue becomes more prominent for small objects. Moreover, the alignment of sparse target features with the corresponding object features in the camera image during the fusion process is inaccurate, leading to low segmentation accuracy for small objects. Therefore, we propose an attention-based multimodal feature alignment and fusion network module. This module aggregates features in spatial directions and generates attention matrices. Through this transformation, the module could capture remote dependencies of features in one spatial direction. This helps our network precisely locate objects and establish relationships between similar features. It enables the adaptive alignment of sparse target features with the corresponding object features in the camera image, resulting in a better fusion of the two modalities. We validate our method on the nuScenes-lidar seg dataset. Our CAFNet achieves an improvement in segmentation accuracy for small objects with fewer points compared to the baseline network, such as bicycles (6% improvement), pedestrians (2.1% improvement), and traffic cones (0.9% improvement).","PeriodicalId":44008,"journal":{"name":"Journal of Physics-Photonics","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Physics-Photonics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/1742-6596/2632/1/012034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"OPTICS","Score":null,"Total":0}
Abstract
Projection-based multimodal 3D semantic segmentation methods suffer from information loss during the point cloud projection process, and this issue is especially pronounced for small objects. Moreover, during fusion the sparse target features are not accurately aligned with the corresponding object features in the camera image, which leads to low segmentation accuracy for small objects. We therefore propose an attention-based multimodal feature alignment and fusion network module. The module aggregates features along the spatial directions and generates attention matrices; through this transformation it can capture long-range dependencies of features along one spatial direction. This helps the network precisely locate objects and establish relationships between similar features, enabling adaptive alignment of sparse target features with the corresponding object features in the camera image and a better fusion of the two modalities. We validate our method on the nuScenes-lidarseg dataset. Compared with the baseline network, our CAFNet improves segmentation accuracy for small objects with few points, such as bicycles (6% improvement), pedestrians (2.1% improvement), and traffic cones (0.9% improvement).
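The abstract describes attention built from features aggregated along each spatial direction, so that long-range dependencies in one direction can be captured and camera features can be reweighted before fusion with the projected LiDAR features. The sketch below is one plausible realization of such a directional-attention fusion block, in the spirit of coordinate attention; it is not the authors' published code, and the PyTorch framework, module names, and fusion layout are assumptions made for illustration.

import torch
import torch.nn as nn


class DirectionalAttentionFusion(nn.Module):
    """Hypothetical sketch: directional attention used to align camera
    features with projected LiDAR features before fusion."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(channels // reduction, 8)
        # Shared bottleneck over the concatenated directional descriptors.
        self.bottleneck = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        # Separate heads produce attention along height and width.
        self.attn_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(mid, channels, kernel_size=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, lidar_feat: torch.Tensor, cam_feat: torch.Tensor) -> torch.Tensor:
        # lidar_feat, cam_feat: (B, C, H, W) feature maps in the same projected view.
        b, c, h, w = cam_feat.shape
        # Aggregate along each spatial direction: (B, C, H, 1) and (B, C, 1, W).
        pooled_h = cam_feat.mean(dim=3, keepdim=True)   # pool over width
        pooled_w = cam_feat.mean(dim=2, keepdim=True)   # pool over height
        # Concatenate along the spatial axis so one bottleneck sees both directions.
        y = torch.cat([pooled_h, pooled_w.permute(0, 1, 3, 2)], dim=2)  # (B, C, H+W, 1)
        y = self.bottleneck(y)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # Direction-wise attention maps, broadcast back to (B, C, H, W).
        a_h = torch.sigmoid(self.attn_h(y_h))                       # (B, C, H, 1)
        a_w = torch.sigmoid(self.attn_w(y_w.permute(0, 1, 3, 2)))   # (B, C, 1, W)
        aligned_cam = cam_feat * a_h * a_w
        # Fuse the reweighted camera features with the LiDAR features.
        return self.fuse(torch.cat([lidar_feat, aligned_cam], dim=1))


if __name__ == "__main__":
    fusion = DirectionalAttentionFusion(channels=64)
    lidar = torch.randn(2, 64, 32, 480)   # e.g. range-view LiDAR features
    camera = torch.randn(2, 64, 32, 480)  # camera features in the same view
    print(fusion(lidar, camera).shape)    # torch.Size([2, 64, 32, 480])

The key design point this sketch illustrates is that pooling over one axis at a time keeps positional information along the other axis, so the resulting attention can localize thin or sparse objects (bicycles, pedestrians, traffic cones) more precisely than global pooling would.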