RGB and LiDAR Fusion-based 3D Semantic Segmentation for Autonomous Driving

Journal of Physics: Conference Series · Pub date: 2023-11-01 · DOI: 10.1088/1742-6596/2632/1/012034
Jianguo Liu, Zhiling Jia, Gongbo Li, Fuwu Yan, Youhua Wu, Yunfei Sun
{"title":"基于RGB和LiDAR融合的自动驾驶3D语义分割","authors":"Jianguo Liu, Zhiling Jia, Gongbo Li, Fuwu Yan, Youhua Wu, Yunfei Sun","doi":"10.1088/1742-6596/2632/1/012034","DOIUrl":null,"url":null,"abstract":"Abstract Projection-based multimodal 3D semantic segmentation methods suffer from information loss during the point cloud projection process. This issue becomes more prominent for small objects. Moreover, the alignment of sparse target features with the corresponding object features in the camera image during the fusion process is inaccurate, leading to low segmentation accuracy for small objects. Therefore, we propose an attention-based multimodal feature alignment and fusion network module. This module aggregates features in spatial directions and generates attention matrices. Through this transformation, the module could capture remote dependencies of features in one spatial direction. This helps our network precisely locate objects and establish relationships between similar features. It enables the adaptive alignment of sparse target features with the corresponding object features in the camera image, resulting in a better fusion of the two modalities. We validate our method on the nuScenes-lidar seg dataset. Our CAFNet achieves an improvement in segmentation accuracy for small objects with fewer points compared to the baseline network, such as bicycles (6% improvement), pedestrians (2.1% improvement), and traffic cones (0.9% improvement).","PeriodicalId":44008,"journal":{"name":"Journal of Physics-Photonics","volume":"92 3","pages":"0"},"PeriodicalIF":4.6000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"RGB and LiDAR Fusion-based 3D Semantic Segmentation for Autonomous Driving\",\"authors\":\"Jianguo Liu, Zhiling Jia, Gongbo Li, Fuwu Yan, Youhua Wu, Yunfei Sun\",\"doi\":\"10.1088/1742-6596/2632/1/012034\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Projection-based multimodal 3D semantic segmentation methods suffer from information loss during the point cloud projection process. This issue becomes more prominent for small objects. Moreover, the alignment of sparse target features with the corresponding object features in the camera image during the fusion process is inaccurate, leading to low segmentation accuracy for small objects. Therefore, we propose an attention-based multimodal feature alignment and fusion network module. This module aggregates features in spatial directions and generates attention matrices. Through this transformation, the module could capture remote dependencies of features in one spatial direction. This helps our network precisely locate objects and establish relationships between similar features. It enables the adaptive alignment of sparse target features with the corresponding object features in the camera image, resulting in a better fusion of the two modalities. We validate our method on the nuScenes-lidar seg dataset. 
Our CAFNet achieves an improvement in segmentation accuracy for small objects with fewer points compared to the baseline network, such as bicycles (6% improvement), pedestrians (2.1% improvement), and traffic cones (0.9% improvement).\",\"PeriodicalId\":44008,\"journal\":{\"name\":\"Journal of Physics-Photonics\",\"volume\":\"92 3\",\"pages\":\"0\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2023-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Physics-Photonics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1088/1742-6596/2632/1/012034\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"OPTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Physics-Photonics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/1742-6596/2632/1/012034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"OPTICS","Score":null,"Total":0}
Citations: 0

Abstract

Projection-based multimodal 3D semantic segmentation methods suffer from information loss during point cloud projection, an issue that is most pronounced for small objects. Moreover, during fusion the sparse target features are inaccurately aligned with the corresponding object features in the camera image, leading to low segmentation accuracy for small objects. We therefore propose an attention-based multimodal feature alignment and fusion network module. The module aggregates features along each spatial direction and generates attention matrices; through this transformation it can capture long-range dependencies of features along one spatial direction, which helps the network locate objects precisely and establish relationships between similar features. This enables adaptive alignment of sparse target features with the corresponding object features in the camera image, yielding a better fusion of the two modalities. We validate our method on the nuScenes-lidarseg dataset. Compared with the baseline network, our CAFNet improves segmentation accuracy for small objects with few points, such as bicycles (+6%), pedestrians (+2.1%), and traffic cones (+0.9%).
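The abstract describes a module that pools features along each spatial direction and turns the result into per-direction attention matrices, so that long-range dependencies along one direction can inform alignment between the two modalities. Below is a minimal PyTorch sketch of that directional-attention fusion idea. The class name `DirectionalAttentionFusion`, the additive combination of the two modal feature maps, and all layer sizes are assumptions for illustration; the paper does not specify CAFNet's internals here, and this is a coordinate-attention-style reading of the description, not the authors' implementation.

```python
# A minimal sketch of directional-attention fusion, assuming additive
# combination of camera and projected-LiDAR features (not confirmed by
# the source abstract).
import torch
import torch.nn as nn


class DirectionalAttentionFusion(nn.Module):
    """Fuses projected LiDAR features with camera features.

    The fused map is average-pooled along each spatial direction (H and W),
    transformed jointly, and split into two per-direction attention matrices
    that reweight the features, letting the module capture long-range
    dependencies along one spatial direction.
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        hidden = max(channels // reduction, 8)
        # Shared transform applied to the concatenated directional descriptors.
        self.transform = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
        )
        # Separate heads produce one attention matrix per spatial direction.
        self.attn_h = nn.Conv2d(hidden, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, lidar_feat: torch.Tensor, cam_feat: torch.Tensor) -> torch.Tensor:
        # Additive fusion of the two modalities (an assumption).
        x = lidar_feat + cam_feat
        b, c, h, w = x.shape
        # Aggregate along each spatial direction.
        pooled_h = x.mean(dim=3, keepdim=True)                      # (B, C, H, 1)
        pooled_w = x.mean(dim=2, keepdim=True)                      # (B, C, 1, W)
        # Stack the two descriptors along the spatial axis and transform jointly.
        y = torch.cat([pooled_h, pooled_w.transpose(2, 3)], dim=2)  # (B, C, H+W, 1)
        y = self.transform(y)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # Per-direction attention matrices in [0, 1].
        a_h = torch.sigmoid(self.attn_h(y_h))                       # (B, C, H, 1)
        a_w = torch.sigmoid(self.attn_w(y_w.transpose(2, 3)))       # (B, C, 1, W)
        # Broadcasting applies both directional attentions to the fused map.
        return x * a_h * a_w


if __name__ == "__main__":
    fusion = DirectionalAttentionFusion(channels=64)
    lidar = torch.randn(2, 64, 32, 64)  # projected (range-view) LiDAR features
    cam = torch.randn(2, 64, 32, 64)    # camera features resampled to the same grid
    out = fusion(lidar, cam)
    print(out.shape)                    # torch.Size([2, 64, 32, 64])
```

Because each attention matrix is computed from a pooling over an entire row or column, a sparse LiDAR response can be reweighted by camera evidence anywhere along that direction, which is one plausible way the described module could realign sparse target features with the matching image features.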