Dense Dual-Branch Cross Attention Network for Semantic Segmentation of Large-Scale Point Clouds

IF 8.6 · JCR Q1 (Engineering, Electrical & Electronic) · CAS Region 1 (Earth Science) · IEEE Transactions on Geoscience and Remote Sensing · Published: 2023-12-12 · DOI: 10.1109/TGRS.2023.3341894
Ziwei Luo;Ziyin Zeng;Wei Tang;Jie Wan;Zhong Xie;Yongyang Xu
{"title":"Dense Dual-Branch Cross Attention Network for Semantic Segmentation of Large-Scale Point Clouds","authors":"Ziwei Luo;Ziyin Zeng;Wei Tang;Jie Wan;Zhong Xie;Yongyang Xu","doi":"10.1109/TGRS.2023.3341894","DOIUrl":null,"url":null,"abstract":"Semantic segmentation of large-scale point clouds provides foundational knowledge for various geodetic and cartographic applications, including autonomous driving, smart cities, and indoor navigation. However, point cloud data’s unstructured and inherently disordered characteristics pose challenges in extracting accurate 3-D semantic information. In this study, we introduce a novel semantic segmentation network for large-scale point cloud scenes, referred to as dense dual-branch cross attention network (D2CAN). We propose a local multidimensional feature aggregation (LMFA) module to increase multidimensional feature representation types and preserve rich local details. Based on the augmented local features, an expanded dual-branch cross attention (EDCA) module establishes internal deep connections between multidimensional attributes and semantic features. This assists the network in reducing boundary ambiguities and expanding the receptive field, enabling the parallel capture of long-range contexts specifically adapted for large-scale scene point cloud segmentation. These two modules work collaboratively to constitute a local context deep perception (LCDP) block. To reduce information loss during feature sampling and propagation, we propose a global feature pyramid dense fusion (GFDF) block. This block adaptively integrates features across different scales and effectively captures global context with long-range dependencies. In conclusion, D2CAN combines LCDP and GFDF to aggregate both local and global contexts, resulting in robust feature discrimination for semantic segmentation of large-scale scenes. Our method’s effectiveness and superior generation ability have been validated across three challenging benchmarks and achieve state-of-the-art performance on Toronto-3D, SensatUrban, and Stanford large-scale 3-D indoor spaces (S3DIS) datasets, with mean intersection over union (IoU) values of 83.5%, 61.1%, and 72.3%, respectively.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"62 ","pages":"1-16"},"PeriodicalIF":8.6000,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10354344/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Semantic segmentation of large-scale point clouds provides foundational knowledge for various geodetic and cartographic applications, including autonomous driving, smart cities, and indoor navigation. However, the unstructured and inherently disordered nature of point cloud data poses challenges for extracting accurate 3-D semantic information. In this study, we introduce a novel semantic segmentation network for large-scale point cloud scenes, referred to as the dense dual-branch cross attention network (D2CAN). We propose a local multidimensional feature aggregation (LMFA) module to increase the variety of multidimensional feature representations and preserve rich local details. Building on the augmented local features, an expanded dual-branch cross attention (EDCA) module establishes deep internal connections between multidimensional attributes and semantic features. This helps the network reduce boundary ambiguity and expand the receptive field, enabling the parallel capture of long-range contexts specifically adapted to large-scale scene point cloud segmentation. These two modules work collaboratively to constitute a local context deep perception (LCDP) block. To reduce information loss during feature sampling and propagation, we propose a global feature pyramid dense fusion (GFDF) block, which adaptively integrates features across different scales and effectively captures global context with long-range dependencies. In summary, D2CAN combines LCDP and GFDF to aggregate both local and global contexts, yielding robust feature discrimination for semantic segmentation of large-scale scenes. The effectiveness and superior generalization ability of our method have been validated on three challenging benchmarks, achieving state-of-the-art performance on the Toronto-3D, SensatUrban, and Stanford large-scale 3-D indoor spaces (S3DIS) datasets, with mean intersection over union (mIoU) values of 83.5%, 61.1%, and 72.3%, respectively.
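The paper itself does not include code, so the following is only a minimal PyTorch sketch of the general dual-branch cross-attention idea described for the EDCA module: two feature branches (point attributes and semantic features) attend to each other and are then fused. All names and shapes here (`DualBranchCrossAttention`, `d_model`, `n_heads`) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of dual-branch cross attention between an "attribute" branch
# and a "semantic" branch, loosely inspired by the EDCA description.
# NOT the authors' code: module names, dimensions, and fusion are assumptions.
import torch
import torch.nn as nn

class DualBranchCrossAttention(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Each branch queries the other branch's features.
        self.attr_to_sem = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.sem_to_attr = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, attr_feats: torch.Tensor, sem_feats: torch.Tensor) -> torch.Tensor:
        # attr_feats, sem_feats: (batch, num_points, d_model)
        # Branch 1: attribute features attend to semantic features.
        a, _ = self.attr_to_sem(attr_feats, sem_feats, sem_feats)
        # Branch 2: semantic features attend to attribute features.
        s, _ = self.sem_to_attr(sem_feats, attr_feats, attr_feats)
        # Residual connections, then concatenate both branches and project back.
        return self.fuse(torch.cat([a + attr_feats, s + sem_feats], dim=-1))

# Usage: a batch of 2 point clouds, 1024 points, 64-dim features per branch.
x_attr = torch.randn(2, 1024, 64)
x_sem = torch.randn(2, 1024, 64)
out = DualBranchCrossAttention()(x_attr, x_sem)
print(out.shape)  # torch.Size([2, 1024, 64])
```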
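The benchmark numbers above are mean intersection-over-union (mIoU) scores. As a reminder of how that metric is computed from a per-class confusion matrix, here is a short self-contained sketch; the class count and matrix values are invented for illustration only.

```python
# Illustration of mean IoU from a confusion matrix C, where C[i, j] counts
# points of ground-truth class i predicted as class j. Toy data only.
import numpy as np

def mean_iou(conf: np.ndarray) -> float:
    tp = np.diag(conf).astype(float)        # true positives per class
    fp = conf.sum(axis=0) - tp              # false positives per class
    fn = conf.sum(axis=1) - tp              # false negatives per class
    denom = tp + fp + fn                    # union size per class
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return float(np.nanmean(iou))           # average over classes that appear

# Toy 3-class example (values invented for illustration).
conf = np.array([[50, 2, 1],
                 [3, 40, 5],
                 [0, 4, 30]])
print(f"mIoU = {mean_iou(conf):.3f}")  # per-class IoU averaged: ~0.795
```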
Source journal
IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
About the journal: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space, and on the processing, interpretation, and dissemination of this information.
Latest articles from this journal
The InSAR absolute phase amid singularities
Integrating Neighboring Structure Knowledge into A CNN-Transformer Hybrid Model for Global Open-access DEM Correction Using ICESat-2 Altimetry
Introducing WSOD-SAM Proposals and Heuristic Pseudo-Fully Supervised Training Strategy for Weakly Supervised Object Detection in Remote Sensing Images
Efficient One-Way Wave-Equation Depth Migration Using Fast Fourier Transform and Complex Padé Approximation via Helmholtz Operator
TSTrans: Temporal-Sequence-Driven Transformer for Single Object Tracking in Satellite Videos