Point cloud semantic segmentation based on local feature fusion and multilayer attention network

IET Computer Vision | IF 1.5 | CAS Tier 4 (Computer Science) | JCR Q4 (Computer Science, Artificial Intelligence) | Pub Date: 2023-11-27 | DOI: 10.1049/cvi2.12255
Junjie Wen, Jie Ma, Yuehua Zhao, Tong Nie, Mengxuan Sun, Ziming Fan
{"title":"Point cloud semantic segmentation based on local feature fusion and multilayer attention network","authors":"Junjie Wen,&nbsp;Jie Ma,&nbsp;Yuehua Zhao,&nbsp;Tong Nie,&nbsp;Mengxuan Sun,&nbsp;Ziming Fan","doi":"10.1049/cvi2.12255","DOIUrl":null,"url":null,"abstract":"<p>Semantic segmentation from a three-dimensional point cloud is vital in autonomous driving, computer vision, and augmented reality. However, current semantic segmentation does not effectively use the point cloud's local geometric features and contextual information, essential for improving segmentation accuracy. A semantic segmentation network that uses local feature fusion and a multilayer attention mechanism is proposed to address these challenges. Specifically, the authors designed a local feature fusion module to encode the geometric and feature information separately, which fully leverages the point cloud's feature perception and geometric structure representation. Furthermore, the authors designed a multilayer attention pooling module consisting of local attention pooling and cascade attention pooling to extract contextual information. Local attention pooling is used to learn local neighbourhood information, and cascade attention pooling captures contextual information from deeper local neighbourhoods. Finally, an enhanced feature representation of important information is obtained by aggregating the features from the two deep attention pooling methods. Extensive experiments on large-scale point-cloud datasets Stanford 3D large-scale indoor spaces and SemanticKITTI indicate that authors network shows excellent advantages over existing representative methods regarding local geometric feature description and global contextual relationships.</p>","PeriodicalId":56304,"journal":{"name":"IET Computer Vision","volume":"18 3","pages":"381-392"},"PeriodicalIF":1.5000,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12255","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cvi2.12255","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Semantic segmentation of three-dimensional point clouds is vital in autonomous driving, computer vision, and augmented reality. However, current semantic segmentation methods do not effectively use the point cloud's local geometric features and contextual information, both of which are essential for improving segmentation accuracy. A semantic segmentation network that uses local feature fusion and a multilayer attention mechanism is proposed to address these challenges. Specifically, the authors designed a local feature fusion module that encodes the geometric and feature information separately, fully leveraging the point cloud's feature perception and geometric structure representation. Furthermore, the authors designed a multilayer attention pooling module, consisting of local attention pooling and cascade attention pooling, to extract contextual information. Local attention pooling learns local neighbourhood information, while cascade attention pooling captures contextual information from deeper local neighbourhoods. Finally, an enhanced feature representation of important information is obtained by aggregating the features from the two attention pooling methods. Extensive experiments on the large-scale point-cloud datasets Stanford 3D Large-Scale Indoor Spaces (S3DIS) and SemanticKITTI indicate that the authors' network offers clear advantages over existing representative methods in local geometric feature description and global contextual relationships.
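
Only the abstract is available here, so the sketch below is not the authors' implementation. It is a minimal PyTorch illustration of the two ideas the abstract names: a local feature fusion block that encodes neighbourhood geometry and neighbour features with separate MLPs before fusing them, and an attention pooling step that replaces max pooling over a neighbourhood with learned attention weights. All tensor shapes, layer widths, the 10-dimensional geometric encoding, and the use of precomputed k-NN indices are assumptions made for illustration; the cascade attention pooling branch would apply the same pooling to features gathered from deeper neighbourhoods and aggregate the two outputs.

```python
# Minimal sketch (not the authors' code) of local feature fusion + attention
# pooling for point cloud segmentation. Shapes and layer widths are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalFeatureFusion(nn.Module):
    """Encode per-neighbour geometry and per-neighbour features with separate
    MLPs, then fuse them by concatenation (assumed fusion strategy)."""

    def __init__(self, feat_dim: int, out_dim: int):
        super().__init__()
        # Geometric encoding: centre xyz (3) + neighbour xyz (3) + offset (3) + distance (1) = 10 dims
        self.geo_mlp = nn.Sequential(nn.Linear(10, out_dim // 2), nn.ReLU())
        # Feature encoding: the neighbour's learned features (out_dim assumed even)
        self.feat_mlp = nn.Sequential(nn.Linear(feat_dim, out_dim // 2), nn.ReLU())

    def forward(self, xyz, feats, neighbor_idx):
        # xyz: (B, N, 3), feats: (B, N, C), neighbor_idx: (B, N, K) k-NN indices
        B, N, K = neighbor_idx.shape
        idx = neighbor_idx.reshape(B, N * K)
        nbr_xyz = torch.gather(xyz, 1, idx.unsqueeze(-1).expand(-1, -1, 3)).view(B, N, K, 3)
        nbr_feat = torch.gather(feats, 1, idx.unsqueeze(-1).expand(-1, -1, feats.shape[-1])).view(B, N, K, -1)

        center = xyz.unsqueeze(2).expand(-1, -1, K, -1)           # (B, N, K, 3)
        offset = nbr_xyz - center                                  # relative position
        dist = offset.norm(dim=-1, keepdim=True)                   # Euclidean distance
        geo = torch.cat([center, nbr_xyz, offset, dist], dim=-1)   # (B, N, K, 10)

        # Fuse the two encodings along the channel dimension -> (B, N, K, out_dim)
        return torch.cat([self.geo_mlp(geo), self.feat_mlp(nbr_feat)], dim=-1)


class AttentionPooling(nn.Module):
    """Pool a (B, N, K, D) neighbourhood tensor to (B, N, D) using learned
    per-neighbour attention weights instead of max/average pooling."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, dim, bias=False)

    def forward(self, nbr_feats):
        attn = F.softmax(self.score(nbr_feats), dim=2)  # weights over the K neighbours
        return (attn * nbr_feats).sum(dim=2)


if __name__ == "__main__":
    B, N, K, C, D = 2, 1024, 16, 8, 64
    xyz = torch.rand(B, N, 3)
    feats = torch.rand(B, N, C)
    idx = torch.randint(0, N, (B, N, K))                 # placeholder for real k-NN indices
    fused = LocalFeatureFusion(C, D)(xyz, feats, idx)    # (B, N, K, D)
    pooled = AttentionPooling(D)(fused)                  # (B, N, D)
    print(pooled.shape)                                  # torch.Size([2, 1024, 64])
```

In a full network, blocks like these would sit inside an encoder-decoder segmentation backbone, with the pooled features from successive layers aggregated along the lines the abstract describes.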

Source journal
IET Computer Vision (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 3.30
Self-citation rate: 11.80%
Annual articles: 76
Average review time: 3.4 months
Journal description: IET Computer Vision seeks original research papers in a wide range of areas of computer vision. The vision of the journal is to publish the highest quality research work that is relevant and topical to the field, but not forgetting those works that aim to introduce new horizons and set the agenda for future avenues of research in computer vision. IET Computer Vision welcomes submissions on the following topics:
- Biologically and perceptually motivated approaches to low-level vision (feature detection, etc.)
- Perceptual grouping and organisation
- Representation, analysis and matching of 2D and 3D shape
- Shape-from-X
- Object recognition
- Image understanding
- Learning with visual inputs
- Motion analysis and object tracking
- Multiview scene analysis
- Cognitive approaches in low-, mid- and high-level vision
- Control in visual systems
- Colour, reflectance and light
- Statistical and probabilistic models
- Face and gesture
- Surveillance
- Biometrics and security
- Robotics
- Vehicle guidance
- Automatic model acquisition
- Medical image analysis and understanding
- Aerial scene analysis and remote sensing
- Deep learning models in computer vision
Both methodological and applications-orientated papers are welcome. Manuscripts submitted are expected to include a detailed and analytical review of the literature, a state-of-the-art exposition of the original proposed research and its methodology, its thorough experimental evaluation, and, last but not least, comparative evaluation against relevant state-of-the-art methods. Submissions not abiding by these minimum requirements may be returned to authors without being sent to review.
Special issues, current calls for papers:
- Computer Vision for Smart Cameras and Camera Networks: https://digital-library.theiet.org/files/IET_CVI_SC.pdf
- Computer Vision for the Creative Industries: https://digital-library.theiet.org/files/IET_CVI_CVCI.pdf
Latest articles in this journal
- SRL-ProtoNet: Self-supervised representation learning for few-shot remote sensing scene classification
- Balanced parametric body prior for implicit clothed human reconstruction from a monocular RGB
- Social-ATPGNN: Prediction of multi-modal pedestrian trajectory of non-homogeneous social interaction
- HIST: Hierarchical and sequential transformer for image captioning
- Multi-modal video search by examples—A video quality impact analysis