PFENet: Towards precise feature extraction from sparse point cloud for 3D object detection

IF 6.0; Zone 1 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Neural Networks; Pub Date: 2025-01-16; DOI: 10.1016/j.neunet.2025.107144

Yaochen Li, Qiao Li, Cong Gao, Shengjing Gao, Hao Wu, Rui Liu

Neural Networks, volume 185, Article 107144. https://www.sciencedirect.com/science/article/pii/S0893608025000231

Citations: 0

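Abstract

Accurate 3D object detection from point clouds is crucial for autonomous vehicles. Point clouds in 3D scenes are sparse, and smaller targets such as pedestrians and bicycles contain especially few points, which makes them particularly hard to detect. To address this problem, we propose PFENet, a single-stage voxel-based 3D object detection method. First, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Second, a 3D sparse convolution layer dynamically adjusts feature processing based on the importance of each output location, improving small object recognition. Third, the attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.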
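The abstract describes a voxel-based detector but does not give the voxelization details. As a rough illustration only, the sketch below shows how a sparse point cloud might be grouped into a uniform voxel grid and summarized by per-voxel centroids, which makes it easy to see why small objects end up with very few occupied voxels. The function names `voxelize` and `voxel_centroids`, the 0.5-unit voxel size, and the centroid feature are assumptions made for this example; they are not taken from PFENet.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]
VoxelIndex = Tuple[int, int, int]


def voxelize(
    points: Iterable[Point],
    voxel_size: Point,
    origin: Point = (0.0, 0.0, 0.0),
) -> Dict[VoxelIndex, List[Point]]:
    """Group points by the integer index of the voxel that contains them.

    Hypothetical helper for illustration; not the paper's encoder.
    """
    if any(s <= 0 for s in voxel_size):
        raise ValueError("voxel_size components must be positive")
    voxels: Dict[VoxelIndex, List[Point]] = defaultdict(list)
    for p in points:
        # Floor division maps each coordinate to its cell along that axis,
        # including negative coordinates (e.g. -0.1 with size 0.5 -> cell -1).
        idx = tuple(int((c - o) // s) for c, o, s in zip(p, origin, voxel_size))
        voxels[idx].append(p)
    return dict(voxels)


def voxel_centroids(
    voxels: Dict[VoxelIndex, List[Point]],
) -> Dict[VoxelIndex, Point]:
    """Summarize each occupied voxel by the mean of its points."""
    return {
        idx: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for idx, pts in voxels.items()
    }


if __name__ == "__main__":
    # Four made-up points: two share a voxel, the other two are isolated.
    cloud = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.1), (1.2, 0.1, 0.0), (-0.1, 0.0, 0.0)]
    grid = voxelize(cloud, voxel_size=(0.5, 0.5, 0.5))
    for index, centroid in sorted(voxel_centroids(grid).items()):
        print(index, len(grid[index]), centroid)
```

Only occupied voxels are stored, which is the same observation that motivates sparse convolution: most of the grid is empty, so a dictionary keyed by voxel index is far smaller than a dense 3D array.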




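Source journal

Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days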

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.