{"title":"PFENet: Towards precise feature extraction from sparse point cloud for 3D object detection","authors":"Yaochen Li, Qiao Li, Cong Gao, Shengjing Gao, Hao Wu, Rui Liu","doi":"10.1016/j.neunet.2025.107144","DOIUrl":null,"url":null,"abstract":"<div><div>Accurate 3D point cloud object detection is crucially important for autonomous driving vehicles. The sparsity of point clouds in 3D scenes, especially for smaller targets like pedestrians and bicycles that contain fewer points, makes detection particularly challenging. To solve this problem, we propose a single-stage voxel-based 3D object detection method, namely PFENet. Firstly, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Moreover, a 3D sparse convolution layer dynamically adjusts feature processing based on output location importance, improving small object recognition. Additionally, the attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features, and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"Article 107144"},"PeriodicalIF":6.0000,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025000231","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate 3D point cloud object detection is crucially important for autonomous driving vehicles. The sparsity of point clouds in 3D scenes, especially for smaller targets like pedestrians and bicycles that contain fewer points, makes detection particularly challenging. To solve this problem, we propose a single-stage voxel-based 3D object detection method, namely PFENet. Firstly, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Moreover, a 3D sparse convolution layer dynamically adjusts feature processing based on output location importance, improving small object recognition. Additionally, the attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features, and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.
期刊介绍:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.