SSF: Sparse point cloud object detection based on self-adaptive voxel encoding and focal-sparse convolution
Authors: Yu Zhang, Zilong Wang, Yongjian Zhu, Jianxin Li
Journal: Journal of Intelligent & Fuzzy Systems
Publication date: 2024-03-12
DOI: 10.3233/jifs-238176 (https://doi.org/10.3233/jifs-238176)
Citations: 0
Abstract
Point cloud object detection plays an increasingly important role in autonomous driving. To address the insensitivity of point cloud object detection to sparse objects, we improve the voxel encoding and the 3D backbone network of PVRCNN++. During voxel feature encoding, we introduce adaptive pooling operations to expand the point cloud information within each voxel, and then apply multi-layer perceptrons to extract richer point cloud features. In the 3D backbone network, we employ adaptive sparse convolution operations that make the backbone's channel count more flexible, allowing it to accommodate a wider range of input data types. Furthermore, we integrate Focal Loss to tackle class imbalance in detection tasks. Experimental results on the public KITTI dataset demonstrate significant improvements over PVRCNN++, particularly in pedestrian and bicycle detection: detection accuracy increases by 1% for pedestrians and by 2.1% for bicycles. Our detection performance also surpasses that of the other detection algorithms we compared against.
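
The abstract does not spell out how Focal Loss is formulated or configured in SSF, so the following is only a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), to illustrate how it counters class imbalance. The function names and the values `alpha=0.25` and `gamma=2.0` are assumptions chosen for illustration (they are commonly used defaults), not settings reported for this work.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one prediction.

    p: predicted probability of the positive class, y: label (0 or 1).
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Standard binary focal loss for one prediction.

    alpha and gamma are illustrative assumptions, not values from the paper.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    (easy) examples, so hard examples dominate the total loss.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) versus a hard positive (mostly missed).
easy = focal_loss(0.95, 1)
hard = focal_loss(0.10, 1)
print(f"easy example loss: {easy:.6f}")
print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which makes it easy to check an implementation against a known baseline. Larger `gamma` values suppress easy examples more strongly, which is the property that helps when rare, sparsely sampled classes would otherwise be swamped by abundant easy examples.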