Instance-aware sampling and voxel-transformer encoding for single-stage 3D object detection
Baotong Wang, Chenxing Xia, Xiuju Gao, Yuan Yang, Kuan-Ching Li, Xianjin Fang, Yan Zhang, Sijia Ge
Digital Signal Processing, Volume 162, Article 105171. Published 2025-03-24. DOI: 10.1016/j.dsp.2025.105171
https://www.sciencedirect.com/science/article/pii/S1051200425001939
Journal impact factor: 3.0. JCR: Q2 (Engineering, Electrical & Electronic). Open access: no.
Citations: 0
Abstract
In point cloud 3D object detection, single-stage detectors offer fast inference but are less accurate than two-stage detectors. We identify two main problems: first, traditional methods process the whole point cloud, which makes them vulnerable to background noise; second, existing methods have insufficient single-channel feature encoding capability. To address these problems, this paper proposes Instance-Aware Sampling and Voxel-Transformer Encoding for Single-Stage 3D Object Detection (IAVT-SSD). Specifically, we design an Instance-Aware Weighted Sampling Strategy that filters out ground reflection points and strengthens the model's focus on foreground points. We also introduce a Voxel-Transformer Dual-Channel Feature Encoding Module that captures more comprehensive features through two independent channels, efficiently fusing non-empty voxels and remote context information. In addition, a Collaborative Enhancement Branch is designed to predict the complete structure of the object. Experiments show that IAVT-SSD achieves a good balance of accuracy and speed, with an inference speed of 42 FPS (frames per second) and a mAP (mean average precision) of 81.70% on the KITTI dataset and 66.96% on the ONCE dataset, validating its effectiveness and superiority.
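The abstract does not specify how the sampling strategy is implemented. As a rough illustration of the general idea of keeping foreground points and discarding background, the following minimal sketch keeps the `k` points with the highest per-point foreground scores. It is not the paper's method: the function name `sample_foreground`, the score values, and the plain top-k rule are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def sample_foreground(points: Sequence[Point],
                      scores: Sequence[float],
                      k: int) -> List[Point]:
    """Keep the k points with the highest foreground scores.

    Hypothetical helper: `scores` stands in for whatever per-point
    foreground confidence a detector might predict.
    """
    if len(points) != len(scores):
        raise ValueError("points and scores must have the same length")
    k = max(0, min(k, len(points)))
    # Rank indices by descending score; sorted() is stable, so ties
    # keep their original relative order.
    ranked = sorted(range(len(points)), key=lambda i: -scores[i])
    # Restore the original point order among the kept indices.
    keep = sorted(ranked[:k])
    return [points[i] for i in keep]


# Toy input (assumed values): two low-scoring ground-like points
# and two high-scoring object-like points.
points = [(0.0, 0.0, -1.7), (5.2, 1.0, -0.4), (5.3, 1.1, 0.2), (20.0, -3.0, -1.6)]
scores = [0.05, 0.90, 0.85, 0.10]
kept = sample_foreground(points, scores, k=2)
```

With these toy scores, the two low-scoring points are dropped and only the two high-scoring points are passed on, which is the effect the abstract attributes to focusing on foreground points.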
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy