The fusion of LiDAR and 4D radar has emerged as a promising solution for robust and accurate 3D object detection in complex and adverse conditions. Existing methods typically rely on pillar-based representations, which are computationally efficient but fail to provide the fine-grained structural detail needed for precise object localization and recognition. In contrast, voxel-based representations offer richer spatial information but face challenges such as background noise and the disparity in data quality between the two modalities. To address these limitations, we propose SVEFusion, a voxel-based 3D object detection framework that integrates LiDAR and 4D radar data using a salient voxel enhancement mechanism. Our method introduces an adaptive feature alignment module and a novel spatial neighborhood attention module for efficient early-stage multi-modal voxel feature integration. Furthermore, the salient voxel enhancement mechanism assigns higher weights to foreground voxels using a multi-scale weight prediction strategy, progressively refining weight accuracy with a supervision loss. Experimental results demonstrate that SVEFusion significantly outperforms state-of-the-art methods, establishing a new benchmark in multi-modal 3D object detection. The source code and trained network weights are available at https://github.com/icdm-adteam/SVEFusion for reproducibility.
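To make the idea of weighting foreground voxels concrete, the following is a minimal single-scale sketch, not the SVEFusion implementation. The abstract does not specify the weighting form, so everything here is an assumption made for illustration: the function names, the sigmoid mapping from a per-voxel score to a weight, the residual scaling `(1 + w)`, and the binary cross-entropy used as the supervision loss. The multi-scale prediction described above is not shown.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def enhance_salient_voxels(features, logits):
    """Scale each voxel's feature vector by (1 + w), where w = sigmoid(logit).

    features: list of per-voxel feature vectors (lists of floats)
    logits:   list of per-voxel foreground scores; in a real network these
              would come from a learned predictor (hypothetical here)
    Returns the enhanced features and the predicted weights.
    """
    if len(features) != len(logits):
        raise ValueError("one logit per voxel is required")
    weights = [sigmoid(z) for z in logits]
    # Assumed residual form: background voxels (w near 0) keep their
    # features, foreground voxels (w near 1) are amplified up to 2x.
    enhanced = [[(1.0 + w) * f for f in feat]
                for feat, w in zip(features, weights)]
    return enhanced, weights


def weight_supervision_loss(weights, is_foreground, eps=1e-7):
    """Mean binary cross-entropy between predicted weights and 0/1 labels."""
    total = 0.0
    for w, y in zip(weights, is_foreground):
        w = min(max(w, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(w) + (1 - y) * math.log(1.0 - w)
    return total / len(weights)
```

In this sketch, the foreground labels passed to `weight_supervision_loss` would typically mark voxels that fall inside ground-truth boxes; that labeling rule is likewise an assumption rather than something stated in the abstract.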
