{"title":"Feature-Enhanced PointPillars for 3-D Millimeter-Wave Object Detection","authors":"Yanyi Chang;Shuai Wan;Yichen Gao;Zhaohui Bu;Ping Li;Li Ding","doi":"10.1109/TAES.2024.3491058","DOIUrl":null,"url":null,"abstract":"PointPillars, a voxel-based 3-D object detection model, would encounter the resolution loss after voxelization, leading to the capability reduction in capturing intricate object details. This limitation is particularly evident in processing 3-D millimeter-wave (MMW) images. Therefore, this article proposes a feature-enhanced PointPillars for 3-D MMW object detection. This enhancement first integrates a multiscale feature extraction (MFE) module into the pillar feature network. This module is adept at handling the substantial volume of point cloud data characteristic of MMW images and significantly improves feature encoding efficiency. Considering the local density variations and sparsity patterns observed in MMW images, the modified PointPillars further explores a pyramidally attended feature extraction (PAFE) module to improve the inference efficiency. By employing multibranch convolutional kernels with varying dilation rates in the backbone network, the proposed approach expands the receptive fields and augments the contextual interconnectedness of the detected objects. This effectively curtails the semantic and spatial detail loss commonly associated with downsampling. Empirical evaluation of our proposed method against the standard PointPillars benchmark highlights its superiority. In particular, our method presents performance enhancement of 0.4$\\%$ and 7.16$\\%$ in $\\mathrm{{AP\\_{R}}}{40_{0.5}}$ (AP) for the bird's eye view and 3-D bounding boxes, respectively. Furthermore, it achieves a great 50.4$\\%$ reduction in the number of parameters and delivers an impressive inference speed of 0.0132 s per frame. 
These advancements confirm that the augmented network achieves a balance between computational efficiency and 3-D object detection performance for 3-D MMW images, all the while ensuring a practical inference cost.","PeriodicalId":13157,"journal":{"name":"IEEE Transactions on Aerospace and Electronic Systems","volume":"61 2","pages":"3828-3839"},"PeriodicalIF":5.7000,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Aerospace and Electronic Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10742475/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, AEROSPACE","Score":null,"Total":0}
Citations: 0
Abstract
PointPillars, a voxel-based 3-D object detection model, suffers resolution loss after voxelization, which reduces its capability to capture intricate object details. This limitation is particularly evident when processing 3-D millimeter-wave (MMW) images. Therefore, this article proposes a feature-enhanced PointPillars for 3-D MMW object detection. The enhancement first integrates a multiscale feature extraction (MFE) module into the pillar feature network. This module handles the substantial volume of point cloud data characteristic of MMW images and significantly improves feature encoding efficiency. Considering the local density variations and sparsity patterns observed in MMW images, the modified PointPillars further introduces a pyramidally attended feature extraction (PAFE) module to improve inference efficiency. By employing multibranch convolutional kernels with varying dilation rates in the backbone network, the proposed approach expands the receptive fields and strengthens the contextual interconnectedness of the detected objects, effectively curtailing the semantic and spatial detail loss commonly associated with downsampling. Empirical evaluation against the standard PointPillars baseline highlights the superiority of the proposed method. In particular, it improves $\mathrm{AP_{R40}}$ at an IoU threshold of 0.5 by 0.4% and 7.16% for the bird's eye view and 3-D bounding boxes, respectively. Furthermore, it achieves a 50.4% reduction in the number of parameters and delivers an inference speed of 0.0132 s per frame. These advancements confirm that the augmented network balances computational efficiency and 3-D object detection performance for 3-D MMW images, all while ensuring a practical inference cost.
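The receptive-field claim behind the multibranch dilated kernels can be illustrated with a minimal sketch. A dilated convolution with kernel size k and dilation rate d spans d·(k−1)+1 input positions along each axis, so parallel branches with increasing dilation rates see progressively wider context at the same parameter cost. The dilation rates (1, 2, 3) below are illustrative assumptions; the abstract does not list the rates used in the paper.

```python
# Sketch: effective span of a dilated convolution kernel.
# Assumption: 3x3 kernels with branch dilation rates 1, 2, 3
# (hypothetical values chosen for illustration only).

def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Effective 1-D span of a dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1

# Each branch keeps the same 9 weights but covers a wider window.
for d in (1, 2, 3):
    span = effective_kernel_size(3, d)
    print(f"dilation={d}: 3x3 kernel spans {span}x{span} input positions")
```

Stacking such branches and fusing their outputs lets the backbone aggregate both fine local detail (small dilation) and broad context (large dilation) without extra downsampling, which is the mechanism the abstract credits for reducing semantic and spatial detail loss.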
About the journal:
IEEE Transactions on Aerospace and Electronic Systems focuses on the organization, design, development, integration, and operation of complex systems for space, air, ocean, or ground environment. These systems include, but are not limited to, navigation, avionics, spacecraft, aerospace power, radar, sonar, telemetry, defense, transportation, automated testing, and command and control.