Fully Sparse Fusion for 3D Object Detection

Yingyan Li;Lue Fan;Yang Liu;Zehao Huang;Yuntao Chen;Naiyan Wang;Zhaoxiang Zhang
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 11, pp. 7217-7231. Published 2024-04-22. DOI: 10.1109/TPAMI.2024.3392303. https://ieeexplore.ieee.org/document/10506794/

Abstract

Prevalent multi-modal 3D detection methods rely on dense detectors, which usually use dense Bird’s-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic in the detection range, so they do not scale to long-range detection. Recently, LiDAR-only fully sparse architectures have been gaining attention for their high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints of the LiDAR-only fully sparse detector. Our framework achieves state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, under the long-range perception setting, the inference speed of our method is 2.7× faster than that of other state-of-the-art multi-modal 3D detection methods.
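The quadratic scaling mentioned in the abstract can be illustrated with a small sketch. It only counts the cells of a square BEV grid centered on the sensor; the function name `bev_cells` and the 0.5 m cell size are assumptions made for illustration and are not taken from the paper.

```python
def bev_cells(range_m: float, cell_m: float = 0.5) -> int:
    """Count cells in a square BEV grid covering [-range_m, range_m] on both axes.

    cell_m is an assumed grid resolution, chosen only for illustration.
    """
    side = int(round(2 * range_m / cell_m))  # cells along one axis
    return side * side                       # dense grid: side x side cells


# Doubling the detection range quadruples the number of dense BEV cells.
for r in (50.0, 100.0, 200.0):
    print(f"range +/-{r:.0f} m -> {bev_cells(r):,} cells")
```

Under these assumptions, going from a 50 m to a 200 m range multiplies the dense grid size by 16, which is the growth a fully sparse detector is meant to avoid.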