MonoAMNet: Three-Stage Real-Time Monocular 3D Object Detection With Adaptive Methods

IEEE Transactions on Intelligent Transportation Systems (IF 8.4; CAS Zone 1, Engineering & Technology; JCR Q1, ENGINEERING, CIVIL)
Pub Date: 2025-01-16
DOI: 10.1109/TITS.2025.3525772
Huihui Pan, Yisong Jia, Jue Wang, Weichao Sun
IEEE Transactions on Intelligent Transportation Systems, vol. 26, no. 3, pp. 3574-3587. Publication type: Journal Article. Open access: no.
IEEE Xplore: https://ieeexplore.ieee.org/document/10843993/
Citations: 0

Abstract

Monocular 3D object detection finds applications in various fields, notably in intelligent driving, due to its cost-effectiveness and ease of deployment. However, its accuracy significantly lags behind LiDAR-based methods, primarily because the monocular depth estimation problem is inherently challenging. While some methods leverage additional information to aid in network training and enhance performance, they are hindered by their reliance on specific datasets. We contend that many components of monocular 3D object detection lack the necessary adaptability, impeding the performance of the detector. In this paper, we propose six adaptive methods addressing issues related to network structure, loss function, and optimizer. These methods specifically target the rigid components within the detector that hinder adaptability. Simultaneously, we provide theoretical insights into the network output and propose two novel regression methods. These methods facilitate more straightforward learning for the network. Importantly, our approach does not depend on supplementary information, allowing for end-to-end training. In comparison with existing methods, our proposed approach demonstrates competitive speed and accuracy. On the KITTI dataset, our method achieves 17.72% AP3D (IoU = 0.7, Car, Moderate), outperforming all previous monocular methods. Additionally, our approach prioritizes speed, achieving a runtime of up to 52 FPS on an RTX 2080Ti GPU, surpassing all previous monocular methods. The source code is available at: https://github.com/jiayisong/AMNet.
Source journal
IEEE Transactions on Intelligent Transportation Systems (Engineering & Technology; Engineering, Electrical & Electronic)
CiteScore: 14.80
Self-citation rate: 12.90%
Articles published: 1872
Review time: 7.5 months
Journal scope: The journal covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation, and coordination of ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internally and externally.