Self-supervised 3D Object Detection from Monocular Pseudo-LiDAR

Curie Kim, Ue-Hwan Kim, Jong-Hwan Kim
2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), published 2022-09-20
DOI: 10.1109/MFI55806.2022.9913846 (https://doi.org/10.1109/MFI55806.2022.9913846)
Citations: 1

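Abstract

Previous work has detected 3D objects by fusing stereo camera images with LiDAR sensor data, or by using LiDAR for pre-training and only monocular images for testing. Fewer attempts have relied on monocular image sequences alone, because accuracy is low. In addition, depth predicted from monocular images alone is scale-inconsistent, which makes researchers reluctant to use monocular images by themselves. We therefore propose a method that predicts absolute depth and detects 3D objects using only monocular image sequences, by training the detection network and the depth prediction network end to end. The proposed method outperforms other existing methods on the KITTI 3D dataset. Even when monocular images and 3D LiDAR are used together during training to improve performance, our method performs best among methods that use the same input. End-to-end learning not only improves depth prediction but also enables absolute depth prediction, because the network exploits the fact that 3D objects such as cars have an approximately known size.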
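The abstract does not spell out how a depth map becomes pseudo-LiDAR or why object size can fix the depth scale. The sketch below illustrates only the general geometry and is not the authors' implementation: it back-projects a depth map into camera-frame 3D points with a pinhole model, then shows that multiplying depth by a factor multiplies every object extent by the same factor, so an assumed true object width determines the scale. The function names, the intrinsics `fx`, `fy`, `cx`, `cy`, the toy depth map, and `assumed_true_width` are all hypothetical values chosen for illustration, and the code is kept dependency-free on purpose.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project a dense depth map into camera-frame 3D points (pinhole model).

    depth is a list of rows; depth[v][u] is the depth at pixel (u, v).
    Returns a list of (x, y, z) tuples, i.e. a pseudo-LiDAR point cloud.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def extent_along_x(points):
    """Width of the point set along the camera x axis."""
    xs = [p[0] for p in points]
    return max(xs) - min(xs)


# Toy 2x4 depth map in arbitrary (relative) units and hypothetical intrinsics.
depth = [[2.0, 2.0, 2.0, 2.0],
         [2.0, 2.0, 2.0, 2.0]]
fx = fy = 2.0
cx, cy = 1.5, 0.5

points = backproject_depth(depth, fx, fy, cx, cy)
width_relative = extent_along_x(points)

# Assumption for illustration: the object's true width is known to be 6.0 m.
# Because x and y are linear in z, one ratio recovers the depth scale.
assumed_true_width = 6.0
scale = assumed_true_width / width_relative

scaled_depth = [[z * scale for z in row] for row in depth]
width_metric = extent_along_x(backproject_depth(scaled_depth, fx, fy, cx, cy))
```

In the paper the scale is learned end to end rather than computed from a single ratio; the sketch only shows why an approximately known object size is enough to constrain it.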