Monocular Pose and Shape Reconstruction of Vehicles in UAV Imagery Using a Multi-task CNN

PFG-Journal of Photogrammetry Remote Sensing and Geoinformation Science | IF 2.1 | Zone 4 (Earth Sciences) | JCR Q3 (Imaging Science & Photographic Technology) | Pub Date: 2024-09-16 | DOI: 10.1007/s41064-024-00311-0
S. El Amrani Abouelassad, M. Mehltretter, F. Rottensteiner
Citations: 0


Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN

Estimating the pose and shape of vehicles from aerial images is an important, yet challenging task. While there are many existing approaches that use stereo images from street-level perspectives to reconstruct objects in 3D, the majority of aerial configurations used for purposes like traffic surveillance are limited to monocular images. Addressing this challenge, a Convolutional Neural Network-based method is presented in this paper, which jointly performs detection, pose, type and 3D shape estimation for vehicles observed in monocular UAV imagery. For this purpose, a robust 3D object model is used following the concept of an Active Shape Model. In addition, different variants of loss functions for learning 3D shape estimation are presented, focusing on the height component, which is particularly challenging to estimate from monocular near-nadir images. We also introduce a UAV-based dataset to evaluate our model in addition to an augmented version of the publicly available Hessigheim benchmark dataset. Our method yields promising results in pose and shape estimation: utilising images with a ground sampling distance (GSD) of 3 cm, it achieves median errors of up to 4 cm in position and 3° in orientation. Additionally, it achieves root mean square (RMS) errors of \(\pm 6\) cm in planimetry and \(\pm 18\) cm in height for keypoints defining the car shape.
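
The abstract names an Active Shape Model and reports errors separately for planimetry and height, but it does not give the underlying equations. The following is a minimal sketch, assuming the common Active Shape Model form (a mean shape plus a weighted sum of deformation modes), a pose made of a 3D position and a heading about the vertical axis, and a planimetric RMS computed over the combined X/Y offset. All function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def asm_keypoints(mean_shape, modes, coeffs):
    """Deform a mean shape: k_i = mean_i + sum_j coeffs[j] * modes[j][i].

    mean_shape: list of (x, y, z) keypoints in the vehicle frame (metres).
    modes: list of deformation modes, each a list of (dx, dy, dz) per keypoint.
    coeffs: one weight per mode (assumed form, not the paper's exact model).
    """
    return [
        tuple(
            m[c] + sum(g * mode[i][c] for g, mode in zip(coeffs, modes))
            for c in range(3)
        )
        for i, m in enumerate(mean_shape)
    ]

def apply_pose(keypoints, position, heading_rad):
    """Rotate keypoints about the vertical (Z) axis, then translate them."""
    cos_h, sin_h = math.cos(heading_rad), math.sin(heading_rad)
    tx, ty, tz = position
    return [
        (cos_h * x - sin_h * y + tx, sin_h * x + cos_h * y + ty, z + tz)
        for x, y, z in keypoints
    ]

def rms_errors(predicted, reference):
    """Return (planimetric RMS, height RMS) over corresponding keypoints."""
    n = len(predicted)
    plan_sq = sum(
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py, _), (rx, ry, _) in zip(predicted, reference)
    )
    height_sq = sum(
        (pz - rz) ** 2 for (_, _, pz), (_, _, rz) in zip(predicted, reference)
    )
    return math.sqrt(plan_sq / n), math.sqrt(height_sq / n)

def error_in_pixels(error_m, gsd_m):
    """Express a metric error in image pixels for a given ground sampling distance."""
    return error_m / gsd_m

if __name__ == "__main__":
    # Toy shape: two ground-level points and two roof points (hypothetical values).
    mean_shape = [(2.0, 1.0, 0.0), (2.0, -1.0, 0.0), (-2.0, 1.0, 1.5), (-2.0, -1.0, 1.5)]
    # A single toy mode that only raises the roof points.
    height_mode = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
    shape = asm_keypoints(mean_shape, [height_mode], [0.2])
    world = apply_pose(shape, position=(10.0, 20.0, 0.0), heading_rad=math.pi / 2)
    print(world)
    # A 4 cm position error at 3 cm GSD, expressed in pixels.
    print(error_in_pixels(0.04, 0.03))
```

Keeping the two RMS values separate mirrors how the abstract reports results: in a near-nadir view, horizontal offsets map almost directly to pixel offsets, whereas height is only weakly observable, which is consistent with the larger height error quoted above.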

Source journal
CiteScore: 8.20
Self-citation rate: 2.40%
Articles published: 38
Journal description: PFG is an international scholarly journal covering the progress and application of photogrammetric methods, remote sensing technology and the interconnected field of geoinformation science. It places special editorial emphasis on the communication of new methodologies in data acquisition and new approaches to the optimized processing and interpretation of all types of data acquired by photogrammetric methods, remote sensing and image processing, and on the computer-aided interpretation of such data in general. The journal hence addresses researchers and students of these disciplines at academic institutions and universities, as well as downstream users in both the private sector and public administration. Founded in 1926 under the former name Bildmessung und Luftbildwesen, PFG is the world's oldest journal on photogrammetry. It is the official journal of the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF).
Latest articles in this journal
Self-Supervised 3D Semantic Occupancy Prediction from Multi-View 2D Surround Images
Characterization of transient movements within the Joshimath hillslope complex: Results from multi-sensor InSAR observations
Monocular Pose and Shape Reconstruction of Vehicles in UAV imagery using a Multi-task CNN
Assessing the Impact of Data-resolution On Ocean Frontal Characteristics
Challenges and Opportunities of Sentinel-1 InSAR for Transport Infrastructure Monitoring