On the design of deep learning-based control algorithms for visually guided UAVs engaged in power tower inspection tasks.

Frontiers in Robotics and AI (IF 2.9, Q2 Robotics) · Published: 2024-04-26 · eCollection: 2024-01-01 · DOI: 10.3389/frobt.2024.1378149
Guillaume Maitre, Dimitri Martinot, Elio Tuci
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082499/pdf/
Citations: 0

Abstract

This paper focuses on the design of convolutional neural networks to visually guide an autonomous unmanned aerial vehicle (UAV) required to inspect power towers. The network must precisely segment images taken by a camera mounted on the UAV so that a motion module can generate collision-free, inspection-relevant manoeuvres along different types of towers. The image-segmentation process is particularly challenging, not only because of the towers' varied structures but also because of the enormous variability of the background, which can range from the uniform blue of the sky to the multi-colour complexity of a rural, forest, or urban area. To train networks robust enough to handle this task variability without incurring a labour-intensive and costly annotation process for physical-world images, we carried out a comparative study evaluating the performance of networks trained with synthetic images (the synthetic dataset), physical-world images (the physical-world dataset), or a combination of the two (the hybrid dataset). The network used is an attention-based U-Net. The synthetic images are created using photogrammetry, to accurately model power towers, and simulated environments modelling a UAV inspecting different power towers in different settings. Our findings reveal that the network trained on the hybrid dataset outperforms the networks trained on the synthetic and physical-world image datasets. Most notably, the network trained on the hybrid dataset demonstrates superior performance on multiple evaluation metrics related to the image-segmentation task. This suggests that the combination of synthetic and physical-world images represents the best trade-off between minimising the costs of capturing and annotating physical-world images and maximising task performance. Moreover, the results of our study demonstrate the potential of photogrammetry for creating effective training datasets to design networks that automate the precise movement of visually guided UAVs.
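The abstract reports that the hybrid-trained network wins on "multiple evaluation metrics related to the image-segmentation task" without naming them here; Intersection over Union (IoU, Jaccard index) and the Dice coefficient are the standard choices for this comparison. A minimal sketch of both, assuming binary (tower vs. background) masks flattened to 0/1 label sequences; the function names are illustrative, not taken from the paper:

```python
def iou(pred, target):
    """Intersection over Union (Jaccard index) for two binary masks.

    pred, target: equal-length iterables of 0/1 pixel labels.
    Returns 1.0 when both masks are empty (perfect agreement on nothing).
    """
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0


def dice(pred, target):
    """Dice coefficient (F1 over pixels) for two binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 2 * inter / total if total else 1.0


# Example: a 2x2 mask pair, flattened row-major.
pred = [1, 1, 0, 0]     # network output
target = [1, 0, 0, 0]   # ground-truth annotation
print(iou(pred, target))   # 1 overlapping pixel / 2 in the union
print(dice(pred, target))  # 2*1 / (2 + 1) pixels
```

Dice weights the overlap more heavily than IoU (Dice = 2·IoU / (1 + IoU)), so it is less punishing on small foreground regions such as thin tower lattices; reporting both is common practice in segmentation benchmarks.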

Source journal: Frontiers in Robotics and AI
CiteScore: 6.50 · Self-citation rate: 5.90% · Annual output: 355 articles · Review time: 14 weeks
About the journal: Frontiers in Robotics and AI publishes rigorously peer-reviewed research covering all theory and applications of robotics, technology, and artificial intelligence, from biomedical to space robotics.