Reinforcement learning for shared autonomy drone landings

IF 3.7 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Autonomous Robots Pub Date : 2023-10-21 DOI:10.1007/s10514-023-10143-3
Kal Backman, Dana Kulić, Hoam Chung
{"title":"Reinforcement learning for shared autonomy drone landings","authors":"Kal Backman,&nbsp;Dana Kulić,&nbsp;Hoam Chung","doi":"10.1007/s10514-023-10143-3","DOIUrl":null,"url":null,"abstract":"<div><p>Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach is comprised of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot’s intent and to provide control inputs that augment the user’s input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot’s tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (<span>\\(n=28\\)</span>) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4 to 98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.\n</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2023-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10143-3.pdf","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Robots","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10514-023-10143-3","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 2

Abstract

Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach is comprised of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot’s intent and to provide control inputs that augment the user’s input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot’s tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (\(n=28\)) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4 to 98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
共享自主无人机着陆的强化学习
由于复杂的无人机动力学,深度感知的挑战,缺乏控制界面的专业知识以及来自地面效应的额外干扰,新手飞行员发现很难操作和降落无人机(UAV)。因此,我们提出了一种共享自主方法来帮助飞行员在深度感知困难和安全着陆区域有限的情况下安全着陆无人机。我们的方法由两个模块组成:一个感知模块,使用两个RGB-D相机将信息编码到压缩的潜在表示中;一个策略模块,使用强化学习算法TD3进行训练,以识别飞行员的意图,并提供控制输入,增加用户的输入以安全降落无人机。策略模块在模拟中使用一组模拟用户进行训练。模拟用户从一个参数模型中抽样,该模型有四个参数,分别模拟飞行员的服从倾向、熟练程度、侵略性和速度。我们进行了一项用户研究(\(n=28\)),其中人类参与者的任务是在具有挑战性的观看条件下将实体无人机降落在几个平台之一上。仅使用模拟用户数据进行训练的助手将任务成功率从51.4提高到98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Autonomous Robots
Autonomous Robots 工程技术-机器人学
CiteScore
7.90
自引率
5.70%
发文量
46
审稿时长
3 months
期刊介绍: Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. Coverage includes: control of autonomous robots · real-time vision · autonomous wheeled and tracked vehicles · legged vehicles · computational architectures for autonomous systems · distributed architectures for learning, control and adaptation · studies of autonomous robot systems · sensor fusion · theory of autonomous systems · terrain mapping and recognition · self-calibration and self-repair for robots · self-reproducing intelligent structures · genetic algorithms as models for robot development. The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.
期刊最新文献
Optimal policies for autonomous navigation in strong currents using fast marching trees A concurrent learning approach to monocular vision range regulation of leader/follower systems Correction: Planning under uncertainty for safe robot exploration using gaussian process prediction Dynamic event-triggered integrated task and motion planning for process-aware source seeking Continuous planning for inertial-aided systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1