Reinforcement learning for shared autonomy drone landings

IF 3.7 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Autonomous Robots Pub Date : 2023-10-21 DOI:10.1007/s10514-023-10143-3

Kal Backman, Dana Kulić, Hoam Chung

{"title":"Reinforcement learning for shared autonomy drone landings","authors":"Kal Backman, Dana Kulić, Hoam Chung","doi":"10.1007/s10514-023-10143-3","DOIUrl":null,"url":null,"abstract":"<div><p>Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach is comprised of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot’s intent and to provide control inputs that augment the user’s input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot’s tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (<span>\\(n=28\\)</span>) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4 to 98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.\n</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1419 - 1438"},"PeriodicalIF":3.7000,"publicationDate":"2023-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10143-3.pdf","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Robots","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10514-023-10143-3","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 2

Abstract

Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach is comprised of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot’s intent and to provide control inputs that augment the user’s input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot’s tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (\(n=28\)) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4 to 98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

共享自主无人机着陆的强化学习

由于复杂的无人机动力学，深度感知的挑战，缺乏控制界面的专业知识以及来自地面效应的额外干扰，新手飞行员发现很难操作和降落无人机(UAV)。因此，我们提出了一种共享自主方法来帮助飞行员在深度感知困难和安全着陆区域有限的情况下安全着陆无人机。我们的方法由两个模块组成:一个感知模块，使用两个RGB-D相机将信息编码到压缩的潜在表示中;一个策略模块，使用强化学习算法TD3进行训练，以识别飞行员的意图，并提供控制输入，增加用户的输入以安全降落无人机。策略模块在模拟中使用一组模拟用户进行训练。模拟用户从一个参数模型中抽样，该模型有四个参数，分别模拟飞行员的服从倾向、熟练程度、侵略性和速度。我们进行了一项用户研究(\(n=28\))，其中人类参与者的任务是在具有挑战性的观看条件下将实体无人机降落在几个平台之一上。仅使用模拟用户数据进行训练的助手将任务成功率从51.4提高到98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Autonomous Robots 工程技术-机器人学

CiteScore

7.90

自引率

5.70%

发文量

审稿时长

3 months

期刊介绍： Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. Coverage includes: control of autonomous robots · real-time vision · autonomous wheeled and tracked vehicles · legged vehicles · computational architectures for autonomous systems · distributed architectures for learning, control and adaptation · studies of autonomous robot systems · sensor fusion · theory of autonomous systems · terrain mapping and recognition · self-calibration and self-repair for robots · self-reproducing intelligent structures · genetic algorithms as models for robot development. The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.