Trajectory Design for Unmanned Aerial Vehicles via Meta-Reinforcement Learning

Ziyang Lu, Xueyuan Wang, M. C. Gursoy
IEEE INFOCOM 2023 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)
Published: 2023-05-20
DOI: 10.1109/INFOCOMWKSHPS57453.2023.10226090
https://doi.org/10.1109/INFOCOMWKSHPS57453.2023.10226090
Citations: 2

Abstract

This paper addresses trajectory design for unmanned aerial vehicles (UAVs) via meta-reinforcement learning. The UAV is assumed to be able to move in different directions to explore a specific area and collect data from the ground nodes (GNs) located there. Its goal is to reach the destination and maximize the total data collected along the trajectory while avoiding collisions with other UAVs. In the literature on UAV trajectory design, vanilla learning algorithms are typically used to train a task-specific model that provides near-optimal solutions for one specific spatial distribution of the GNs. However, this approach requires retraining from scratch whenever the locations of the GNs change. In this work, we propose a meta-reinforcement learning framework that incorporates Model-Agnostic Meta-Learning (MAML). Instead of training task-specific models, we train a common initialization for different GN distributions and different channel conditions. Starting from this initialization, only a few gradient descent steps are required to adapt to a new task with its own GN distribution and channel conditions. We also explore the conditions under which the proposed MAML framework is preferred and can outperform the compared algorithms.
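The idea of training a common initialization and then adapting with a few gradient descent steps can be illustrated with a minimal sketch. This is not the authors' implementation: the reinforcement learning objective is replaced by a toy quadratic loss, each "task" is just a hypothetical 2-D target point standing in for one GN distribution, and the outer update uses the first-order approximation of MAML rather than the full second-order gradient. All function names and hyperparameters below are assumptions chosen for illustration.

```python
def loss(theta, target):
    """Toy per-task loss: half the squared distance to the task's target."""
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, target))


def grad(theta, target):
    """Gradient of the toy loss with respect to theta."""
    return [t - c for t, c in zip(theta, target)]


def adapt(theta, target, inner_lr=0.1, steps=3):
    """Inner loop: a few gradient descent steps on a single task."""
    theta = list(theta)
    for _ in range(steps):
        g = grad(theta, target)
        theta = [t - inner_lr * gi for t, gi in zip(theta, g)]
    return theta


def meta_train(tasks, inner_lr=0.1, outer_lr=0.05, inner_steps=3,
               iterations=500):
    """Outer loop: learn an initialization shared by all tasks.

    For simplicity every task is visited in each iteration instead of
    sampling a batch of tasks.
    """
    theta = [0.0] * len(tasks[0])
    for _ in range(iterations):
        meta_grad = [0.0] * len(theta)
        for target in tasks:
            adapted = adapt(theta, target, inner_lr, inner_steps)
            # First-order approximation: use the gradient evaluated at the
            # adapted parameters and ignore second-order terms.
            g = grad(adapted, target)
            meta_grad = [m + gi / len(tasks) for m, gi in zip(meta_grad, g)]
        theta = [t - outer_lr * m for t, m in zip(theta, meta_grad)]
    return theta


# Hypothetical training tasks and one unseen task.
train_tasks = [(4.0, 4.0), (6.0, 4.0), (4.0, 6.0), (6.0, 6.0)]
new_task = (5.5, 4.5)

theta_meta = meta_train(train_tasks)
loss_meta = loss(adapt(theta_meta, new_task), new_task)
loss_scratch = loss(adapt([0.0, 0.0], new_task), new_task)
print("loss after adaptation from meta-initialization:", loss_meta)
print("loss after adaptation from zero initialization:", loss_scratch)
```

In this toy setting the learned initialization settles near the center of the training targets, so the same three inner steps leave a much smaller loss on the unseen task than they do when starting from zeros. A real meta-reinforcement learning setup would replace `loss` and `grad` with policy-gradient estimates collected from UAV trajectories.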