Trajectory Design for Unmanned Aerial Vehicles via Meta-Reinforcement Learning

IEEE INFOCOM 2023 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) | Pub Date: 2023-05-20 | DOI: 10.1109/INFOCOMWKSHPS57453.2023.10226090

Ziyang Lu, Xueyuan Wang, M. C. Gursoy


Citations: 2

Abstract

This paper considers the trajectory design problem for unmanned aerial vehicles (UAVs) via meta-reinforcement learning. The UAV is assumed to move in different directions to explore a given area and collect data from the ground nodes (GNs) located in it. The UAV's goal is to reach its destination while maximizing the total data collected along the trajectory and avoiding collisions with other UAVs. In the literature on UAV trajectory design, vanilla learning algorithms are typically used to train a task-specific model that provides near-optimal solutions for one specific spatial distribution of the GNs. However, this approach requires retraining from scratch whenever the GN locations change. In this work, we propose a meta-reinforcement learning framework that incorporates Model-Agnostic Meta-Learning (MAML). Instead of training task-specific models, we train a common initialization across different GN distributions and channel conditions. Starting from this initialization, only a few gradient descent steps are required to adapt to a new task with a different GN distribution and channel conditions. We also explore when the proposed MAML framework is preferred and can outperform the compared algorithms.
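The idea of training a common initialization and then adapting with a few gradient steps can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the reinforcement learning objective with a toy quadratic loss, and each "task" is a hypothetical 2-D target point standing in for one GN distribution. The names `loss`, `grad`, `adapt`, `maml_train`, and the step sizes `alpha` and `beta` are assumptions made for illustration only.

```python
import numpy as np


def loss(theta, target):
    # Toy task loss (assumption): squared distance to the task's target point.
    return 0.5 * float(np.sum((theta - target) ** 2))


def grad(theta, target):
    # Gradient of the toy loss with respect to theta.
    return theta - target


def adapt(theta, target, alpha=0.1, steps=1):
    # Inner loop: a few plain gradient descent steps on a single task.
    for _ in range(steps):
        theta = theta - alpha * grad(theta, target)
    return theta


def maml_train(targets, alpha=0.1, beta=0.5, iterations=200):
    # Outer loop: learn one initialization shared by all tasks.
    theta = np.zeros(targets.shape[1])
    for _ in range(iterations):
        meta_grad = np.zeros_like(theta)
        for target in targets:
            adapted = adapt(theta, target, alpha, steps=1)
            # Differentiate the post-adaptation loss through the inner step.
            # For this quadratic loss, d(adapted)/d(theta) = (1 - alpha) * I,
            # so the exact MAML gradient has a closed form.
            meta_grad += (1.0 - alpha) * grad(adapted, target)
        theta = theta - beta * meta_grad / len(targets)
    return theta


# Four hypothetical tasks placed at the corners of a square.
targets = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
theta0 = maml_train(targets)

# Adapt the shared initialization to one task with three gradient steps.
new_task = targets[1]
loss_before = loss(theta0, new_task)
loss_after = loss(adapt(theta0, new_task, steps=3), new_task)
```

In this toy setting the learned initialization settles at the point that is equally easy to adapt from for every task, and a few inner steps lower the loss on any single task. In the paper's setting the inner and outer updates would instead act on policy parameters with a reinforcement learning loss, which this sketch does not attempt to reproduce.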


