Efficient deep reinforcement learning under task variations via knowledge transfer for drone control

IF 4.1 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS · ICT Express · Pub Date: 2024-06-01 · DOI: 10.1016/j.icte.2024.04.002
Sooyoung Jang, Hyung-Il Kim
{"title":"通过知识转移实现无人机控制任务变化下的高效深度强化学习","authors":"Sooyoung Jang,&nbsp;Hyung-Il Kim","doi":"10.1016/j.icte.2024.04.002","DOIUrl":null,"url":null,"abstract":"<div><p>Despite the growing interest in using deep reinforcement learning (DRL) for drone control, several challenges remain to be addressed, including issues with generalization across task variations and agent training (which requires significant computational power and time). When the agent’s input changes owing to the drone’s sensors or mission variations, significant retraining overhead is required to handle the changes in the input data pattern and the neural network architecture to accommodate the input data. These difficulties severely limit their applicability in dynamic real-world environments. In this paper, we propose an efficient DRL method that leverages the knowledge of the source agent to accelerate the training of the target agent under task variations. The proposed method consists of three phases: collecting training data for the target agent using the source agent, supervised pre-training of the target agent, and DRL-based fine-tuning. Experimental validation demonstrated a remarkable reduction in the training time (up to 94.29%), suggesting a potential avenue for the successful and efficient application of DRL in drone control.</p></div>","PeriodicalId":48526,"journal":{"name":"ICT Express","volume":null,"pages":null},"PeriodicalIF":4.1000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S240595952400033X/pdfft?md5=7d370e1bd566b1fe70dbc9a76bf4c077&pid=1-s2.0-S240595952400033X-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Efficient deep reinforcement learning under task variations via knowledge transfer for drone control\",\"authors\":\"Sooyoung Jang,&nbsp;Hyung-Il Kim\",\"doi\":\"10.1016/j.icte.2024.04.002\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Despite the growing interest in using deep reinforcement learning (DRL) for drone control, several challenges remain to be addressed, including issues with generalization across task variations and agent training (which requires significant computational power and time). When the agent’s input changes owing to the drone’s sensors or mission variations, significant retraining overhead is required to handle the changes in the input data pattern and the neural network architecture to accommodate the input data. These difficulties severely limit their applicability in dynamic real-world environments. In this paper, we propose an efficient DRL method that leverages the knowledge of the source agent to accelerate the training of the target agent under task variations. The proposed method consists of three phases: collecting training data for the target agent using the source agent, supervised pre-training of the target agent, and DRL-based fine-tuning. 
Experimental validation demonstrated a remarkable reduction in the training time (up to 94.29%), suggesting a potential avenue for the successful and efficient application of DRL in drone control.</p></div>\",\"PeriodicalId\":48526,\"journal\":{\"name\":\"ICT Express\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2024-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S240595952400033X/pdfft?md5=7d370e1bd566b1fe70dbc9a76bf4c077&pid=1-s2.0-S240595952400033X-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ICT Express\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S240595952400033X\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICT Express","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S240595952400033X","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Despite the growing interest in using deep reinforcement learning (DRL) for drone control, several challenges remain to be addressed, including issues with generalization across task variations and agent training (which requires significant computational power and time). When the agent's input changes owing to the drone's sensors or mission variations, significant retraining overhead is required to handle the changes in the input data pattern and the neural network architecture to accommodate the input data. These difficulties severely limit their applicability in dynamic real-world environments. In this paper, we propose an efficient DRL method that leverages the knowledge of the source agent to accelerate the training of the target agent under task variations. The proposed method consists of three phases: collecting training data for the target agent using the source agent, supervised pre-training of the target agent, and DRL-based fine-tuning. Experimental validation demonstrated a remarkable reduction in the training time (up to 94.29%), suggesting a potential avenue for the successful and efficient application of DRL in drone control.
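To make the three-phase pipeline concrete, the following is a minimal sketch, not the authors' implementation: it assumes a discrete-action, gym-style environment, a PyTorch policy network, and hypothetical helpers (a trained source_agent with an act method, and a to_target_obs mapping from source observations to the target agent's input layout).

```python
# Minimal sketch (not the authors' code) of the three-phase pipeline described above.
# SourceAgent, to_target_obs, and the environment interface are assumed placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TargetPolicy(nn.Module):
    """Policy for the target task, whose input layout differs from the source agent's."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)  # action logits


# Phase 1: roll out the trained source agent and record (target_obs, source_action) pairs.
def collect_dataset(env, source_agent, to_target_obs, episodes: int = 50):
    """to_target_obs maps a source observation to the target agent's input layout."""
    dataset = []
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            action = source_agent.act(obs)                # teacher action
            dataset.append((to_target_obs(obs), action))  # training pair for the target
            obs, _reward, done, _info = env.step(action)
    return dataset


# Phase 2: supervised pre-training of the target policy on the collected pairs
# (behavior cloning of the source agent's actions).
def pretrain(policy: TargetPolicy, dataset, epochs: int = 10, lr: float = 1e-3):
    optim = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for obs, action in dataset:
            logits = policy(torch.as_tensor(obs, dtype=torch.float32))
            loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([action]))
            optim.zero_grad()
            loss.backward()
            optim.step()
    return policy


# Phase 3: DRL-based fine-tuning. A standard algorithm (e.g., PPO) is initialised
# with the pre-trained weights rather than a random network, which is where the
# reported reduction in training time would come from.
```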

Source journal: ICT Express
CiteScore: 10.20
Self-citation rate: 1.90%
Articles published: 167
Review time: 35 weeks
Journal introduction: The ICT Express journal, published by the Korean Institute of Communications and Information Sciences (KICS), is an international, peer-reviewed research publication covering all aspects of information and communication technology. The journal aims to publish research that advances the theoretical and practical understanding of ICT convergence, platform technologies, communication networks, and device technologies. Advances in the ICT sector enable portable devices to remain always connected while supporting high data rates, which has driven the recent popularity of smartphones and their considerable impact on economic and social development.