A Deep Reinforcement Learning Method for Collision Avoidance with Dense Speed-Constrained Multi-UAV

IF 4.6 | Zone 2 (Computer Science) | JCR Q2 (Robotics) | IEEE Robotics and Automation Letters | Pub Date: 2025-01-08 | DOI: 10.1109/LRA.2025.3527292
Jiale Han; Yi Zhu; Jian Yang
IEEE Robotics and Automation Letters, vol. 10, no. 3, pp. 2152-2159. Journal Article. https://ieeexplore.ieee.org/document/10833826/
Citations: 0

Abstract

This letter introduces a novel deep reinforcement learning (DRL) method for collision avoidance among fixed-wing unmanned aerial vehicles (UAVs). First, based on the characteristics of the collision avoidance problem, a collision prediction method is proposed to identify neighboring UAVs that pose a significant threat, and a convolutional neural network model is devised to extract features of the dynamic environment. Second, a trajectory-tracking macro action is incorporated into the action space of the proposed DRL-based algorithm. Guided by a reward function that rewards approaching the preset flight path, UAVs can return to that path after completing collision avoidance. The proposed method is trained in simulation scenarios, with model updates implemented using the soft actor-critic (SAC) algorithm. Validation experiments are conducted in several complex multi-UAV flight environments, and the results demonstrate the superiority of the method over other advanced methods.
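
The abstract does not say how the collision prediction step decides which neighbors are a significant threat. As a rough illustration only, the sketch below uses a generic closest-point-of-approach test under a constant-velocity assumption, which may differ from the authors' method. The names `UAV`, `closest_approach`, and `threatening_neighbors`, the 2D state, and the `safe_dist` and `horizon` values are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UAV:
    """Hypothetical planar UAV state (not taken from the paper)."""
    x: float   # position east, metres
    y: float   # position north, metres
    vx: float  # velocity east, m/s
    vy: float  # velocity north, m/s


def closest_approach(own: UAV, other: UAV) -> Tuple[float, float]:
    """Return (time, distance) of closest approach, assuming both UAVs
    keep their current velocity. Time is clamped to be non-negative."""
    rx, ry = other.x - own.x, other.y - own.y      # relative position
    vx, vy = other.vx - own.vx, other.vy - own.vy  # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # Identical velocities: the separation never changes.
        t = 0.0
    else:
        t = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t, math.hypot(rx + vx * t, ry + vy * t)


def threatening_neighbors(own: UAV, neighbors: List[UAV],
                          safe_dist: float = 50.0,
                          horizon: float = 20.0) -> List[UAV]:
    """Keep neighbors predicted to come within safe_dist (metres)
    no later than horizon (seconds) from now. Both thresholds are
    assumed values chosen for illustration."""
    threats = []
    for n in neighbors:
        t, d = closest_approach(own, n)
        if t <= horizon and d < safe_dist:
            threats.append(n)
    return threats
```

In a pipeline like the one the abstract describes, a filter of this kind would run before feature extraction, so that only the retained neighbors contribute to the observation passed to the network.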
Source journal
IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)
CiteScore: 9.60
Self-citation rate: 15.40%
Articles published: 1428
About the journal: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.
Latest articles from this journal
RA-RRTV*: Risk-Averse RRT* With Local Vine Expansion for Path Planning in Narrow Passages Under Localization Uncertainty
Controlling Pneumatic Bending Actuator With Gain-Scheduled Feedforward and Physical Reservoir Computing State Estimation
Funabot-Sleeve: A Wearable Device Employing McKibben Artificial Muscles for Haptic Sensation in the Forearm
3D Guidance Law for Flexible Target Enclosing With Inherent Safety
Learning Agile Swimming: An End-to-End Approach Without CPGs