Obstacle avoidance USV in multi-static obstacle environments based on a deep reinforcement learning approach

IF 1.3 · CAS Region 4 (Computer Science) · Q4 (Automation & Control Systems) · Measurement & Control · Pub Date: 2023-10-12 · DOI: 10.1177/00202940231195937
Dengyao Jiang, Mingzhe Yuan, Junfeng Xiong, Jinchao Xiao, Yong Duan
Citations: 0

Abstract

Unmanned surface vehicles (USVs) are intelligent platforms for autonomous surface navigation that integrate artificial intelligence, motion control, environmental awareness, and related technologies. Obstacle avoidance is an essential part of their autonomous navigation. Because the USV operates in water environments (e.g. monitoring and tracking, or search-and-rescue scenarios), its dynamic and complex operating conditions render traditional methods unsuitable for the obstacle-avoidance problem. In this paper, to address the poor convergence of the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm of Deep Reinforcement Learning (DRL) in unstructured environments with wave and current interference, a random-walk policy is proposed: transitions generated by this pre-exploration policy are deposited into the experience pool to accelerate convergence and thereby achieve USV obstacle avoidance, enabling collision-free navigation from any start point to a given end point in a dynamic, complex environment without offline trajectory or track-point generation. We design a pre-exploration policy for the environment and a virtual simulation environment for training and testing the algorithm, and we give the reward function and training method. The simulation results show that the proposed algorithm converges more readily than the original algorithm and exhibits better obstacle-avoidance behavior in complex environments, demonstrating its feasibility and effectiveness.
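The core idea of the abstract — seeding TD3's experience pool with transitions from a random-walk pre-exploration policy before training begins — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `ReplayBuffer`, `ToyUSVEnv`, and `random_walk_prefill` are all hypothetical names, and the toy environment stands in for the paper's simulation environment.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience pool, as used by TD3."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

class ToyUSVEnv:
    """Minimal stand-in environment (hypothetical, for illustration):
    2-D position state, 2-D continuous action, episode ends after 50 steps."""
    action_dim = 2

    def reset(self):
        self.t, self.pos = 0, [0.0, 0.0]
        return tuple(self.pos)

    def step(self, action):
        self.t += 1
        self.pos = [p + a for p, a in zip(self.pos, action)]
        reward = -sum(p * p for p in self.pos) ** 0.5  # distance penalty
        return tuple(self.pos), reward, self.t >= 50

def random_walk_prefill(env, buffer, n_steps, step_scale=0.1):
    """Pre-exploration: drive the vehicle with a random walk over the
    action space and deposit every transition into the experience pool
    before TD3 training starts."""
    state = env.reset()
    action = [0.0] * env.action_dim
    for _ in range(n_steps):
        # Perturb the previous action so consecutive actions stay
        # correlated -- a random walk rather than uniform noise.
        action = [max(-1.0, min(1.0, a + random.gauss(0.0, step_scale)))
                  for a in action]
        next_state, reward, done = env.step(action)
        buffer.add(state, action, reward, next_state, done)
        state = env.reset() if done else next_state
    return buffer

buf = random_walk_prefill(ToyUSVEnv(), ReplayBuffer(), n_steps=300)
batch = buf.sample(32)  # TD3 updates would then sample from the warm buffer
```

A random walk produces smoother, more physically plausible trajectories than independent uniform noise, which is one plausible reason a warm-started buffer of such transitions would help TD3 converge in an unstructured environment.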
Source journal: Measurement & Control (Engineering/Instrumentation)
Self-citation rate: 10.00%
Articles per year: 164
Review time: >12 weeks
Journal description: Measurement and Control publishes peer-reviewed practical and technical research and news pieces from both the science and engineering industry and academia. While focusing broadly on topics of relevance for practitioners in instrumentation and control, the journal also includes product and business announcements and information on technical advances.