A Novel Path Planning Approach for Mobile Robot in Radioactive Environment Based on Improved Deep Q Network Algorithm

IF 2.2; Region 3 (multidisciplinary journals); JCR Q2, MULTIDISCIPLINARY SCIENCES; Symmetry-Basel; Pub Date: 2023-11-11; DOI: 10.3390/sym15112048

Zhiqiang Wu, Yebo Yin, Jie Liu, De Zhang, Jie Chen, Wei Jiang


Citations: 0

Abstract

Path planning for robots in nuclear environments means finding a collision-free path under constraints on path length and accumulated radiation dose. To solve this problem, we propose the Improved Dueling Deep Double Q Network (ID3QN) algorithm, which is based on an asymmetric neural network structure. To address the overestimation and low sample utilization of the traditional Deep Q Network (DQN) algorithm, we optimized the neural network structure and used a double network to estimate action values. We also improved the action selection mechanism, adopted a prioritized experience replay mechanism, and redesigned the reward function. To evaluate the proposed algorithm, we designed simple and complex radioactive grid environments and compared ID3QN with traditional algorithms and several deep reinforcement learning algorithms. The simulation results indicate that in the simple radioactive grid environment, ID3QN outperforms traditional algorithms such as A*, GA, and ACO in terms of path length and accumulated radiation dose. Compared with other deep reinforcement learning algorithms, including DQN and some improved DQN variants, ID3QN reduced the path length by 15.6%, decreased the accumulated radiation dose by 23.5%, and converged approximately 2300 episodes faster. In the complex radioactive grid environment, ID3QN also outperformed A*, GA, ACO, and the other deep reinforcement learning algorithms in terms of path length and accumulated radiation dose. Furthermore, ID3QN can plan an obstacle-free optimal path with a low radiation dose even in complex environments. These results demonstrate that ID3QN is an effective approach for solving robot path planning problems in nuclear environments, thereby enhancing the safety and reliability of robots in such environments.
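The abstract names three standard building blocks: a dueling network head, double-network action-value estimation, and prioritized experience replay. The sketch below illustrates the textbook form of each in plain Python. It is not the authors' implementation: the abstract does not specify the asymmetric network structure, the improved action selection, or the redesigned reward function, so none of those are shown. All function names and the default values for `alpha` and `eps` are assumptions made for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Dueling head: Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]


def double_dqn_target(reward: float, done: bool, gamma: float,
                      next_q_online: List[float],
                      next_q_target: List[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. This decoupling is what reduces
    the overestimation of a plain max over one network."""
    if done:
        return reward  # no bootstrap term at a terminal state
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def per_probabilities(td_errors: List[float], alpha: float = 0.6,
                      eps: float = 1e-6) -> List[float]:
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha,
    with priority p_i = |TD error_i| + eps (alpha and eps are assumed values)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, False, 0.5, [0.2, 0.9, 0.1], [4.0, 2.0, 8.0]))
    print(per_probabilities([0.1, 1.0, 2.0]))
```

In the second call, the online network prefers action 1, so the target uses the target network's value for action 1 rather than its own maximum. Transitions with larger temporal-difference error receive a higher sampling probability in the third call, which is how prioritized replay aims to improve sample utilization.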




Source journal

Symmetry-Basel (MULTIDISCIPLINARY SCIENCES)

CiteScore: 5.40

Self-citation rate: 11.10%

Articles published: 2276

Review time: 14.88 days

Journal description: Symmetry (ISSN 2073-8994), an international and interdisciplinary scientific journal, publishes reviews, regular research papers and short notes. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. There is no restriction on the length of the papers. Full experimental and/or methodical details must be provided, so that results can be reproduced.