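Decision Model of Ship Intelligent Collision Avoidance Based on Automatic Information System Data and Generic Adversary Imitation Learning-Deep Deterministic Policy Gradient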

Proceedings of the 2023 7th International Conference on Machine Learning and Soft Computing. Pub Date: 2023-01-05. DOI: 10.1145/3583788.3583790

Jiao Liu, Guoyou Shi, Kaige Zhu, Jiahui Shi, Yuchuang Wang


Citation count: 0

Abstract



Current decision-making models for ship collision avoidance do not account for the International Regulations for Preventing Collisions at Sea (COLREGS) or for ship maneuverability, and they require a great deal of training time. To address these problems, this paper combines the advantages of reinforcement learning and imitation learning and proposes an intelligent ship collision avoidance decision-making model based on Generic Adversary Imitation Learning (GAIL). First, collision avoidance data are extracted from Automatic Information System (AIS) data to serve as expert data. Second, in the generator part, an environment model is built from the Mathematical Model Group (MMG) model and S-57 chart rendering, and the state space, behaviour (action) space, and reward function for reinforcement learning are constructed. The deep deterministic policy gradient (DDPG) algorithm interacts with the environment model to generate ship trajectory data, while the generator continually learns from the expert data. Finally, a discriminator that distinguishes the expert data from the data produced by the generator is constructed and trained. Training is complete when the discriminator can no longer tell the two apart. To verify the model's performance, AIS data from near the South China Sea are processed to extract collision avoidance decision data, and a GAIL-based intelligent ship collision avoidance decision model is built. After the model converges, the final generated data are compared with the expert data. The experimental results indicate that the proposed model can reproduce the expert collision avoidance trajectories and is a practical decision model for ship collision avoidance.
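The adversarial loop summarized in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the linear logistic discriminator, the synthetic two-dimensional (state, action) features, the surrogate reward -log(1 - D) (a form commonly used in the GAIL literature, where GAIL is usually expanded as Generative Adversarial Imitation Learning), and all class and function names. A real system would replace the toy batches with AIS-derived expert pairs and with rollouts from a DDPG policy in the simulated environment.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class LinearDiscriminator:
    """Hypothetical discriminator: D(x) = estimated P(x came from the expert)."""

    def __init__(self, dim: int) -> None:
        self.w = [0.0] * dim
        self.b = 0.0

    def prob_expert(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, expert_batch, generated_batch, lr=0.1):
        # One gradient step on binary cross-entropy: expert -> 1, generated -> 0.
        samples = [(x, 1.0) for x in expert_batch]
        samples += [(x, 0.0) for x in generated_batch]
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, label in samples:
            err = self.prob_expert(x) - label  # derivative of BCE w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        n = len(samples)
        self.w = [wi - lr * g / n for wi, g in zip(self.w, grad_w)]
        self.b -= lr * grad_b / n

    def imitation_reward(self, x):
        # Surrogate reward handed to the generator: large when D believes
        # the pair is expert-like. The clamp avoids log(0).
        return -math.log(max(1.0 - self.prob_expert(x), 1e-8))


def mean_prob(disc, batch):
    # Average D over a batch; a value near 0.5 on generated data would mean
    # the discriminator can no longer tell it from expert data.
    return sum(disc.prob_expert(x) for x in batch) / len(batch)


def toy_pairs(rng, mean, n=200):
    # Purely synthetic stand-ins for (state, action) feature vectors.
    return [[rng.gauss(mean, 0.3), rng.gauss(mean, 0.3)] for _ in range(n)]


rng = random.Random(0)
expert = toy_pairs(rng, 1.0)       # stands in for AIS-derived expert data
generated = toy_pairs(rng, -1.0)   # stands in for an untrained policy's rollouts

disc = LinearDiscriminator(dim=2)
for _ in range(200):
    disc.update(expert, generated)

reward_expert_like = disc.imitation_reward([1.0, 1.0])
reward_unlike = disc.imitation_reward([-1.0, -1.0])
```

In this toy setup the generated batch is fixed, so the discriminator separates the two sets easily and `reward_expert_like` exceeds `reward_unlike`. In a full GAIL loop the policy would be updated against this reward between discriminator steps, and `mean_prob(disc, generated)` drifting toward 0.5 would correspond to the stopping condition the abstract describes.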
