Dynamic path planning for multi-USV in complex ocean environments with limited perception via proximal policy optimization
Authors: Xizhe Chen, Shihong Yin, Yujing Li, Zhengrong Xiang
Journal: Ocean Engineering, volume 326, Article 120907 (published 2025-05-15; Epub 2025-03-11)
Impact factor: 5.5; JCR: Q1 (Engineering, Civil)
DOI: 10.1016/j.oceaneng.2025.120907
URL: https://www.sciencedirect.com/science/article/pii/S0029801825006201
This paper addresses the path planning problem for unmanned surface vehicles (USVs) under distributed control in dynamic maritime environments. A novel proximal policy optimization (PPO)-based algorithm is proposed to overcome the challenges posed by limited sensing capabilities and environmental variability. By integrating the reciprocal velocity obstacle method, the algorithm significantly improves obstacle avoidance efficiency while ensuring compliance with the International Regulations for Preventing Collisions at Sea (COLREGs). To address the sparse reward problem inherent in PPO algorithms, a customized reward mechanism is designed, and a bidirectional gated recurrent unit network is introduced to process variable-length observation data caused by dynamic obstacle scenarios. Extensive simulation results demonstrate that the proposed algorithm achieves notable advantages in convergence, robustness, and real-time decision-making. Furthermore, ablation and extended experiments validate the effectiveness and generalization capability of the algorithm, confirming that the multi-USV system can achieve safe, efficient, and COLREGs-compliant path planning in highly dynamic and complex environments.
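The abstract does not give the loss the authors optimize, so the following is only a minimal sketch of the standard PPO clipped surrogate objective that PPO-based methods generally build on, not the paper's implementation. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the clip caps how much the objective rewards raising an action's probability, so a single update cannot move the policy far from the one that collected the data. With a negative advantage, the `min` keeps the unclipped, more pessimistic term.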
About the journal:
Ocean Engineering provides a medium for the publication of original research and development work in the field of ocean engineering. Ocean Engineering seeks papers on the following topics.