A novel reward-shaping-based soft actor–critic for random trajectory tracking of AUVs
Yue Zhang, Tianze Zhang, Yibin Li, Yinghao Zhuang, Daichao Wang
Ocean Engineering, vol. 322, Article 120505. Published 2025-02-04.
DOI: 10.1016/j.oceaneng.2025.120505
https://www.sciencedirect.com/science/article/pii/S0029801825002203
Journal impact factor: 4.6; JCR: Q1 (Engineering, Civil)
Citation count: 0
Abstract
Current research on trajectory tracking for autonomous underwater vehicles (AUVs) mostly focuses on single trajectories, and research on the generalization of reinforcement learning (RL)-based trajectory tracking is limited. This paper introduces a novel RL controller for three-dimensional random trajectory tracking. Here, a random trajectory includes random obstacles and random reference velocities along the z-axis, and is designed to improve generalization. The controller integrates value network-based reward shaping (VNRS) with soft actor–critic (SAC). Unlike previous work, VNRS uses a multi-layer perceptron to evaluate the state. Simulations demonstrate that VNRS-SAC outperforms SAC in stability and control accuracy. Tests in generalization scenarios, including ocean currents, multiple obstacles, and various trajectories, show that the VNRS-SAC controller has some capacity to generalize. Compared with the classical S-plane controller and model predictive control, the VNRS-SAC controller achieves higher control accuracy.
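The abstract does not give the VNRS formula, so the following is only a minimal sketch of the general idea of shaping a reward with a state-value estimate produced by a multi-layer perceptron. It uses the standard potential-based shaping form, r + γV(s') − V(s), as a stand-in; this is an assumption, not the authors' method. The names `TinyMLP` and `shaped_reward`, the three-dimensional state, and the untrained random weights are all hypothetical and chosen for illustration.

```python
import math
import random


class TinyMLP:
    """Hypothetical one-hidden-layer perceptron mapping a state vector to a scalar value."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        # Untrained random weights; a real controller would learn these.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def __call__(self, state):
        # Hidden layer with tanh activation, then a linear output.
        hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


def shaped_reward(env_reward, state, next_state, value_net, gamma=0.99, done=False):
    """Potential-based shaping (assumed form): r + gamma * V(s') - V(s).

    V(s') is taken as 0 at terminal states.
    """
    next_value = 0.0 if done else value_net(next_state)
    return env_reward + gamma * next_value - value_net(state)


if __name__ == "__main__":
    net = TinyMLP(in_dim=3, hidden_dim=4, seed=1)
    s, s_next = [0.0, 0.0, 0.0], [0.1, 0.2, -0.1]  # made-up tracking-error states
    print(shaped_reward(1.0, s, s_next, net))
```

In an SAC training loop, a shaped reward of this kind would replace the raw environment reward when transitions are stored or sampled. With γ = 1 the shaping terms telescope along a trajectory, so the total return changes only by the difference between the final and initial value estimates.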
About the journal:
Ocean Engineering provides a medium for the publication of original research and development work in the field of ocean engineering. Ocean Engineering seeks papers in the following topics.