Reinforcement Learning-Based Time-Synchronized Optimized Control for Affine Systems

Yuxiang Zhang; Xiaoling Liang; Dongyu Li; Shuzhi Sam Ge; Bingzhao Gao; Hong Chen; Tong Heng Lee

IEEE Transactions on Artificial Intelligence, vol. 5, no. 10, pp. 5216-5231. Published 2024-06-27.
DOI: 10.1109/TAI.2024.3420261 (https://ieeexplore.ieee.org/document/10576055/)
Abstract
The approach of (fixed-) time-synchronized control (FTSC) aims to have all system state variables converge to the origin simultaneously, i.e., synchronously. Such an outcome is an essential performance requirement in many real-world high-precision control applications. Toward this objective, this article proposes and investigates a time-synchronized reinforcement learning (TSRL) algorithm applicable to a particular class of first- and second-order affine nonlinear systems. The approach incorporates the norm-normalized sign function into the optimal control design, leveraging its special properties to attain time-synchronized stability and control. Concurrently, the actor–critic framework of reinforcement learning (RL) is invoked: the dual quantities of the system control and the gradient of the cost function are each decomposed into an appropriate time-synchronized control term and an unknown actor/critic part that is learned independently. By additionally employing the adaptive dynamic programming technique, the solution of the Hamilton–Jacobi–Bellman (HJB) equation is iteratively approximated within this actor–critic framework. The proposed TSRL method thus optimizes the system control while attaining the notable time-synchronized convergence property. The performance and effectiveness of the proposed method are demonstrated through detailed numerical studies and an autonomous-vehicle nonlinear motion control problem.
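To make the norm-normalized sign function concrete: for a vector x it is x/||x||_2 (and 0 at the origin), so a control acting along this direction shrinks all state components in fixed proportion, and they reach zero at the same instant. The sketch below is purely illustrative and not taken from the paper; the single-integrator dynamics x_dot = u, the gain c, and the Euler integration are assumptions made for demonstration.

```python
import numpy as np

def norm_normalized_sign(x, eps=1e-12):
    """Norm-normalized sign of a vector: x / ||x||_2, with 0 at the origin."""
    n = np.linalg.norm(x)
    return x / n if n > eps else np.zeros_like(x)

# Toy single-integrator x_dot = u under u = -c * sign_n(x).
# The control always points along -x, so the state travels down the ray
# through the origin, and every component reaches zero at the same
# instant t* = ||x(0)|| / c: the time-synchronized convergence property.
c, dt = 1.0, 1e-3
x0 = np.array([2.0, -1.0, 0.5])
x, t = x0.copy(), 0.0
while np.linalg.norm(x) > 1e-9:
    step = min(c * dt, np.linalg.norm(x))  # cap the Euler step near the origin
    x = x - step * norm_normalized_sign(x)
    t += dt
print(f"all components converged at t ~ {t:.3f} "
      f"(predicted t* = {np.linalg.norm(x0) / c:.3f})")
```

Running this prints a convergence time matching ||x(0)||/c, and the intermediate states stay proportional to x(0) throughout, which is exactly the simultaneous-arrival behavior an FTSC design targets.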
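The adaptive dynamic programming step can be illustrated in the same spirit. The following is a minimal sketch under strong simplifying assumptions of my own (a scalar linear system x_dot = a*x + b*u, quadratic cost integrand q*x^2 + r*u^2, a one-parameter critic V(x) ≈ w*x^2, and a normalized semi-gradient step on the squared HJB residual); it shows only the generic actor–critic mechanism of iteratively approximating the HJB solution, not the paper's time-synchronized decomposition.

```python
import numpy as np

# Scalar linear system x_dot = a*x + b*u with cost integrand q*x^2 + r*u^2
# (an assumed toy problem, not the paper's affine system). Critic:
# V(x) ~ w*x^2; actor is the HJB-greedy policy u = -(b*w/r)*x.
a, b, q, r = -1.0, 1.0, 1.0, 1.0
w, lr = 0.0, 1e-2
rng = np.random.default_rng(0)
for _ in range(20_000):
    x = rng.uniform(-2.0, 2.0)              # sampled states: excitation
    u = -(b * w / r) * x                    # actor: greedy in current critic
    xdot = a * x + b * u
    # Continuous-time HJB residual: cost rate + dV/dx * x_dot.
    delta = q * x**2 + r * u**2 + (2.0 * w * x) * xdot
    grad = 2.0 * x * xdot                   # semi-gradient d(delta)/dw, u held fixed
    w -= lr * delta * grad / (1.0 + grad**2)  # normalized descent on delta^2 / 2
print(f"learned critic weight w = {w:.4f}  (Riccati root: {np.sqrt(2) - 1:.4f})")
```

For this toy problem the HJB equation reduces to the scalar Riccati equation q + 2*a*w - (b^2/r)*w^2 = 0, so the learned weight can be checked against its positive root, w* = sqrt(2) - 1 for the chosen constants.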