{"title":"Off-policy Q-learning-based Tracking Control for Stochastic Linear Discrete-Time Systems","authors":"X. Liu, Lei Zhang, Yunjian Peng","doi":"10.1109/ICCR55715.2022.10053863","DOIUrl":null,"url":null,"abstract":"In this paper, an adaptive optimal control is investigated for a stochastic linear discrete-time system with multiplicative state-dependent noise and control-dependent noise without knowledge of the system dynamics. With the framework of Q-learning, an off-policy state feedback solution for stochastic linear quadratic tracking (SLQT) problem has been proposed. First, an augmented system of the original system and the reference command generator is established to solve SLQT problem. Then, we present an optimal control by solving stochastic algebraic Riccati equation (SARE). Next, we present the on-policy and off-policy algorithms to achieve an adaptive optimal control without knowing the system dynamics. Finally, a simulation test is finally setup to verify the performance of the proposed adaptive optimal control.","PeriodicalId":441511,"journal":{"name":"2022 4th International Conference on Control and Robotics (ICCR)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 4th International Conference on Control and Robotics (ICCR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCR55715.2022.10053863","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
In this paper, adaptive optimal control is investigated for a stochastic linear discrete-time system with multiplicative state-dependent and control-dependent noise, without knowledge of the system dynamics. Within the framework of Q-learning, an off-policy state-feedback solution to the stochastic linear quadratic tracking (SLQT) problem is proposed. First, an augmented system composed of the original system and the reference command generator is established to formulate the SLQT problem. Then, the optimal control is obtained by solving a stochastic algebraic Riccati equation (SARE). Next, on-policy and off-policy algorithms are presented to achieve adaptive optimal control without knowledge of the system dynamics. Finally, a simulation test is set up to verify the performance of the proposed adaptive optimal controller.
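The sketch below illustrates the general structure described in the abstract: an augmented state (plant state stacked with the reference-generator state), a behavior policy with exploration noise to collect data, a least-squares fit of a quadratic Q-function from the Bellman equation, and a policy-improvement step. It is a minimal illustration only, not the paper's algorithm: the matrices A1, B1, Q1, R, the discount factor, and the deterministic (noise-free) simulation are all hypothetical assumptions, and the paper's multiplicative state- and control-dependent noise terms are omitted.

```python
# Minimal off-policy Q-learning sketch for linear quadratic tracking.
# Assumptions (not from the paper): hypothetical augmented matrices A1, B1,
# illustrative weights Q1, R, discount gamma, and noise-free dynamics.
import numpy as np

np.random.seed(0)
n, m, gamma = 3, 1, 0.9                      # augmented state dim, input dim, discount
A1 = np.array([[0.9, 0.1, 0.0],
               [0.0, 0.8, 0.0],
               [0.0, 0.0, 1.0]])             # augmented matrix [A 0; 0 F] (F = reference generator)
B1 = np.array([[0.0], [0.5], [0.0]])         # augmented input matrix [B; 0]
Q1 = np.eye(n)                               # illustrative tracking weight on the augmented state
R  = np.eye(m)

def phi(S):
    """Vectorize the upper triangle of a symmetric matrix, doubling off-diagonal terms."""
    r, c = np.triu_indices(S.shape[0])
    return S[r, c] * np.where(r == c, 1.0, 2.0)

K = np.zeros((m, n))                          # initial target policy u = -K X
for it in range(10):
    Phi, y = [], []
    X = np.random.randn(n)
    for k in range(200):
        u = -K @ X + 0.5 * np.random.randn(m) # behavior policy = target policy + exploration
        Xn = A1 @ X + B1 @ u                  # model used only to generate data
        un = -K @ Xn                          # target-policy action at the next state
        z, zn = np.concatenate([X, u]), np.concatenate([Xn, un])
        # Bellman equation: z' H z - gamma * zn' H zn = stage cost
        Phi.append(phi(np.outer(z, z)) - gamma * phi(np.outer(zn, zn)))
        y.append(X @ Q1 @ X + u @ R @ u)
        X = Xn
    h, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = np.zeros((n + m, n + m))              # rebuild symmetric H from its vectorization
    H[np.triu_indices(n + m)] = h
    H = H + H.T - np.diag(np.diag(H))
    Huu, HuX = H[n:, n:], H[n:, :n]
    K = np.linalg.solve(Huu, HuX)             # policy improvement: u = -Huu^{-1} HuX X

print("learned feedback gain K:\n", K)
```

Because the data are generated by an exploring behavior policy while the Bellman equation is evaluated with the target policy's action at the next state, the Q-function fit is off-policy; repeating the evaluation/improvement cycle drives the gain toward the optimal tracking feedback for this illustrative deterministic case.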