Stochastic Linear Quadratic Game for Discrete-time Systems Based-on Adaptive Dynamic Programming
Shibo Na, Ruizhuo Song
2022 4th International Conference on Control and Robotics (ICCR), published 2022-12-02
DOI: 10.1109/ICCR55715.2022.10053887 (https://doi.org/10.1109/ICCR55715.2022.10053887)
Citations: 0
Abstract
In this paper, we propose an adaptive dynamic programming (ADP) algorithm for the discrete-time stochastic linear quadratic game when the system dynamics are unknown. First, we describe the problem and convert it into a deterministic form. Then, assuming the system dynamics are known, we solve the Bellman equation to obtain the control gain matrix and the disturbance gain matrix. After that, we implement the ADP algorithm for the unknown system using neural networks: a model network, an action network, a disturbance network, and a critic network approximate the system model, the control gain matrix, the disturbance gain matrix, and the value function, respectively. Finally, a simulation example verifies the effectiveness of the algorithm.
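
The known-dynamics step mentioned in the abstract (solving the Bellman equation for the control and disturbance gain matrices) can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a deterministic zero-sum game with dynamics x_{k+1} = A x_k + B u_k + E w_k, stage cost x'Qx + u'Ru - gamma^2 w'w, and a quadratic value function x'Px. The symbols A, B, E, Q, R, gamma and the function names are hypothetical choices for illustration; the stochastic formulation and the neural-network approximation described in the paper are not reproduced here.

```python
import numpy as np


def riccati_step(A, B, E, Q, R, gamma, P):
    """One Bellman backup for V(x) = x'Px in the assumed zero-sum LQ game.

    Returns the updated P and the gains K, L of the saddle-point
    policies u = -K x (minimizer) and w = -L x (maximizer).
    """
    m = B.shape[1]
    q = E.shape[1]
    # Second-order terms of the Q-function in the stacked input [u; w].
    G = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)],
    ])
    # Cross terms between the stacked input and the state.
    M = np.vstack([B.T @ P @ A, E.T @ P @ A])
    KL = np.linalg.solve(G, M)  # stacked gains [K; L]
    P_next = Q + A.T @ P @ A - M.T @ KL
    return P_next, KL[:m], KL[m:]


def solve_lq_game(A, B, E, Q, R, gamma, iters=1000, tol=1e-10):
    """Value iteration from P = 0 until the Bellman backup stops changing P."""
    P = np.zeros_like(Q, dtype=float)
    for _ in range(iters):
        P_next, _, _ = riccati_step(A, B, E, Q, R, gamma, P)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gains evaluated at the final value matrix.
    _, K, L = riccati_step(A, B, E, Q, R, gamma, P)
    return P, K, L
```

Calling `solve_lq_game` with, for example, scalar matrices `A = [[0.9]]`, `B = [[1.0]]`, `E = [[0.5]]`, `Q = R = [[1.0]]` and `gamma = 5.0` returns a value matrix `P` together with the two gains. The iteration is only meaningful when gamma is large enough for a saddle point to exist; this sketch does not check that condition.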