{"title":"基于自适应动态规划的离散系统随机线性二次对策","authors":"Shibo Na, Ruizhuo Song","doi":"10.1109/ICCR55715.2022.10053887","DOIUrl":null,"url":null,"abstract":"In this paper, we proposed an adaptive dynamic programming (ADP) algorithm for discrete time stochastic linear quadratic game without system dynamics. Firstly, we described the problem and converted it into a deterministic form. Then, we solved the Bellman equation to obtain the control gain matrix and disturbance gain matrix when the system dynamics were known. After that, we implemented the ADP algorithm with unknown system through neural networks. Model network, action network, disturbance network and critic network were used to approximate the system model, control gain matrix, disturbance gain matrix and value function respectively. Finally, a simulation example was given to verify the effectiveness of the algorithm.","PeriodicalId":441511,"journal":{"name":"2022 4th International Conference on Control and Robotics (ICCR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Stochastic Linear Quadratic Game for Discrete-time Systems Based-on Adaptive Dynamic Programming\",\"authors\":\"Shibo Na, Ruizhuo Song\",\"doi\":\"10.1109/ICCR55715.2022.10053887\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we proposed an adaptive dynamic programming (ADP) algorithm for discrete time stochastic linear quadratic game without system dynamics. Firstly, we described the problem and converted it into a deterministic form. Then, we solved the Bellman equation to obtain the control gain matrix and disturbance gain matrix when the system dynamics were known. After that, we implemented the ADP algorithm with unknown system through neural networks. Model network, action network, disturbance network and critic network were used to approximate the system model, control gain matrix, disturbance gain matrix and value function respectively. Finally, a simulation example was given to verify the effectiveness of the algorithm.\",\"PeriodicalId\":441511,\"journal\":{\"name\":\"2022 4th International Conference on Control and Robotics (ICCR)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 4th International Conference on Control and Robotics (ICCR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCR55715.2022.10053887\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 4th International Conference on Control and Robotics (ICCR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCR55715.2022.10053887","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Stochastic Linear Quadratic Game for Discrete-time Systems Based on Adaptive Dynamic Programming
In this paper, we propose an adaptive dynamic programming (ADP) algorithm for the discrete-time stochastic linear quadratic game that does not require knowledge of the system dynamics. First, we describe the problem and convert it into a deterministic form. Then, for the case of known system dynamics, we solve the Bellman equation to obtain the control gain matrix and the disturbance gain matrix. Next, we implement the ADP algorithm for the unknown system using neural networks: a model network, an action network, a disturbance network, and a critic network approximate the system model, the control gain matrix, the disturbance gain matrix, and the value function, respectively. Finally, a simulation example is given to verify the effectiveness of the algorithm.
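For concreteness, the known-dynamics step described in the abstract can be illustrated with value iteration on the game algebraic Riccati equation for a deterministic discrete-time zero-sum LQ game. The Python sketch below is only an illustration of that standard step under assumed notation, not the authors' exact algorithm: the function name `solve_lq_game`, the attenuation level `gamma`, and all numerical values are hypothetical.

```python
import numpy as np


def solve_lq_game(A, B, C, Q, R, gamma, tol=1e-9, max_iter=2000):
    """Value-iterate the game algebraic Riccati equation for the
    deterministic zero-sum LQ game (a sketch, not the paper's method):
        x_{k+1} = A x_k + B u_k + C w_k,
        cost    = sum_k (x_k' Q x_k + u_k' R u_k - gamma^2 w_k' w_k).
    Returns the saddle-point gains (u_k = -K x_k, w_k = -L x_k)
    and the converged value matrix P.
    """
    n, m, p = A.shape[0], B.shape[1], C.shape[1]

    def blocks(P):
        # Joint second-order term coupling the two players at the current P.
        M = np.block([
            [R + B.T @ P @ B, B.T @ P @ C],
            [C.T @ P @ B, C.T @ P @ C - gamma**2 * np.eye(p)],
        ])
        N = np.vstack([B.T @ P @ A, C.T @ P @ A])
        return M, N

    P = np.zeros((n, n))
    for _ in range(max_iter):
        M, N = blocks(P)
        # One Bellman/Riccati backup at the current value matrix.
        P_next = Q + A.T @ P @ A - N.T @ np.linalg.solve(M, N)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next

    # Recover the gains at the converged value matrix.
    M, N = blocks(P)
    gains = np.linalg.solve(M, N)
    return gains[:m], gains[m:], P


if __name__ == "__main__":
    # Toy double-integrator-like example; all numbers are made up.
    A = np.array([[1.0, 0.1],
                  [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    C = np.array([[0.0], [0.05]])
    K, L, P = solve_lq_game(A, B, C, Q=np.eye(2), R=np.eye(1), gamma=5.0)
    print("control gain K:\n", K)
    print("disturbance gain L:\n", L)
```

In the paper's model-free setting, these matrix computations are replaced by the four neural networks named in the abstract, which approximate the system model, the two gain matrices, and the value function from data rather than from known (A, B, C).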