Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique
Authors: Hongji Zhuang, Hongxu Zhu, Shufan Wu, Xiaoliang Wang, Zhongcheng Mu, Qiang Shen
Journal: Aerospace Systems, vol. 7, no. 2, pp. 261-269
Publication date: 2023-12-12
DOI: 10.1007/s42401-023-00263-0
URL: https://link.springer.com/article/10.1007/s42401-023-00263-0
This paper addresses the robust zero-sum differential game problem using an off-policy reinforcement learning technique. The robust system model is first established from the nominal one. A control strategy is then proposed, and its asymptotic stability and optimality are rigorously proved. The off-policy reinforcement learning technique is derived from the Bellman equation to generate the control policy. Because the policy is learned from collected system data, the influence of a potentially inaccurate system dynamics model is avoided. This is the first application of an off-policy RL algorithm to this robust two-player zero-sum differential game problem. Finally, convergence of the resulting algorithm is demonstrated, and a simulation example confirms its efficacy.
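As a rough illustration of the kind of Bellman-equation-based iteration the abstract refers to, the sketch below runs a model-based policy iteration on a scalar linear-quadratic zero-sum game. It is not the authors' algorithm: the paper's off-policy method learns from system data rather than from a known model, and the abstract gives no equations. The scalar dynamics, the cost weights, the attenuation level `gamma`, and both function names are assumptions chosen only for this example.

```python
import math


def solve_scalar_zero_sum_game(a, b, d, q, r, gamma, iters=50, tol=1e-12):
    """Policy iteration for the assumed scalar game
        dx/dt = a*x + b*u + d*w,
        cost  = integral of (q*x^2 + r*u^2 - gamma^2*w^2) dt,
    where u = -K*x minimizes the cost and w = L*x maximizes it.
    Returns the value coefficient p (V(x) = p*x^2) and the gains K, L.
    """
    K, L = 0.0, 0.0  # initial policies; stabilizing only if a < 0 (assumption)
    p = 0.0
    for _ in range(iters):
        closed_loop = a - b * K + d * L
        if closed_loop >= 0:
            raise ValueError("current policy pair is not stabilizing")
        # Policy evaluation: scalar Bellman (Lyapunov) equation
        #   2*closed_loop*p + q + r*K^2 - gamma^2*L^2 = 0
        p_new = (q + r * K**2 - gamma**2 * L**2) / (-2.0 * closed_loop)
        # Policy improvement for both players from the new value function
        K, L = b * p_new / r, d * p_new / gamma**2
        converged = abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    return p, K, L


def riccati_root(a, b, d, q, r, gamma):
    """Positive root of the scalar game Riccati equation
        2*a*p + q - (b^2/r - d^2/gamma^2)*p^2 = 0,
    assuming b^2/r > d^2/gamma^2.
    """
    s = b**2 / r - d**2 / gamma**2
    return (a + math.sqrt(a**2 + s * q)) / s


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the code.
    params = dict(a=-1.0, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0)
    p, K, L = solve_scalar_zero_sum_game(**params)
    print("iterated p:", p, "closed-form p:", riccati_root(**params))
    print("control gain K:", K, "disturbance gain L:", L)
```

In this scalar setting, updating both players' policies simultaneously is equivalent to Newton's method on the game Riccati equation, so the iterate can be checked against the closed-form root. A data-driven variant would replace the policy-evaluation step with a least-squares fit on measured trajectories.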
About the journal:
Aerospace Systems provides an international, peer-reviewed forum which focuses on system-level research and development regarding aeronautics and astronautics. The journal emphasizes the unique role and increasing importance of informatics on aerospace. It fills a gap in current publishing coverage from outer space vehicles to atmospheric vehicles by highlighting interdisciplinary science, technology and engineering.
Potential topics include, but are not limited to:
Trans-space vehicle systems design and integration
Air vehicle systems
Space vehicle systems
Near-space vehicle systems
Aerospace robotics and unmanned system
Communication, navigation and surveillance
Aerodynamics and aircraft design
Dynamics and control
Aerospace propulsion
Avionics system
Opto-electronic system
Air traffic management
Earth observation
Deep space exploration
Bionic micro-aircraft/spacecraft
Intelligent sensing and Information fusion