Reinforcement learning for inverse linear-quadratic dynamic non-cooperative games
Journal: Systems & Control Letters (Impact Factor 2.1; JCR Q3, Automation & Control Systems)
Publication date: 2024-07-17
DOI: 10.1016/j.sysconle.2024.105883
URL: https://www.sciencedirect.com/science/article/pii/S0167691124001713
Citations: 0
Abstract
The paper addresses the inverse problem in the case of linear-quadratic discrete-time dynamic non-cooperative games. We consider a game with some unknown cost function parameters, referred to as the observed game, that has a set of known feedback laws constituting a Nash equilibrium. The inverse problem is to find values of the cost function parameters that together with the observed game dynamics form a new game, equivalent to the observed one in the sense that it has the same Nash equilibrium. We present a model-based algorithm to solve this problem. We prove the convergence of the algorithm and show that the given set of feedback laws is a Nash equilibrium for the designed game. We also demonstrate how to generate new games with the required properties without repeatedly running the complete algorithm. Moreover, the model-based algorithm is extended to a model-free version that operates without requiring the knowledge of the system matrices, but relies on the ability to collect sufficient data. Simulation results validate the effectiveness of the proposed algorithms.
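To make the inverse problem concrete, the following is a minimal sketch for the scalar case only. It is not the algorithm from the paper: it assumes dynamics x_{k+1} = a x_k + sum_j b_j u_{j,k}, feedback laws u_j = -k_j x, and per-player costs of the form sum_k (q_i x_k^2 + r_i u_{i,k}^2), and it solves the scalar best-response condition in closed form. All function names and numerical values are hypothetical and chosen for illustration.

```python
# Scalar illustration of the inverse LQ game problem (assumed setup, not the paper's algorithm).

def closed_loop(a, bs, ks):
    """Closed-loop coefficient a_c = a - sum_j b_j * k_j under u_j = -k_j * x."""
    return a - sum(b * k for b, k in zip(bs, ks))


def policy_value(q, r, k, a_c):
    """Cost-to-go coefficient p of a player using gain k: p = q + r*k^2 + a_c^2 * p."""
    if abs(a_c) >= 1.0:
        raise ValueError("closed loop must be stable (|a_c| < 1)")
    return (q + r * k * k) / (1.0 - a_c * a_c)


def best_response(a, bs, ks, i, q, r):
    """One policy-improvement step for player i with the other gains held fixed."""
    a_c = closed_loop(a, bs, ks)
    p = policy_value(q, r, ks[i], a_c)
    a_minus_i = a_c + bs[i] * ks[i]  # dynamics seen by player i alone
    return bs[i] * p * a_minus_i / (r + bs[i] ** 2 * p)


def nash_residual(a, bs, ks, qs, rs):
    """Largest gap between each gain and its best response; zero at a Nash equilibrium."""
    return max(
        abs(best_response(a, bs, ks, i, qs[i], rs[i]) - ks[i])
        for i in range(len(ks))
    )


def recover_q(a, bs, ks, i, r):
    """Inverse step: given a chosen r_i > 0, find q_i that makes k_i a best response."""
    a_c = closed_loop(a, bs, ks)
    if a_c == 0.0 or bs[i] == 0.0:
        raise ValueError("scalar formula needs a_c != 0 and b_i != 0")
    p = ks[i] * r / (bs[i] * a_c)  # from k_i = b_i p a_{-i} / (r + b_i^2 p)
    q = p * (1.0 - a_c * a_c) - r * ks[i] ** 2  # from the value equation
    if p <= 0.0 or q < 0.0:
        raise ValueError("no admissible cost (p > 0, q >= 0) for this gain and r")
    return q


if __name__ == "__main__":
    a, bs = 1.2, [1.0, 0.5]   # assumed observed dynamics
    ks = [0.5, 0.4]           # assumed observed feedback gains
    rs = [1.0, 1.0]           # control weights chosen by the designer
    qs = [recover_q(a, bs, ks, i, rs[i]) for i in range(len(ks))]
    print("recovered state weights:", qs)
    print("Nash residual:", nash_residual(a, bs, ks, qs, rs))
```

In this scalar setting each player's pair (q_i, r_i) is determined only up to a positive scale factor, so choosing a different r_i and calling `recover_q` again yields another game with the same equilibrium gains. The matrix case treated in the paper is richer and is not covered by this closed-form shortcut.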
About the journal:
Founded in 1981 by two of the pre-eminent control theorists, Roger Brockett and Jan Willems, Systems & Control Letters is one of the leading journals in the field of control theory. The aim of the journal is to allow dissemination of relatively concise but highly original contributions whose high initial quality enables a relatively rapid review process. All aspects of the fields of systems and control are covered, especially mathematically-oriented and theoretical papers that have a clear relevance to engineering, physical and biological sciences, and even economics. Application-oriented papers with sophisticated and rigorous mathematical elements are also welcome.