{"title":"A Scalable Game Theoretic Approach for Coordination of Multiple Dynamic Systems","authors":"Mostafa M. Shibl;Vijay Gupta","doi":"10.1109/LCSYS.2024.3501155","DOIUrl":null,"url":null,"abstract":"Learning in games provides a powerful framework to design control policies for self-interested agents that may be coupled through their dynamics, costs, or constraints. We consider the case where the dynamics of the coupled system can be modeled as a Markov potential game. In this case, distributed learning ensures agents’ control policies converge to a Nash equilibrium. However, standard algorithms like natural policy gradient require global state and action knowledge, which does not scale well with more agents. We show that by limiting information flow to local neighborhoods, we can still converge to near-optimal policies. If a game’s global cost function can be decomposed into local costs that align with agent policies at equilibrium, this approach benefits team coordination. We demonstrate this with a sensor coverage problem.","PeriodicalId":37235,"journal":{"name":"IEEE Control Systems Letters","volume":"8 ","pages":"2535-2540"},"PeriodicalIF":2.4000,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Control Systems Letters","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10755096/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
Learning in games provides a powerful framework for designing control policies for self-interested agents that may be coupled through their dynamics, costs, or constraints. We consider the case where the coupled system can be modeled as a Markov potential game, in which distributed learning guarantees that the agents' control policies converge to a Nash equilibrium. However, standard algorithms such as natural policy gradient require knowledge of the global state and of every agent's action, which scales poorly as the number of agents grows. We show that by restricting information flow to local neighborhoods, agents can still converge to near-optimal policies. When the game's global cost function decomposes into local costs that align with the agents' policies at equilibrium, this approach also benefits team coordination. We demonstrate the approach on a sensor coverage problem.
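To make the localized-learning idea concrete, here is a minimal sketch, not the paper's algorithm: independent softmax policy gradient on a toy one-dimensional sensor coverage game, where each agent's update uses only a cost computed from its κ-hop neighborhood. Everything here (the `KAPPA` radius, the `local_cost` function, the coverage cost itself) is an illustrative assumption, and the paper's Markov (dynamic) setting with natural policy gradient is simplified to a stateless game with a vanilla REINFORCE-style update.

```python
# Illustrative sketch only: localized policy-gradient learning on a toy
# sensor coverage game. Not the paper's algorithm; all parameter names
# and the cost function are assumptions made for this example.
import numpy as np

rng = np.random.default_rng(0)

N = 8          # number of agents on a line graph
A = 2          # binary action: cover the left (0) or right (1) cell
KAPPA = 1      # neighborhood radius used for learning
STEPS = 2000
LR = 0.5

# One softmax logit vector per agent (stateless game for brevity; the
# paper's setting is a Markov *dynamic* game, which this sketch omits).
theta = np.zeros((N, A))

def local_cost(actions, i):
    """Toy coverage cost seen by agent i: the fraction of nearby cells
    left uncovered by i and its KAPPA-hop neighbors (illustrative)."""
    lo, hi = max(0, i - KAPPA), min(N - 1, i + KAPPA)
    covered = {j + (1 if actions[j] == 1 else -1) for j in range(lo, hi + 1)}
    cells = set(range(lo - 1, hi + 2))
    return len(cells - covered) / len(cells)

for _ in range(STEPS):
    # Sample one joint action from the current (independent) policies.
    probs = np.exp(theta) / np.exp(theta).sum(axis=1, keepdims=True)
    actions = np.array([rng.choice(A, p=probs[i]) for i in range(N)])
    for i in range(N):
        c = local_cost(actions, i)      # only kappa-hop information
        grad = -probs[i]                # d log pi(a_i) / d theta_i ...
        grad[actions[i]] += 1.0         # ... = e_{a_i} - probs[i]
        theta[i] -= LR * c * grad       # descend the *local* cost

print("final action probabilities:\n", np.round(probs, 2))
```

Because each update touches only κ-hop information, the per-agent computation does not grow with `N`, which is the scalability property the abstract emphasizes; a faithful implementation would additionally carry the Markov state and the natural-gradient preconditioning that this stateless sketch leaves out.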