Teaching Coordination to Selfish Learning Agents in Resource-Constrained Partially Observable Markov Games
Author: Georgios Tsaousoglou
Journal: IEEE Transactions on Automatic Control, vol. 70, no. 5, pp. 3449-3455
Publication date: 2024-12-11
Publication type: Journal Article
DOI: 10.1109/TAC.2024.3515651
URL: https://ieeexplore.ieee.org/document/10792660/
Journal impact factor: 7.0 (JCR Q1, Automation & Control Systems; category: Computer Science)
Citation count: 0
Abstract
Of increasing relevance to engineering systems are problems that include online resource allocation to agents that feature adaptation and learning capabilities. This article considers the case where a coordinator gets to design a resource allocation mechanism (i.e., a bidding-allocation-rewards protocol) to efficiently allocate a resource to selfish agents that try to gain access by learning to communicate strategically. Toward aligning the agents' incentives with the social objective, a critical-value-based mechanism is proposed. Analytic results are presented for a simple, stylized setting, whereas simulation results for a use case with reinforcement learning agents controlling flexible loads in the smart grid demonstrate the mechanism's ability to teach coordinated behavior to the distributed learners.
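The abstract names a critical-value-based mechanism without spelling it out. As a rough illustration of the general idea only, and not the paper's actual protocol, the sketch below assumes a much simpler one-shot setting: `capacity` identical units go to the highest bidders, and each winner is charged its critical value, meaning the smallest bid with which it would still have won. The function name `critical_value_allocation`, the agent labels, and the bid values are all hypothetical.

```python
from typing import Dict, Set, Tuple


def critical_value_allocation(
    bids: Dict[str, float], capacity: int
) -> Tuple[Set[str], Dict[str, float]]:
    """Allocate `capacity` identical units to the highest bidders.

    Illustrative assumption: a one-shot, single-unit-demand setting.
    Each winner pays its critical value, which here is the highest
    losing bid (or 0 if there is no losing bidder).
    """
    # Rank agents from highest to lowest bid (ties keep input order).
    ranked = sorted(bids, key=lambda agent: bids[agent], reverse=True)
    winners = set(ranked[:capacity])

    # The critical value is the bid of the first agent left out, if any.
    critical = bids[ranked[capacity]] if len(ranked) > capacity else 0.0

    # Winners pay the critical value; losers pay nothing.
    payments = {agent: (critical if agent in winners else 0.0) for agent in bids}
    return winners, payments


# Hypothetical example: three agents compete for two units.
bids = {"a": 5.0, "b": 3.0, "c": 1.0}
winners, payments = critical_value_allocation(bids, capacity=2)
```

In this toy setting a winner's payment does not depend on its own bid, which is the property that removes the incentive to misreport. How the paper carries this idea over to learning agents in a partially observable Markov game is not specified in the abstract.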
About the journal:
In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered:
1) Papers: Presentation of significant research, development, or application of control concepts.
2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions.
In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.