Title: Multi-Agent Reinforcement Learning for Strategic Bidding in Power Markets
Authors: A. C. Tellidou, A. Bakirtzis
Venue: 2006 3rd International IEEE Conference Intelligent Systems
Published: 2006-09-01
DOI: 10.1109/IS.2006.348454
URL: https://doi.org/10.1109/IS.2006.348454
Citations: 22
Abstract
In the agent-based simulation discussed in this paper, we study the dynamics of a power market in which suppliers follow a Q-learning based bidding strategy. Power suppliers pursue two objectives: maximizing their profit and maximizing their utilization rate. To achieve these goals, they must acquire complex behavior by learning through a continuous process of exploitation and exploration. Reinforcement learning theory provides a formal framework, along with a family of learning methods. In this paper we use the Q-learning algorithm, perhaps the most popular of the temporal difference methods. Q-learning gives suppliers the ability to evaluate their actions and to retain the most profitable of them. A five-bus power system is used for our case studies; our experiments are conducted with three supplier-agents in all cases but the last, in which a different number of agents participates. The locational marginal pricing (LMP) system serves as the market clearing mechanism.
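To make the learning loop described in the abstract concrete, here is a minimal Python sketch of tabular Q-learning with epsilon-greedy exploration applied to supplier bidding. It is an illustration under stated assumptions, not the authors' implementation: the costs, capacities, demand, markup levels, reward weighting, single-state formulation, and the uniform-price merit-order clearing (used here in place of LMP on a five-bus network) are all hypothetical choices made for brevity.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def clear_market(bids, capacities, demand):
    """Toy uniform-price merit-order clearing (an assumption; NOT LMP).
    Returns the clearing price and each supplier's dispatched quantity."""
    order = sorted(range(len(bids)), key=lambda i: bids[i])
    dispatch = [0.0] * len(bids)
    remaining = demand
    price = 0.0
    for i in order:
        if remaining <= 0:
            break
        dispatch[i] = min(capacities[i], remaining)
        remaining -= dispatch[i]
        price = bids[i]  # the last accepted bid sets the price
    return price, dispatch


def simulate(rounds=5000, seed=0):
    """Three supplier-agents repeatedly bid and learn. All numbers are hypothetical."""
    rng = random.Random(seed)
    costs = [20.0, 25.0, 30.0]           # marginal costs (assumed)
    capacities = [100.0, 100.0, 100.0]   # capacities (assumed)
    demand = 180.0                       # fixed demand (assumed)
    markups = [1.0, 1.2, 1.5]            # discrete bid actions (assumed)
    weight = 0.5                         # profit vs. utilization trade-off (assumed)
    # q[agent][state][action]; a single state keeps the sketch small.
    q = [[[0.0] * len(markups)] for _ in costs]
    for t in range(rounds):
        epsilon = max(0.05, 1.0 - t / rounds)  # decaying exploration rate
        actions = [epsilon_greedy(q[i][0], epsilon, rng) for i in range(len(costs))]
        bids = [costs[i] * markups[a] for i, a in enumerate(actions)]
        price, dispatch = clear_market(bids, capacities, demand)
        for i, a in enumerate(actions):
            profit = (price - costs[i]) * dispatch[i]
            utilization = dispatch[i] / capacities[i]
            # Blend the two objectives; the 1000.0 scaling is arbitrary.
            reward = weight * profit / 1000.0 + (1.0 - weight) * utilization
            q_update(q[i], 0, a, reward, 0)
    return q


if __name__ == "__main__":
    for agent, table in enumerate(simulate()):
        print(agent, [round(v, 3) for v in table[0]])
```

The reward blends profit and utilization rate because the abstract names both as supplier objectives; how the paper actually combines them is not stated here, so the weighted sum is only one plausible choice.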