{"title":"机组调度问题的离散-连续强化学习算法","authors":"Ping Zheng, Yuezu Lv","doi":"10.1109/ICUS55513.2022.9987086","DOIUrl":null,"url":null,"abstract":"With increasing uncertainties in power systems, reinforcement learning evolves as a promising approach for decision and control problems. This paper focuses on the unit commitment and dispatch problem, with startup and shutdown power trajectories involved, investigating it via reinforcement learning. First, we convert the problem into a Markov decision process, where constraints are tackled by projections and elaborate reward. Then, to cope with discrete commitment actions and continuous power outputs simultaneously, a discrete-continuous reinforcement learning algorithm is proposed by combining deep Q-network with soft actor-critic algorithm. Finally, numerical examples are done, verifying the effectiveness of the presented algorithm.","PeriodicalId":345773,"journal":{"name":"2022 IEEE International Conference on Unmanned Systems (ICUS)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Discrete-Continuous Reinforcement Learning Algorithm for Unit Commitment and Dispatch Problem\",\"authors\":\"Ping Zheng, Yuezu Lv\",\"doi\":\"10.1109/ICUS55513.2022.9987086\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With increasing uncertainties in power systems, reinforcement learning evolves as a promising approach for decision and control problems. This paper focuses on the unit commitment and dispatch problem, with startup and shutdown power trajectories involved, investigating it via reinforcement learning. First, we convert the problem into a Markov decision process, where constraints are tackled by projections and elaborate reward. Then, to cope with discrete commitment actions and continuous power outputs simultaneously, a discrete-continuous reinforcement learning algorithm is proposed by combining deep Q-network with soft actor-critic algorithm. Finally, numerical examples are done, verifying the effectiveness of the presented algorithm.\",\"PeriodicalId\":345773,\"journal\":{\"name\":\"2022 IEEE International Conference on Unmanned Systems (ICUS)\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Unmanned Systems (ICUS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICUS55513.2022.9987086\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Unmanned Systems (ICUS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICUS55513.2022.9987086","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Discrete-Continuous Reinforcement Learning Algorithm for Unit Commitment and Dispatch Problem
With increasing uncertainties in power systems, reinforcement learning has emerged as a promising approach to decision and control problems. This paper investigates the unit commitment and dispatch problem, with startup and shutdown power trajectories taken into account, via reinforcement learning. First, we convert the problem into a Markov decision process, in which constraints are handled through projections and a carefully designed reward. Then, to handle discrete commitment actions and continuous power outputs simultaneously, a discrete-continuous reinforcement learning algorithm is proposed by combining the deep Q-network with the soft actor-critic algorithm. Finally, numerical examples are presented to verify the effectiveness of the proposed algorithm.
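To make the hybrid action structure concrete, below is a minimal, hypothetical sketch of the idea the abstract describes: a DQN-style discrete head chooses each unit's on/off commitment while an SAC-style Gaussian head produces continuous power outputs, which are then projected onto each unit's generation limits. All names, layer sizes, and the projection rule here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: shared backbone with a discrete (Q-value) head and a
# continuous (Gaussian) head, mirroring a DQN + soft actor-critic combination.
import torch
import torch.nn as nn


class HybridPolicy(nn.Module):
    def __init__(self, state_dim: int, n_units: int, hidden: int = 128):
        super().__init__()
        self.n_units = n_units
        self.backbone = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # DQN-style head: Q-values for the two commitment choices per unit.
        self.q_head = nn.Linear(hidden, n_units * 2)
        # SAC-style head: mean and log-std of a Gaussian over power outputs.
        self.mu_head = nn.Linear(hidden, n_units)
        self.log_std_head = nn.Linear(hidden, n_units)

    def forward(self, state: torch.Tensor):
        h = self.backbone(state)
        q = self.q_head(h).view(-1, self.n_units, 2)  # Q(s, off/on) per unit
        commit = q.argmax(dim=-1)                      # greedy 0/1 commitment
        mu = self.mu_head(h)
        log_std = self.log_std_head(h).clamp(-20.0, 2.0)
        dist = torch.distributions.Normal(mu, log_std.exp())
        raw_power = torch.tanh(dist.rsample())         # reparameterized, in (-1, 1)
        return commit, raw_power


def project(raw_power, commit, p_min, p_max):
    """Map squashed outputs onto [p_min, p_max] and zero out uncommitted
    units -- a crude stand-in for the constraint projections in the paper."""
    p = p_min + 0.5 * (raw_power + 1.0) * (p_max - p_min)
    return p * commit


# Usage: three units, a batch of four states (all dimensions are arbitrary).
policy = HybridPolicy(state_dim=10, n_units=3)
state = torch.randn(4, 10)
commit, raw_power = policy(state)
dispatch = project(raw_power, commit,
                   p_min=torch.tensor([10.0, 20.0, 30.0]),
                   p_max=torch.tensor([50.0, 80.0, 100.0]))
```

In a full SAC-style training loop, the Gaussian head's log-probabilities (with the usual tanh correction) would feed the entropy term, and the Q-head would be trained with a DQN-style temporal-difference target; both are omitted here for brevity.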