Cooperative price-based demand response program for multiple aggregators based on multi-agent reinforcement learning and Shapley-value

IF 4.8 · CAS Zone 2 (Engineering & Technology) · JCR Q2 (Energy & Fuels) · Sustainable Energy Grids & Networks · Pub Date: 2024-11-09 · DOI: 10.1016/j.segan.2024.101560

Alejandro Fraija, Nilson Henao, Kodjo Agbossou, Sousso Kelouwani, Michaël Fournier

Sustainable Energy Grids & Networks, vol. 40, Article 101560. Journal article. https://www.sciencedirect.com/science/article/pii/S235246772400290X

Citations: 0

Abstract

Demand response (DR) plays an essential role in power system management. To facilitate its implementation, many aggregators have emerged as new mediating entities in the electricity market. These actors exploit available technologies to engage customers in DR programs and offer grid services such as load scheduling. However, the growing number of aggregators poses a new challenge, making it difficult for utilities to manage the load scheduling problem. This paper presents a multi-agent reinforcement learning (MARL) approach to a price-based DR program for multiple aggregators. A dynamic pricing scheme based on discounts is proposed to encourage residential customers to change their consumption patterns. The strategy relies on a cooperative framework for a set of DR aggregators (DRAs), which take advantage of a reward offered by a Distribution System Operator (DSO) for peak shaving of the total aggregated system demand. Furthermore, a Shapley-value-based reward-sharing mechanism is implemented to fairly determine each DRA’s individual contribution and calculate its individual reward. Simulation results verify the merits of the proposed model for a multi-aggregator system: the DRAs’ pricing strategies improve while accounting for the overall objectives of the system. Consumption peaks were managed by reducing the Peak-to-Average Ratio (PAR) by 15%, and the MARL mechanism’s performance improved in terms of reward function maximization and convergence time, the latter being reduced by 29%.
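The abstract does not give the characteristic function or data used for the Shapley-value reward sharing, so the following is only a minimal sketch under stated assumptions. It assumes a hypothetical four-period baseline demand, hypothetical load shifts per DRA (`SHIFTS`), a coalition value equal to the peak reduction that coalition achieves (`peak_reduction`), and a hypothetical DSO reward (`TOTAL_REWARD`) split in proportion to each DRA's Shapley value. None of these names or numbers come from the paper.

```python
from itertools import combinations
from math import factorial


def par(load):
    """Peak-to-Average Ratio of a load profile."""
    return max(load) / (sum(load) / len(load))


def shapley_values(players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a number."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Probability that a given coalition of size k precedes p
            # in a uniformly random ordering of all players.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


# Hypothetical data (not from the paper): baseline demand over four periods
# and the load shift each DRA induces; every shift sums to zero.
BASELINE = [4.0, 6.0, 10.0, 8.0]
SHIFTS = {
    "DRA_A": [1.0, 1.0, -2.0, 0.0],
    "DRA_B": [0.0, 1.0, -1.0, 0.0],
    "DRA_C": [0.0, 0.0, 0.0, 0.0],  # contributes nothing
}


def aggregated_load(coalition):
    """Total demand when only the DRAs in `coalition` apply their shifts."""
    load = list(BASELINE)
    for dra in coalition:
        load = [x + dx for x, dx in zip(load, SHIFTS[dra])]
    return load


def peak_reduction(coalition):
    """Assumed coalition value: reduction of the system peak vs. baseline."""
    return max(BASELINE) - max(aggregated_load(coalition))


phi = shapley_values(list(SHIFTS), peak_reduction)

# Assumed sharing rule: split a hypothetical DSO reward proportionally.
TOTAL_REWARD = 100.0
rewards = {d: TOTAL_REWARD * phi[d] / sum(phi.values()) for d in phi}

print("Shapley values:", phi)
print("Rewards:", rewards)
print("PAR before:", par(BASELINE), "after:", par(aggregated_load(SHIFTS)))
```

In this toy setting the DRA that shifts no load receives a zero share, and the Shapley values sum to the peak reduction of the full coalition. Exact enumeration grows exponentially with the number of DRAs, so a larger system would likely need a sampling-based approximation, and the proportional split assumes the total peak reduction is nonzero.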

Source journal

Sustainable Energy Grids & Networks (Energy: Energy Engineering and Power Technology)

CiteScore: 7.90

Self-citation rate: 13.00%

Articles published: 206

Review time: 49 days

About the journal: Sustainable Energy, Grids and Networks (SEGAN) is an international peer-reviewed publication for theoretical and applied research dealing with energy, information grids and power networks, including smart grids from super to micro grid scales. SEGAN welcomes papers describing fundamental advances in mathematical, statistical or computational methods with application to power and energy systems, as well as papers on applications, computation and modeling in the areas of electrical and energy systems with coupled information and communication technologies.