Safe multi-agent deep reinforcement learning for decentralized low-carbon operation in active distribution networks and multi-microgrids

Tong Ye, Yuping Huang, Weijia Yang, Guotian Cai, Yuyao Yang, Feng Pan

Applied Energy, Volume 387, Article 125609 (published 2025-03-04). DOI: 10.1016/j.apenergy.2025.125609
https://www.sciencedirect.com/science/article/pii/S0306261925003393
Citations: 0
Abstract
Due to fundamental differences in the operational entities of distribution networks and microgrids, the equitable allocation of carbon responsibilities remains challenging. Furthermore, achieving real-time, efficient, and secure low-carbon economic dispatch among decentralized entities continues to face obstacles. We therefore propose a co-optimization framework for Active Distribution Networks (ADNs) and multi-Microgrids (MMGs) that improves operational efficiency and reduces carbon emissions through adaptive coordination and decision-making. To facilitate decentralized low-carbon decision-making, we introduce the Spatiotemporal Carbon Intensity Equalization Method (STCIEM), which preserves privacy and fairness by processing data locally and distributing carbon responsibilities equitably. We also propose a non-cooperative optimization strategy that enables each entity to optimize its operations independently while accounting for both economic and environmental interests. To address the real-time decision-making requirements and the non-convexity of low-carbon optimization that hamper traditional approaches, we develop the Enhanced Action Projection Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (EAP-MATD3) algorithm. By enhancing the actor's objective to resolve the actor-critic mismatch problem, the algorithm generates optimized actions that satisfy physical system constraints, outperforming conventional safe multi-agent deep reinforcement learning methods. Experiments on the modified IEEE 33-bus and IEEE 123-bus networks demonstrate that the approach effectively balances economic and environmental objectives in complex energy systems.
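For readers unfamiliar with action-projection safe reinforcement learning, the sketch below illustrates the general idea the abstract refers to: the actor's raw action is projected onto a feasible set before execution, and the actor is trained on the projected action plus a penalty on the projection displacement so that training stays consistent with what the critic and the physical system actually see. This is a minimal, hypothetical PyTorch-style sketch, not the paper's EAP-MATD3 implementation; the network sizes, the box-constraint projection, the penalty weight beta, and all names are assumptions made for illustration.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """Deterministic policy; tanh keeps the raw action in [-1, 1]."""
        def __init__(self, obs_dim, act_dim, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, act_dim), nn.Tanh(),
            )
        def forward(self, obs):
            return self.net(obs)

    class Critic(nn.Module):
        """Q(s, a) estimator on the concatenated observation and action."""
        def __init__(self, obs_dim, act_dim, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
        def forward(self, obs, act):
            return self.net(torch.cat([obs, act], dim=-1))

    def project(action, low, high):
        """Stand-in feasibility projection: clip to box limits. Real dispatch
        constraints (power balance, voltage, line flow) would need a proper
        projection or constrained solver; this is illustration only."""
        return torch.clamp(action, min=low, max=high)

    def actor_loss(actor, critic, obs, low, high, beta=0.1):
        """Train the actor on the *projected* action that is executed, plus a
        penalty on the projection displacement, which is one plausible way to
        keep the actor's objective aligned with the critic's evaluation."""
        raw = actor(obs)
        safe = project(raw, low, high)
        q = critic(obs, safe)
        displacement = ((safe - raw) ** 2).sum(dim=-1, keepdim=True)
        return (-q + beta * displacement).mean()

    if __name__ == "__main__":
        obs_dim, act_dim, batch = 12, 4, 32        # assumed sizes
        low = -0.8 * torch.ones(act_dim)           # assumed feasibility box
        high = 0.8 * torch.ones(act_dim)
        actor, critic = Actor(obs_dim, act_dim), Critic(obs_dim, act_dim)
        opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
        obs = torch.randn(batch, obs_dim)
        loss = actor_loss(actor, critic, obs, low, high)
        opt.zero_grad(); loss.backward(); opt.step()
        print(float(loss))

In this sketch the displacement penalty gives the actor a gradient signal even where the clipping projection saturates; the paper's enhanced actor objective and its treatment of network constraints may differ in form and detail.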
Journal Introduction:
Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.