Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127358
Yihao Meng, Yuan Zou, Guodong Du, Xudong Zhang, Zhaolong Zhang
Driven by the low-carbon economy imperative, charging stations (CSs) integrated with renewable energy microgrids (MGs) have gained significant attention as critical infrastructure for advancing transportation electrification. However, the integration combines their inherent uncertainties, leading to suboptimal operational performance. To address this challenge, a cost-oriented bi-layer dispatch framework is developed by incorporating proximal policy optimization (PPO) into a model predictive control (MPC) foundation. This framework simultaneously optimizes the microgrid-integrated charging stations' (MGCSs) low-carbon economic operating costs and the charging fulfillment of electric vehicles (EVs). The proposed framework bypasses the explicit prediction of uncertainties inherent in the traditional “predict-then-optimize” framework and reduces MPC's reliance on precise parameter settings. Additionally, a power allocation strategy based on a cooperative game model (CGM) is established, which ensures fair charging among EVs through dynamic urgency indicators and enables a closed-loop optimization for maximizing charging fulfillment through the aggregated urgency feedback. Simulations using real-world EV data demonstrate the effectiveness of the proposed framework, outperforming various MPC-based benchmarks.
Title: Low-carbon economic dispatch for microgrid-integrated charging stations: A cost-oriented bi-layer optimization framework
Journal: Applied Energy, Volume 407, Article 127358
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127353
Feifan Huang, Yeqing Ling, Long Wang, Peng Yuan, Li Sun, Tao Li
Solid oxide fuel cells (SOFCs) are promising power sources for unmanned aerial vehicles (UAVs), yet their widespread application is hindered by the inherent contradiction between the demand for rapid dynamic response and the constraints of heat and mass transfer. Traditional decoupled control strategies struggle to resolve the complex, coupled conflicts between heat and mass transfer under high dynamic loads. To address this challenge, this study uses a validated multi-physics model to develop and validate an integrated synergistic control strategy. The investigation first reveals the highly asymmetric dynamic response of the single cell to flow velocity adjustments and proposes that this response signature can be used for online efficiency optimization. Furthermore, the study demonstrates that single feedforward strategies are inherently flawed: aggressive pre-heating induces power overshoot and fuel starvation, while a simple flow velocity increase prolongs thermal stabilization due to its convective cooling effect. To resolve this dilemma, an integrated synergistic control strategy that couples active pre-heating with dynamic flow velocity adjustment is proposed. Validation under a typical UAV mission profile shows that, compared to baseline control, the synergistic strategy shortens the stabilization time by 90 %, completely eliminates dynamic undershoot, delivers a steady-state power output up to 70 % higher during maneuvering, and reduces the peak thermal stress by over 35 %. Additionally, the superiority of a multi-channel anode design in mitigating coking risk is confirmed. Overall, the proposed synergistic strategy effectively resolves the conflict between rapid response and stability, offering critical insights and a practical framework for managing transient thermo-electrochemical couplings. This constitutes a necessary step toward realizing high-performance SOFC propulsion in actual UAV flight missions.
Title: Lightweight and efficient tubular SOFC design for UAV applications: multi-physics modeling and performance optimization
Journal: Applied Energy, Volume 408, Article 127353
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127411
Rajanie Prabha, Zhecheng Wang, Chad Zanocco, June Flora, Ram Rajagopal
The transition to renewable energy is essential for achieving sustainable development goals and mitigating the impacts of climate change. Solar energy, in particular, has emerged as a viable and cost-effective option, yet its adoption exhibits significant disparities. While existing research has extensively explored equity issues in energy access, a critical gap remains in understanding the broader distributional disparities in residential rooftop solar adoption at local, state, and national scales. This study addresses this gap by quantifying spatial inequalities in rooftop solar adoption, leveraging a novel dataset created using vision transformers to identify rooftop-installed photovoltaic (PV) systems across the United States through 2022. By employing Lorenz curves, an approach traditionally used to analyze income inequality, the distribution of residential rooftop solar at multiple geographic and temporal scales is assessed to uncover areas with disproportionate adoption levels. This analysis identifies key factors that drive these inequalities, including economic conditions, policy frameworks, and demographic characteristics. The findings reveal pronounced inequalities in solar adoption within and between counties and states, despite a doubling of rooftop PV installations nationwide between 2017 and 2022 from 1.47 to 2.95 million systems. While policies and incentives have helped some disadvantaged communities overcome barriers related to lower household income, they have often fallen short of achieving broader spatial equality. The dataset, DeepSolar-3M, has been released as an open resource to support policymakers, researchers, energy analysts, and others in advancing data-driven energy solutions.
Title: Nationwide insights on solar deployment trends and spatial inequalities revealed by vision transformer models
Journal: Applied Energy, Volume 408, Article 127411
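The Lorenz-curve approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the input is simply a list of rooftop PV install counts per geographic unit (for example, per county); the function names `lorenz_curve` and `gini` and the sample numbers are hypothetical and are not taken from the paper or from the DeepSolar-3M dataset. The Gini coefficient is included only as the usual one-number summary of a Lorenz curve; the abstract does not state which summary statistic the authors use.

```python
def lorenz_curve(installs):
    """Return (unit_share, install_share) points of the Lorenz curve.

    installs: non-negative install counts, one per geographic unit
    (hypothetical input format, assumed for illustration).
    """
    values = sorted(float(v) for v in installs)  # poorest-adopting units first
    if not values or values[0] < 0:
        raise ValueError("need a non-empty list of non-negative counts")
    total = sum(values)
    if total <= 0:
        raise ValueError("total installs must be positive")

    n = len(values)
    unit_share = [i / n for i in range(n + 1)]  # cumulative share of units
    install_share = [0.0]                       # cumulative share of installs
    running = 0.0
    for v in values:
        running += v
        install_share.append(running / total)
    return unit_share, install_share


def gini(installs):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    unit_share, install_share = lorenz_curve(installs)
    area = 0.0
    for i in range(1, len(unit_share)):
        width = unit_share[i] - unit_share[i - 1]
        # Trapezoid between two consecutive Lorenz points.
        area += width * (install_share[i] + install_share[i - 1]) / 2.0
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    evenly_spread = [5, 5, 5, 5]   # every unit has the same number of installs
    concentrated = [0, 0, 0, 4]    # all installs sit in a single unit
    print(gini(evenly_spread), gini(concentrated))
```

A value near 0 indicates installs spread evenly across units, while a value near 1 indicates that a few units hold most installs. Computing the same statistic at county, state, and national level is one way to compare inequality across scales.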
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2025.127343
Guotao Wang, Zhenjia Lin, Xiaoqing Zhong, Haoran Ji, Peng Li, Yuntian Chen, Jinyue Yan
Addressing uncertainties in power systems is critical for enhancing renewable energy integration and ensuring overall system reliability. Existing research typically treats forecasting and optimization as separate processes, without accounting for the impact of predictive accuracy on the robustness and quality of downstream energy management decisions. To fill this gap, this study proposes a novel framework, Artificial Intelligence for Robust Optimization (AIROpti), which tightly integrates a forecasting model with a subsequent data-driven robust optimization model. Two cases are examined: a distributed energy system comprising twenty households and a market-level energy system. Both analyses account for uncertainty in demand and electricity prices. Regarding predictive performance, the day-ahead load forecasting model attains a symmetric mean absolute percentage error (SMAPE) of 17.99 % for the aggregated demand across twenty households and 3.82 % at the market level, demonstrating high accuracy. However, our experiments also show that improved forecast accuracy does not necessarily translate into more robust downstream energy management. AIROpti yields a 25.94 % to 44.34 % reduction in regret relative to high-accuracy prediction models and a 3.99 % to 75.74 % reduction relative to traditional decision-focused learning baselines. Moreover, it enhances the robustness of operational strategies, reducing worst-case performance bounds by 13.77 % and 59.71 % across the two cases. These results highlight AIROpti's ability to produce forecasts that prioritize the robustness of energy management rather than merely achieving higher prediction accuracy.
Title: Decision-focused learning integrated with data-driven robust optimization for energy management
Journal: Applied Energy, Volume 408, Article 127343
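The two metrics named in the abstract above, SMAPE and regret, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the SMAPE variant (there are several in use, and the abstract does not say which one the authors adopt), the one-period procurement cost model, the prices, and the function names. It is not the AIROpti method; it only shows how a forecast with lower SMAPE can still lead to higher regret when costs are asymmetric.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent, using the 2|F-A| / (|A|+|F|) variant (assumed)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        if denom > 0:  # a term with both values zero contributes no error
            total += 2.0 * abs(f - a) / denom
    return 100.0 * total / len(actual)


def dispatch_cost(purchased, demand, day_ahead_price=1.0, real_time_price=5.0):
    """Hypothetical cost: buy `purchased` day-ahead, cover any shortfall in real time."""
    shortfall = max(demand - purchased, 0.0)
    return day_ahead_price * purchased + real_time_price * shortfall


def regret(forecast, demand):
    """Extra cost of acting on the forecast versus acting on the true demand."""
    return dispatch_cost(forecast, demand) - dispatch_cost(demand, demand)


if __name__ == "__main__":
    demand = 100.0
    for forecast in (95.0, 110.0):  # a small under-forecast vs a larger over-forecast
        print(forecast, round(smape([demand], [forecast]), 2), regret(forecast, demand))
```

In this toy setting the under-forecast of 95 has the lower SMAPE but the higher regret, because shortfalls are priced more steeply than surpluses. This is the kind of mismatch between accuracy and decision quality that decision-focused approaches aim to address.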
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127374
Shangyang He, Suhan Zhang, Yuanzheng Li, Wei Gu, Chi-yung Chung
Global decarbonization is driving integrated energy systems (IES) toward more efficient and low-carbon operation, with tighter coupling between natural gas systems (NGS) and electric power systems (EPS) to accommodate diverse renewable sources and energy carriers. The bidirectional energy flow problem in IES, which takes the form of partial differential-algebraic equations (PDAE), remains a challenge for model-based solvers due to privacy concerns and computational complexity: it may require full system parameters and tens of minutes to hours to obtain a feasible solution in large-scale systems. To address this obstacle, this study proposes a physics-informed neural operator for energy flow calculations in IES. A novel Differential-Algebraic Gas Flow Neural Operator (DAGFNO) embeds the physical constraints of the PDAE into the neural operator, which not only obtains accurate heterogeneous gas states but also provides a privacy-preserving interface for EPS analysis. In addition to DAGFNO, a novel Masked Differential and Algebraic Coupling Constraint Loss function (MDACloss) is developed to represent the degree of constraint violation, with a masking technique that enables parallel computation. MDACloss thereby drives the energy flow solutions obtained by DAGFNO to satisfy the constraints as closely as possible. Case studies on two coupled NGS-EPS IESs demonstrate the effectiveness of the proposed method.
Title: Physics-informed neural network for dynamic energy flow calculation in integrated electricity and gas systems
Journal: Applied Energy, Volume 407, Article 127374
Extreme events can trigger coupled failures across distribution, information, and traffic networks, thereby compromising the safe operation of distribution networks. Mobile energy storage systems (MESS) provide spatiotemporal flexibility in energy supply and operate in a complementary manner to fixed power sources. Unmanned aerial vehicles (UAVs) offer mobile communication capabilities and facilitate rapid communication recovery. Additionally, fault repair can achieve stepwise load restoration through dynamic network reconfiguration. However, the integrated scheduling of these emergency resources across interdependent systems remains an urgent research challenge. Therefore, this paper proposes a post-disaster load restoration strategy for distribution networks, incorporating multi-resource collaborative scheduling under cyber-physical-traffic coupled failure scenarios. First, a cyber-physical-traffic coupled failure architecture is established. Second, the impact of traffic network failures on vehicle routing is considered, and Dijkstra's algorithm is employed to compute travel times. Simultaneously, to address the reduction in situational awareness and control caused by information network failures, a Location Set Covering Problem (LSCP) algorithm is adopted to optimize UAV site selection. A collaborative scheduling model is subsequently developed, integrating ESS, GAS, MESS, UAVs, and fault repair. The model aims to minimize both load shedding and the cost of emergency resource scheduling, using a multi-timeframe optimization approach to dynamically determine MESS deployment locations and fault repair sequences. Finally, simulations based on a modified IEEE-33-bus distribution system demonstrate that the proposed strategy can effectively reduce post-disaster load losses and shorten load restoration times.
Title: Load restoration strategy for post-disaster distribution networks considering cyber-physical-traffic coupling with multi-resource collaboration
Authors: Ying Wang, Chunming Liu, Yulong Zhao, Xinyu Li, Yuhan Wu
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127373
Journal: Applied Energy, Volume 407, Article 127373
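The abstract above states that Dijkstra's algorithm is used to compute travel times over a traffic network with failures. A minimal sketch of that step follows, assuming the road network is given as an adjacency dictionary with travel times in minutes and that a failed road is modelled by simply omitting its edge. The node names, edge weights, and the function name `travel_times` are hypothetical; the paper's actual network model is not described in the abstract.

```python
import heapq


def travel_times(roads, source):
    """Shortest travel time from `source` to every reachable node (Dijkstra).

    roads: dict mapping node -> list of (neighbor, minutes) pairs.
    Unreachable nodes are absent from the returned dict.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        time_so_far, node = heapq.heappop(heap)
        if time_so_far > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, minutes in roads.get(node, ()):
            if minutes < 0:
                raise ValueError("Dijkstra requires non-negative travel times")
            candidate = time_so_far + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return best


if __name__ == "__main__":
    # Hypothetical network: a MESS depot and two load buses.
    intact = {
        "depot": [("bus_a", 4), ("bus_b", 10)],
        "bus_a": [("bus_b", 3)],
        "bus_b": [],
    }
    # Same network after the bus_a -> bus_b road is blocked.
    damaged = {
        "depot": [("bus_a", 4), ("bus_b", 10)],
        "bus_a": [],
        "bus_b": [],
    }
    print(travel_times(intact, "depot")["bus_b"])
    print(travel_times(damaged, "depot")["bus_b"])
```

Re-running the search on the damaged graph gives the longer detour time. A scheduling model could use such times as inputs when deciding where to send mobile storage or repair crews.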
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127392
Majid Mohsenpour, Yangang Xing
The operational phase of buildings represents a major share of global energy consumption, underscoring its importance in achieving sustainability goals. Reinforcement learning, with its ability to manage both continuous and discrete control tasks, performs well in enhancing the efficiency of building systems. Existing reviews on reinforcement learning applications in building energy systems primarily focus on heating, ventilation, and air conditioning systems and often overlook critical distinctions between system-centric and occupant-centric control strategies, as well as the role of data acquisition for training. To address these gaps, this study conducts a systematic review of reinforcement learning and hybrid reinforcement learning approaches to answer the question of how reinforcement learning methods improve the performance of heating, ventilation, and air conditioning systems, lighting systems, and window systems in terms of energy efficiency, thermal comfort, and indoor air quality. This study summarizes the states, actions, rewards, and performance of reinforcement learning methods. Through a critical analysis of more than seventy papers, this review distinguishes between system-centric and occupant-centric control models in terms of publication trends, design frameworks, and simulation and co-simulation tools. This review also goes beyond the simulation stage and investigates reinforcement learning challenges and methods, training strategies, and data-collection techniques for real-world deployment. In addition, this study proposes a novel co-adaptive reinforcement learning framework for further research on real-world deployment, considering occupants as the core of the design stage. Finally, this study identifies and discusses ten future research directions, outlining current limitations and opportunities for advancing reinforcement learning in building system control.
Title: Hybrid Reinforcement Learning for occupant-centric building control: A review and deployment framework for co-optimizing energy, comfort, and indoor air quality
Journal: Applied Energy, Volume 408, Article 127392
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127363
Yingjun Wu, Junyu Feng, Xuejie Chen, Yujian Ye, Zhiwei Lin, Jiangfan Yuan, Xueyan He, Zhengxi Yin, Jiayan Lu
Extreme weather events increasingly challenge the operational resilience of distribution systems by introducing dynamic and uncertain security limits (SLs), alongside data sparsity. Traditional model-based approaches often rely on static assumptions and require complete system modeling, making them difficult to adapt to rapidly evolving weather-induced constraints. To address these limitations, this paper proposes a model-free resilience enhancement framework based on deep reinforcement learning (DRL), integrating real-time weather-aware SL identification and adaptive dispatch. First, an ensemble Bagging-XGBoost model is developed to classify weather severity levels and determine whether static or dynamic SLs should be applied, enabling scenario-adaptive SL switching. Second, a hybrid convolutional neural network–gated recurrent unit (CNN-GRU) model, enhanced by transfer learning, is designed to accurately estimate dynamic SLs under varying weather conditions. The CNN captures spatial meteorological patterns, while the GRU models temporal evolution; transfer learning improves generalization under limited training data. Third, the dispatch problem is formulated as a constrained Markov decision process (CMDP), and solved using a primal–dual deep deterministic policy gradient (PD-DDPG) algorithm that explicitly incorporates SL constraints into the policy learning process. An attention-based meteorological data reconstruction model is further integrated to enhance the quality of input data and training efficiency. Case studies on the improved IEEE-123 test feeder demonstrate that the proposed method reduces average load loss by 23.30 % and 12.10 % compared to CNN-only and GRU-only baselines, respectively. Moreover, it achieves an 88.77 % improvement in computational efficiency over conventional model-based resilience strategies, highlighting its robustness and applicability under limited data and high-impact weather conditions.
Title: Enhancing power grid resilience through weather-aware security constraints: A deep reinforcement learning approach with hybrid CNN-GRU architecture. Applied Energy, vol. 407, Article 127363.
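The primal–dual idea behind the CMDP formulation in this abstract can be illustrated on a toy problem. The sketch below is not the authors' PD-DDPG algorithm: it replaces the actor-critic networks with a single scalar parameter `theta`, and the reward `r(theta) = -(theta - 2)^2`, the constraint `theta <= limit`, the step sizes, and the function name `primal_dual_toy` are all assumptions chosen for illustration. It only shows how a Lagrange multiplier is raised while a constraint is violated and how the primal variable responds.

```python
def primal_dual_toy(steps=2000, lr_theta=0.05, lr_lambda=0.05, limit=1.0):
    """Toy primal-dual iteration: maximise r(theta) subject to theta <= limit.

    Hypothetical stand-in for a constrained policy update; theta plays the
    role of the policy parameters and lam the role of the dual variable.
    """
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: gradient ascent on the Lagrangian
        # L(theta, lam) = -(theta - 2)^2 - lam * (theta - limit).
        grad_theta = -2.0 * (theta - 2.0) - lam
        theta += lr_theta * grad_theta
        # Dual step: increase lam while the constraint is violated,
        # then project back onto lam >= 0.
        lam = max(0.0, lam + lr_lambda * (theta - limit))
    return theta, lam


theta_star, lam_star = primal_dual_toy()
print(f"theta = {theta_star:.4f}, lambda = {lam_star:.4f}")
```

In this toy setting the unconstrained optimum (`theta = 2`) violates the limit, so the iteration settles on the constraint boundary with a positive multiplier. In a DRL setting the same two-step pattern would be applied to network parameters and an estimated constraint cost.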
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127355
Qingshu Guan , Hui Cao , Tiansen Niu , Lixin Jia , Dapeng Yan , Badong Chen
Electric vehicle routing problems (EVRPs) have attracted growing interest in the pursuit of sustainable transportation, driven by the environmental benefits and energy efficiency of electric vehicles (EVs). Nevertheless, mainstream approaches predominantly focus on minimizing travel distance rather than propulsion energy and overlook key logistical constraints such as time windows and pickup-delivery demands, which are critical in modern express operations. To address these limitations, we investigate an energy-optimal EVRP with pickup-delivery and time windows (EVRP-PDTW) and develop a high-resolution energy consumption model that integrates time-dependent driving dynamics, detailed road information, and battery charging behavior. Building on this foundation, we cast the routing task as a Markov decision process and propose a Heterogeneous Attention-driven Deep Reinforcement Learning (HA-DRL) framework. The encoder leverages a heterogeneous attention mechanism to capture role-specific interactions among depots, customers, and charging stations, while the decoder incorporates a dynamic-aware context embedding to capture state transitions and temporal feasibility. We analyze how these design choices structure the decision space and align the learned policy with the underlying energy model, thereby explaining the observed energy savings. Experiments on synthetic and real-world datasets show that HA-DRL outperforms a suite of heuristic and DRL-based methods by a clear margin, reducing average energy consumption by 51.30 kWh (a 6.43% improvement in optimality gap) over the competitive NCS approach in large-scale scenarios involving 200 customers and 40 charging stations. These achievements underscore the promise of HA-DRL in advancing energy-aware routing solutions for real-world EV logistics systems.
Title: Heterogeneous attention-driven deep reinforcement learning for solving EVRPs with pickup-delivery and time windows. Applied Energy, vol. 407, Article 127355.
Pub Date: 2026-01-12 DOI: 10.1016/j.apenergy.2026.127389
M. Ferrara , F. Mottola , D. Proto , A. Ricca , M. Valenti
This paper addresses the integration of green hydrogen production and electrical energy storage in renewable energy communities. An optimal approach is proposed for sizing an electrolyzer and a battery energy storage system within a community that includes photovoltaic generation and loads. The method tackles key planning challenges by handling uncertainty through decision theory techniques. Multiple scenarios are defined based on variations in photovoltaic generation, load demand, and electricity price profiles to capture a wide range of operating conditions. The proposed planning model includes scheduling strategies aimed at facilitating the integration of the distributed resources by coordinating their power flows, with the dual objectives of maximizing green hydrogen production and enhancing energy sharing within the community. The scheduling is solved through mixed-integer linear programming, whose combination with decision theory reduces computational effort while still covering a range of scenarios through their probabilities of occurrence and distinct characteristics. To evaluate the impact of the resource contributions under the community’s self-consumption incentive policy, the paper formulates shared energy models for three distinct system configurations, each adapted from a general framework to the specific characteristics of the respective configuration. The results of numerical applications provide evidence of the effectiveness of the proposed procedure and present an analysis of economic viability. The analysis shows that, as the hydrogen selling price increases, the optimal planning procedure leads to increased hydrogen production, which in turn boosts the net economic benefit. The proposed approach provides a flexible decision-support tool for planners and policymakers, enabling tailored insights into optimal system design based on the specific objectives and available information.
Title: Integrating green hydrogen production and electrical energy storage in energy communities under uncertainty. Applied Energy, vol. 407, Article 127389.
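As a rough illustration of how decision theory can rank candidate sizings across scenarios, the sketch below applies two standard criteria to a small cost table. The abstract mentions scenario probabilities but does not name the criteria used, so the expected-cost and minimax-regret rules here are assumptions, as are the cost values, the probabilities, and the function names. In practice each table entry would come from solving the scheduling problem for one design in one scenario.

```python
def expected_cost_choice(costs, probs):
    """Return (index, value) of the design with the lowest probability-weighted cost."""
    expected = [sum(p * c for p, c in zip(probs, row)) for row in costs]
    best = min(range(len(costs)), key=lambda i: expected[i])
    return best, expected[best]


def minimax_regret_choice(costs):
    """Return (index, value) of the design whose worst-case regret is smallest."""
    n_scenarios = len(costs[0])
    # Best achievable cost in each scenario, over all designs.
    best_per_scenario = [min(row[s] for row in costs) for s in range(n_scenarios)]
    # Regret = extra cost paid relative to the best design for that scenario.
    max_regret = [
        max(row[s] - best_per_scenario[s] for s in range(n_scenarios))
        for row in costs
    ]
    best = min(range(len(costs)), key=lambda i: max_regret[i])
    return best, max_regret[best]


# Hypothetical annual costs: rows are candidate sizings, columns are scenarios.
costs = [
    [10.0, 14.0, 20.0],  # design 0: small electrolyzer and battery
    [12.0, 12.0, 16.0],  # design 1: medium sizing
    [15.0, 13.0, 13.0],  # design 2: large sizing
]
probs = [0.5, 0.3, 0.2]  # hypothetical scenario probabilities

ev_design, ev_cost = expected_cost_choice(costs, probs)
mr_design, mr_regret = minimax_regret_choice(costs)
print(f"expected-cost choice: design {ev_design} (cost {ev_cost:.2f})")
print(f"minimax-regret choice: design {mr_design} (max regret {mr_regret:.2f})")
```

With these illustrative numbers both criteria select the medium sizing. The two rules can disagree when one design is cheap on average but very costly in a rare scenario.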