Pub Date: 2026-03-01 | Epub Date: 2025-12-13 | DOI: 10.1016/j.compchemeng.2025.109528
Yi Liu, Bingbing Shen, David Shan-Hill Wong, Mingwei Jia, Yuan Yao
Modern processes rely on thousands of sensors, yet operators still lack trustworthy tools for measurement-driven fault detection and diagnosis. Deep learning excels at capturing nonlinearity and dynamics, but its opaque decision-making limits its use in safety-critical processes. This survey fills that gap by introducing a measurement-oriented taxonomy of explainable neural networks (XNNs) and explainable graph neural networks (XGNNs) from the viewpoint of instrumentation and measurement. XNNs are posited as variable-centered instruments that assign calibrated importance scores to individual sensors for different faults. XGNNs are framed as topology-centric instruments, allowing direct measurement of interaction strength and causal propagation among units and control loops. This review delivers step-by-step guidelines that convert historical data into explainable detectors, trackers, and diagnostic meters. A comparison highlights when an XNN suffices and when an XGNN is mandatory, giving instrumentation engineers a decision chart. Unlike prior surveys, we show that graphs are the faithful way to integrate P&IDs, material balances, and causal knowledge into deep learning measurements: XGNN explanations map directly onto process diagrams, creating on-screen instruments that display both the alarm and its physical trail. Finally, the survey identifies open challenges and recommends future directions for industrial deployment.
{"title":"Explainable Neural Network meets Graph Neural Network: Recent advances in process fault detection and diagnosis","authors":"Yi Liu, Bingbing Shen, David Shan-Hill Wong, Mingwei Jia, Yuan Yao","doi":"10.1016/j.compchemeng.2025.109528","DOIUrl":"10.1016/j.compchemeng.2025.109528","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109528"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.compchemeng.2025.109519
Edward Hendrik Bras, Tobias Muller Louw, Steven Martin Bradshaw
Reinforcement learning (RL) is a data-driven optimal control technique that has seen limited adoption in the process industries owing to high operational data requirements and the need to balance safe control with exploration. In contrast, Model Predictive Control (MPC) has been established as the benchmark method for optimal control in industrial applications but relies on a dynamic model of the controlled process. In this work, MPC is used as a starting point for online, continuing actor-critic RL applied to the simulated quadruple tank benchmark. Optimal control actions were precomputed as a function of the state space using the plant model, and a neural network was fitted to these data to generate an explicit MPC policy. Subsequently, this policy was adapted during closed-loop interaction of the RL agent with the (simulated) true plant, which exhibited different dynamics to the nominal plant model. The RL controller resolved the effects of plant-model mismatch on closed-loop control performance in the minimum-phase operating region by finding the optimal policy incrementally. In the non-minimum phase operating region, inverse response prevented the RL agent from operating effectively. From multivariable zero analysis, it was shown that process zeros very close to the origin or in the right half plane introduce a significant risk of closed-loop instability under RL control. This work’s findings show both the potential and limitations of RL by adapting precomputed optimal control policies and optimal cost functions (value functions) developed using non-linear MPC.
{"title":"Robust control by applying reinforcement learning to adapt explicit model predictive control policies","authors":"Edward Hendrik Bras, Tobias Muller Louw, Steven Martin Bradshaw","doi":"10.1016/j.compchemeng.2025.109519","DOIUrl":"10.1016/j.compchemeng.2025.109519","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109519"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.compchemeng.2025.109520
Yee Hung Hong, Jinglin Wang, Chao Peng, Jinsong Zhao
At LNG receiving terminals, the daily send-out target fluctuates with seawater temperature, the accumulated runtime of parallel units, and planned start-up or shutdown sequences. Operators must still decide which pumps or vaporizers to activate under time pressure and incomplete information. In practice, these choices often rely on experience and short-term intuition rather than systematic evaluation, which can lead to uneven runtime distribution, maintenance bottlenecks, and unnecessary energy consumption. This study asks whether such human decision gaps can be reduced within the actual physical and organizational constraints of a working terminal. We develop RALT-DT, a risk-aware learning and control digital twin that integrates deep reinforcement learning with process-level physical models. The “risk-aware” feature is embodied in three aspects: (1) all policy decisions are constrained by explicit mass and heat-balance equations and safety interlocks, ensuring that the control actions remain within certified operating envelopes; (2) the learning reward explicitly penalizes excessive switching, uneven runtime dispersion, and deviations from preventive-maintenance requirements, treating long-term mechanical wear and operational stability as quantifiable risks; and (3) the system continuously monitors plant–model mismatch and adapts its confidence weighting, so that recommendations are moderated when uncertainty grows. To make the solution practical, the plant’s many operating devices are grouped into four core classes—low-pressure (LP) pumps, high-pressure (HP) pumps, open-rack vaporizers (ORV), and submerged-combustion vaporizers (SCV). A two-time-scale roster generator translates continuous policy outputs into binary start–stop schedules that meet maintenance lock-out and switch-inertia constraints. The resulting framework forms a closed learning loop that is both deterministic and interpretable. 
A one-month on-site shadow test was carried out, in which the algorithm’s decisions were compared with real operator schedules under live conditions. The digital twin achieved an average electrical energy reduction of 9.4 %, equivalent to a saving of about 754 MWh, without violating throughput or switching limits. When calibrated against plant telemetry and vendor performance curves, the model maintained consistent accuracy across varying load and temperature conditions. These results indicate that a physics-grounded and risk-aware learning framework can systematically enhance human decision quality in large-scale terminal operations. It converts intuition-driven scheduling into a reproducible and auditable policy that improves both electrical energy efficiency and asset reliability.
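As a rough consistency check on the two figures quoted above, the short sketch below back-calculates the monthly consumption they imply. It assumes (this is not stated in the abstract) that the 9.4 % average reduction and the 754 MWh saving refer to the same one-month operator baseline; the variable names are illustrative only.

```python
# Figures quoted in the abstract. Treating both as referring to the same
# one-month operator baseline is an assumption of this sketch.
reduction_fraction = 0.094   # 9.4 % average electrical energy reduction
energy_saved_mwh = 754.0     # reported saving over the shadow test

# If saving = fraction * baseline, then baseline = saving / fraction.
implied_baseline_mwh = energy_saved_mwh / reduction_fraction
implied_twin_mwh = implied_baseline_mwh - energy_saved_mwh

print(f"implied operator-schedule consumption: {implied_baseline_mwh:.0f} MWh")
print(f"implied digital-twin consumption:      {implied_twin_mwh:.0f} MWh")
```

Under that assumption the operator schedules would have consumed roughly 8,000 MWh over the month; the abstract itself does not report this number.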
{"title":"A risk-aware LNG terminal scheduling digital twin based on deep reinforcement learning","authors":"Yee Hung Hong, Jinglin Wang, Chao Peng, Jinsong Zhao","doi":"10.1016/j.compchemeng.2025.109520","DOIUrl":"10.1016/j.compchemeng.2025.109520","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109520"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fire and explosion accidents pose significant risks in the chemical and process industries. This study develops an integrated modeling framework to assess the vulnerability of oil storage tanks exposed to simultaneous fire and explosion hazards. The methodology integrates fault tree analysis for identifying contributing factors to simultaneous fire and explosion hazards, a hybrid Fuzzy Cognitive Maps-Bayesian Networks (FCM-BN) approach to quantify probabilistic relationships, graph theory metrics for criticality analysis, and Petri net simulations to model domino effect propagation. The FCM-BN model identifies three key contributors to domino effects, including domino effects from adjacent equipment, unprotected electrical equipment, and hot work and maintenance activities near leaks or flammable vapors. Graph theory analysis identifies Tank No. 4 as the most critical unit based on centrality metrics ("betweenness" and "closeness"), while Petri net simulations show that adjacent tanks, particularly Tanks No. 8 and No. 5, are highly vulnerable to explosion impacts. The framework provides both theoretical insights into domino effect mechanisms and practical tools for risk management, enabling targeted safety interventions and optimal resource allocation in storage facilities. These findings establish a foundation for preventing domino accidents through evidence-based vulnerability assessment and dynamic propagation modeling.
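The centrality metrics named above can be illustrated with a minimal sketch. The tank adjacency below is hypothetical (it is not the layout analysed in the study) and was chosen so that one hub tank stands out; closeness is computed from breadth-first distances and betweenness with Brandes' algorithm for unweighted graphs.

```python
from collections import deque

# Hypothetical undirected adjacency between tanks; NOT the study's layout.
ADJACENCY = {
    1: [2, 4],
    2: [1, 4],
    3: [4],
    4: [1, 2, 3, 5],
    5: [4, 6],
    6: [5],
}

def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closeness(graph):
    """(n - 1) / sum of distances; assumes a connected graph."""
    n = len(graph)
    return {v: (n - 1) / sum(bfs_distances(graph, v).values()) for v in graph}

def betweenness(graph):
    """Unnormalised betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies back to s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}

bc = betweenness(ADJACENCY)
cc = closeness(ADJACENCY)
most_critical = max(bc, key=bc.get)
print("highest betweenness:", most_critical, "closeness:", round(cc[most_critical], 3))
```

In this toy graph the hub tank ranks first on both metrics because every path between the two sides of the layout passes through it, which is the kind of structural criticality the graph-theory step is meant to expose.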
{"title":"A Petri net-based approach for modeling and analyzing the vulnerability of storage tanks under simultaneous fire and explosion hazards","authors":"Zahra Khodabakhsh, Khadijeh Mostafaee Dolatabad, Matin Aleahmad, Leila Omidi","doi":"10.1016/j.compchemeng.2025.109501","DOIUrl":"10.1016/j.compchemeng.2025.109501","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109501"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145691664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-19 | DOI: 10.1016/j.compchemeng.2025.109533
Norfamila Che Mat, Edelin Emirra Empir, Kasih Syazwina Qistina Syaheezam, Muhammad Abdul Qyyum
Hybrid membrane–cryogenic nitrogen rejection units (NRUs) are increasingly proposed for upgrading sub-quality natural gas (NG), yet the mechanistic basis for their performance benefits remains insufficiently understood. This work develops a surrogate-assisted multi-objective optimization framework that integrates perturbation-expansion membrane modelling with Super Learner ensemble surrogates and analytical Bayesian uncertainty quantification to enable computationally efficient hybrid system design under data-sparse conditions. The optimization explores a 14-dimensional design space, yielding Pareto-optimal trade-offs with CH₄ recovery spanning 0.80–0.97 and specific power consumption (SPC) of 0.32–0.52 kWh/kg CH₄. Permutation-based sensitivity analysis identifies functional decoupling between system domains: membrane parameters predominantly govern SPC, while cryogenic column variables control CH₄ recovery, with minimal cross-influence. These results challenge the common assumption that membrane pre-concentration directly reduces energy demand. Instead, optimal hybrid configurations achieve >93% recovery at 0.49–0.51 kWh/kg CH₄—comparable to standalone cryogenic performance—demonstrating that hybrid value arises primarily from operational flexibility, enabling independent manipulation of product quality and energy consumption. Analytical Bayesian inference achieves 24–30% reduction in predictive uncertainty with cross-validation consistency <0.06. The framework produces performance–confidence trade-off maps across the design space, supporting systematic selection of operating strategies based on specified confidence thresholds. By reducing optimization time from 23 to 5 h while quantifying prediction reliability, the proposed approach offers an alternative to empirical tuning practices and provides clearer visibility into performance–flexibility interactions for hybrid NRU design.
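Permutation-based sensitivity analysis, as mentioned above, can be sketched in a few lines: shuffle one input column at a time and record how much the prediction error grows. The `model` function below is a hypothetical stand-in for a trained surrogate (it is not the Super Learner ensemble from the paper), and the three inputs are synthetic.

```python
import random

def model(row):
    # Hypothetical stand-in for a trained surrogate: strong dependence on
    # x0, weak dependence on x1, no dependence on x2.
    return 3.0 * row[0] + 0.5 * row[1]

def mse(rows, targets):
    """Mean squared error of the model over a dataset."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, n_features, rng):
    """Increase in MSE when each feature column is shuffled independently."""
    baseline = mse(rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        permuted = [r[:j] + [column[i]] + r[j + 1:] for i, r in enumerate(rows)]
        importances.append(mse(permuted, targets) - baseline)
    return importances

rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]

scores = permutation_importance(rows, targets, 3, rng)
for j, s in enumerate(scores):
    print(f"x{j}: importance = {s:.3f}")
```

A feature the model ignores gets an importance of exactly zero here, which mirrors the "minimal cross-influence" pattern the abstract reports between membrane and cryogenic variables, although the real analysis runs on the paper's own surrogates and 14 design variables.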
{"title":"Uncertainty-aware design optimization of hybrid membrane–cryogenic nitrogen rejection units (NRU): decoupling energy and product quality","authors":"Norfamila Che Mat, Edelin Emirra Empir, Kasih Syazwina Qistina Syaheezam, Muhammad Abdul Qyyum","doi":"10.1016/j.compchemeng.2025.109533","DOIUrl":"10.1016/j.compchemeng.2025.109533","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109533"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145836368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-12 | DOI: 10.1016/j.compchemeng.2025.109523
Martin Bubel, Tobias Seidel, Michael Bortz
Surrogate modeling is a powerful methodology in chemical process engineering, frequently employed to accelerate optimization tasks. Despite their popularity, most surrogate models are trained for a narrow range of fixed chemical systems and operating conditions, which limits their reusability. This work introduces a paradigm shift towards reusable surrogates by developing a single model for distillation columns that generalizes across a vast design space. The key enabler is a novel ML-fueled modelfluid representation which allows for the generation of datasets of more than 1,000,000 samples. This allows the surrogate to generalize not only over column specifications but also over the entire chemical space of homogeneous ternary vapor–liquid mixtures. We validate the model’s accuracy and demonstrate its practical utility in a case study on entrainer distillation, where it successfully screens and ranks candidate entrainers, significantly reducing the computational effort compared to rigorous optimization.
{"title":"Reusable surrogate models for distillation columns","authors":"Martin Bubel, Tobias Seidel, Michael Bortz","doi":"10.1016/j.compchemeng.2025.109523","DOIUrl":"10.1016/j.compchemeng.2025.109523","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109523"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145797283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-24 | DOI: 10.1016/j.compchemeng.2025.109535
Alex Durkin, Jasper Stolte, Matthew Jones, Raghuraman Pitchumani, Bei Li, Christian Michler, Mehmet Mercangöz
Offline reinforcement learning (offline RL) offers a promising framework for developing control strategies in chemical process systems using historical data, without the risks or costs of online experimentation. This work investigates the application of offline RL to the safe and efficient control of an exothermic polymerisation continuous stirred-tank reactor. We introduce a Gymnasium-compatible simulation environment that captures the reactor’s nonlinear dynamics, including reaction kinetics, energy balances, and operational constraints. The environment supports three industrially relevant scenarios: startup, grade change down, and grade change up. It also includes reproducible offline datasets generated from proportional–integral controllers with randomised tunings, providing a benchmark for evaluating offline RL algorithms in realistic process control tasks.
We assess behaviour cloning and implicit Q-learning as baseline algorithms, highlighting the challenges offline agents face, including steady-state offsets and degraded performance near setpoints. To address these issues, we propose a novel deployment-time safety layer that performs gradient-based action correction using partially input convex neural networks (PICNNs) as learned cost models. The PICNN enables real-time, differentiable correction of policy actions by descending a convex, state-conditioned cost surface, without requiring retraining or environment interaction.
Experimental results show that offline RL, particularly when combined with convex action correction, can outperform traditional control approaches and maintain stability across all scenarios. These findings demonstrate the feasibility of integrating offline RL with interpretable and safety-aware corrections for high-stakes chemical process control, and lay the groundwork for more reliable data-driven automation in industrial systems.
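The deployment-time correction described above can be sketched as plain gradient descent on a cost that is convex in the action for a given state. In the sketch below a hand-written quadratic replaces the trained PICNN, so `cost`, `cost_grad_action`, the step size, and the action bounds are all illustrative assumptions rather than the authors' implementation.

```python
def cost(state, action):
    # Hypothetical cost, convex in the action for any fixed state.
    # A trained PICNN would take the place of this closed-form expression.
    target = 0.5 * state
    return (action - target) ** 2 + 0.1 * action ** 2

def cost_grad_action(state, action):
    # Analytic gradient of the stand-in cost with respect to the action.
    # With a neural cost model this would come from automatic differentiation.
    target = 0.5 * state
    return 2.0 * (action - target) + 0.2 * action

def correct_action(state, action, step=0.1, n_steps=50, low=-1.0, high=1.0):
    """Descend the state-conditioned cost surface, keeping the action in bounds."""
    for _ in range(n_steps):
        action -= step * cost_grad_action(state, action)
        action = min(max(action, low), high)  # clip to the assumed actuator range
    return action

state = 1.0
raw_action = 0.9                      # action proposed by the offline policy
safe_action = correct_action(state, raw_action)

print(f"raw action  {raw_action:.3f} -> cost {cost(state, raw_action):.4f}")
print(f"corrected   {safe_action:.3f} -> cost {cost(state, safe_action):.4f}")
```

Because the cost is convex in the action, the descent cannot get trapped in a spurious local minimum, which is the property that makes a correction of this kind attractive at deployment time; the policy itself is left untouched.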
{"title":"Safe deployment of offline reinforcement learning via input convex action correction","authors":"Alex Durkin, Jasper Stolte, Matthew Jones, Raghuraman Pitchumani, Bei Li, Christian Michler, Mehmet Mercangöz","doi":"10.1016/j.compchemeng.2025.109535","DOIUrl":"10.1016/j.compchemeng.2025.109535","url":null,"PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"206 ","pages":"Article 109535"},"PeriodicalIF":3.9,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145880235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Engineering Technology","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-01 | Epub Date: 2025-12-11 | DOI: 10.1016/j.compchemeng.2025.109525
Etienne Ayotte-Sauvé, Robert Yandon, Philippe Navarri, Robert Symonds, Robin Hughes, Marzieh Shokrollahi, Rebecca Modler
To reach decarbonization objectives across the globe, scenario modelling studies indicate that carbon capture, utilization and storage (CCUS) will be essential. Planning CCUS deployment at the regional and national levels requires balancing complex trade-offs, such as when to phase in infrastructure, where and how much CO2 to capture, transport, utilize and store, as well as which policy measures to adopt.
We introduce a new multiperiod mixed-integer linear programming (MILP) model for the strategic planning of CCUS, with CO2 transportation via pipelines, ships, trains and trucks. This model features the selection of capture units and their rates, geographically explicit pipeline network design (including reuse), vehicle fleet estimations, auxiliary transshipment processes, buffer storage, inland reservoirs, and injection via ships in addition to offshore pipelines. To handle large-scale case studies, a simple heuristic algorithm is presented.
The features of the proposed approach are demonstrated on an illustrative Eastern Canada case study involving pipelines, trains and ships. The influence of the temporal resolution chosen by the modeller on the quality of cost estimates and on calculation times is quantified. Compared to leading commercial algorithms, for large model instances the proposed heuristic is shown to produce better results (e.g. 10 % lower cost) with much less computation time (minutes instead of days). This widens the scope of potential use cases, including broader geographical regions and larger sensitivity studies. The annual evolution of an Eastern Canada CCUS value chain is analyzed, paying special attention to the interplay between CO2 storage reservoir and transport constraints.
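To make the "when to phase in infrastructure" trade-off concrete, here is a toy multiperiod mixed-integer planning problem with one binary pipeline build decision per period and continuous pipeline and truck flows. It is only a sketch: every number and name is hypothetical, and the problem is solved by brute-force enumeration of the binaries rather than by the authors' MILP model or heuristic.

```python
from itertools import product

# All data below are made up for illustration.
DEMAND = [1.0, 3.0]   # Mt of CO2 to transport in each period
PIPE_FIXED = 50.0     # one-off pipeline construction cost, M$
PIPE_CAP = 2.5        # pipeline capacity, Mt per period
PIPE_VAR = 5.0        # pipeline transport cost, M$ per Mt
TRUCK_VAR = 30.0      # truck transport cost, M$ per Mt


def plan_cost(build):
    """Total cost for a tuple of 0/1 build decisions, one per period.

    Returns None when the plan is infeasible (pipeline built more than once).
    """
    if sum(build) > 1:
        return None
    total, available = 0.0, False
    for t, demand in enumerate(DEMAND):
        if build[t]:
            total += PIPE_FIXED
            available = True  # once built, the pipeline stays available
        # Continuous flows: filling the pipeline first is optimal here
        # because PIPE_VAR < TRUCK_VAR; trucks carry the remainder.
        by_pipe = min(demand, PIPE_CAP) if available else 0.0
        total += PIPE_VAR * by_pipe + TRUCK_VAR * (demand - by_pipe)
    return total


def best_plan():
    """Enumerate every binary build schedule and return (cost, schedule)."""
    candidates = [(plan_cost(b), b) for b in product((0, 1), repeat=len(DEMAND))]
    return min((c, b) for c, b in candidates if c is not None)


print(best_plan())  # → (82.5, (1, 0))
```

Enumeration scales exponentially with the number of binary decisions, which is why realistic instances need a MILP solver or, as in the paper, a dedicated heuristic.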
Title: Multiperiod strategic planning of CO2 capture, transport, utilization and storage with multimodal transportation
Journal: Computers & Chemical Engineering, vol. 206, Article 109525 (impact factor 3.9)
Pub Date: 2026-03-01 Epub Date: 2025-12-11 DOI: 10.1016/j.compchemeng.2025.109524
Roberto Cifuentes García, Evan D. Erickson, Mariano Martín, Víctor M. Zavala
A plastic waste upcycling value chain model has been applied to assess the potential of processing packaging waste in Spain using thermochemical technologies to produce low-density polyethylene (LDPE) and polypropylene (PP), both highly valuable materials. The model projects an annual profit of 120.6 M$/yr, with a capital investment of 789.3 M$, generating 3285 jobs and contributing 65.5 M$/yr to Spain’s economy. The achieved circularity rate of the waste processing infrastructure exceeds 40 %, incorporating recycled HDPE and PET. Despite these advantages, regulatory gaps and market hesitancy toward recycled materials due to quality concerns hinder adoption. Additionally, economies of scale remain underutilized in Spain because plastic waste collection levels are lower than in countries such as the United States. This network, while less profitable, is environmentally superior, yielding upcycled products with a Global Warming Potential 20–35 % lower than their virgin, fossil-fuel counterparts, confirming this as a viable and sustainable alternative.
Title: Evaluating the potential of plastic waste upcycling using thermochemical technologies: A case study in Spain
Journal: Computers & Chemical Engineering, vol. 206, Article 109524 (impact factor 3.9)
Pub Date: 2026-03-01 Epub Date: 2025-11-27 DOI: 10.1016/j.compchemeng.2025.109500
Martin F. Luna, Federico M. Mione, Ernesto C. Martinez, M. Nicolas Cruz Bournazou
For efficiency and reproducibility, modern biotech laboratories increasingly rely on robotic platforms to perform complex, dynamic experiments that generate informative data for bioprocess development and knowledge discovery. Furthering the goal of self-driving biolabs requires automating cognitively demanding tasks, such as redesigning an experiment online to maximize information gain in the face of different sources of uncertainty, including microorganism behavior, hardware failure, and noisy measurements. In this work, a reinforcement learning (RL) based formulation of the online experimental redesign problem for dynamic experiments is proposed. Simulation-based learning of a redesign policy for parallel cultivations in a high-throughput platform is discussed to provide implementation details regarding the RL agent design (perceptions and actions) and the reward function used. The simulated environment and a training workflow for sequential information control are integrated with the Proximal Policy Optimization (PPO) algorithm to learn how to modify an offline design on the fly, based solely on previous observations and actions in a given experiment. The results demonstrate the feasibility of using deep RL to guarantee the quality of the generated data and to increase the level of automation that can be used on high-throughput platforms.
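The abstract names PPO without describing its training loss. For orientation, the sketch below computes the clipped surrogate objective from the standard PPO formulation for a small batch. The function name, the sample numbers, and the clipping parameter `eps=0.2` are illustrative assumptions and say nothing about the settings used in the paper.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective over a batch.

    ratios     -- new-policy / old-policy probability of each taken action
    advantages -- advantage estimate for each taken action
    eps        -- clipping range (assumed value)
    """
    terms = []
    for ratio, adv in zip(ratios, advantages):
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # Taking the minimum gives a pessimistic bound: the policy gets no
        # extra credit for moving the ratio outside [1 - eps, 1 + eps].
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Hypothetical batch: one action made more likely, one made less likely.
value = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
print(round(value, 6))  # → 0.2
```

In an experimental-redesign setting, each "action" would be a modification of the running experiment and the advantage would be derived from the information-based reward, but those pieces depend on details the abstract does not give.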
Title: Online redesign of dynamic experiments for high-throughput bioprocess development using deep reinforcement learning
Journal: Computers & Chemical Engineering, vol. 206, Article 109500 (impact factor 3.9)