Model-free adjustment of reducing agent for SCR device under label deficiency: Regulation-oriented stage-wise reward deep Q-learning with transfer-learned state
Han Jiang, Shucai Zhang, Jingru Liu, Xin Peng, Weimin Zhong
Journal: Process Safety and Environmental Protection (impact factor 6.9; JCR Q1, Engineering, Chemical). Published: 2025-01-03. DOI: 10.1016/j.psep.2024.12.126 (https://doi.org/10.1016/j.psep.2024.12.126)
Abstract
Data-driven methods for nitrogen oxides (NOx) soft sensing and selective catalytic reduction (SCR) operation in the fluid catalytic cracking (FCC) process face two issues. First, labeled data may be insufficient to train a prediction model because monitoring devices are lacking. Second, operational data cannot be used directly to train a reinforcement learning model. To address these issues, a latent temporal feature adaptation transfer learning and long-short reward deep Q-learning network (LTFATL-LSRDQN) is proposed. It transfers knowledge from another, similar FCC process to realize soft sensing of NOx. A maximum mean discrepancy loss is added to the objective function of an autoencoder (AE) to unify the probability distributions of the transformed latent features. The operation of the treatment device is abstracted as a Markov decision process. A long- and short-term reward mechanism is introduced into the DQN to constrain action selection. The effectiveness of LTFATL-LSRDQN is verified with data from industrial FCC processes. Domain adaptation successfully aligns the latent features and achieves higher soft-sensing accuracy than some state-of-the-art methods. Using the results from LTFATL as inputs, LSRDQN sustains continuous operation of the SCR device for longer while complying with regulations and action constraints.
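As an illustration of the maximum mean discrepancy term mentioned in the abstract, the following is a minimal sketch of how such a penalty could be combined with an autoencoder reconstruction error. The abstract does not state the kernel, the estimator, or the weighting, so the RBF kernel, the biased estimator, and the names `rbf_kernel`, `mmd_squared`, `adaptation_objective`, `bandwidth`, and `weight` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of latent features."""
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


def adaptation_objective(recon_error, source_latent, target_latent, weight=1.0):
    """Reconstruction error plus a weighted MMD penalty (weight is an assumption)."""
    return recon_error + weight * mmd_squared(source_latent, target_latent)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source_latent = rng.normal(0.0, 1.0, size=(50, 4))  # stand-in for source-plant features
    target_latent = source_latent + 3.0                 # deliberately shifted target features
    print("MMD^2, identical sets:", mmd_squared(source_latent, source_latent))
    print("MMD^2, shifted sets:  ", mmd_squared(source_latent, target_latent))
```

In this sketch the penalty is zero when the two latent feature sets coincide and grows as their distributions drift apart, which is the behavior a domain-adaptation term needs in order to pull source and target features toward a common distribution.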
About the journal:
The Process Safety and Environmental Protection (PSEP) journal is a leading international publication that focuses on the publication of high-quality, original research papers in the field of engineering, specifically those related to the safety of industrial processes and environmental protection. The journal encourages submissions that present new developments in safety and environmental aspects, particularly those that show how research findings can be applied in process engineering design and practice.
PSEP is particularly interested in research that brings fresh perspectives to established engineering principles, identifies unsolved problems, or suggests directions for future research. The journal also values contributions that push the boundaries of traditional engineering and welcomes multidisciplinary papers.
PSEP's articles are abstracted and indexed by a range of databases and services, which helps to ensure that the journal's research is accessible and recognized in the academic and professional communities. These databases include ANTE, Chemical Abstracts, Chemical Hazards in Industry, Current Contents, Elsevier Engineering Information database, Pascal Francis, Web of Science, Scopus, Engineering Information Database EnCompass LIT (Elsevier), and INSPEC. This wide coverage facilitates the dissemination of the journal's content to a global audience interested in process safety and environmental engineering.