Regularized Bayesian Fusion for Multimodal Data Integration in Industrial Processes
Eugeniu Strelet, Zhenyu Wang, You Peng, Ivan Castillo, Ricardo Rendall, Marco S. Reis
Industrial & Engineering Chemistry Research (journal article), published 2024-11-19
DOI: 10.1021/acs.iecr.4c02956 (https://doi.org/10.1021/acs.iecr.4c02956)
Impact factor: 3.9; JCR: Q2 (Engineering, Chemical)
Citations: 0
Abstract
The collection of data from multiple sources with distinct modalities and varying levels of quality is pervasive in modern industry. Furthermore, each source often has a different sampling rate, and some sources may not even have a regular acquisition pattern. These aspects pose significant challenges when developing machine learning (ML) models for predicting target variables, such as product properties or process key performance indicators (KPIs). Data imputation schemes are a common solution but often require case-by-case analysis to mitigate the risk of introducing unrealistic artifacts, which complicates the analysis pipeline and makes it less scalable. This work introduces a flexible solution for combining redundant sources of information with respect to a target response, taking their associated quality into account while accommodating different sampling rates. The proposed Regularized Bayesian Fusion (RegBF) approach aims to produce estimates of the target variable that have an expected level of smoothness and are compatible with the dominant dynamic mode of the industrial process. The methodology is scalable and flexible, as it can incorporate new data sources at any time, in the form of dynamic first-principles models, data-driven ML models, or instrumental information sources (e.g., online or laboratory analytical instruments). The proposed approach is tested in two case studies: one from a Kamyr digester process and the other from a wastewater treatment plant operation.
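The abstract gives no equations, so the following is only a rough illustration of the general idea it describes, not the authors' RegBF algorithm. It is a minimal sketch that assumes Gaussian noise with a known variance per source, marks missing samples with NaN so that sources with different sampling rates need no imputation, and uses a second-difference penalty as a stand-in for the smoothness regularization. The function name `fuse_and_smooth` and the parameter `lam` are hypothetical.

```python
import numpy as np

def fuse_and_smooth(obs, variances, lam=10.0):
    """Precision-weighted fusion of redundant sources with a smoothness penalty.

    obs:       (n_sources, T) array, NaN where a source has no sample
    variances: (n_sources,) assumed noise variance of each source
    lam:       weight of the second-difference penalty (hypothetical knob)

    Minimizes  sum_{i,t} (obs[i,t] - x[t])^2 / variances[i] + lam * ||D x||^2
    over x, where D is the second-difference operator. Requires T >= 3 and
    observations at two or more distinct time steps.
    """
    obs = np.asarray(obs, dtype=float)
    variances = np.asarray(variances, dtype=float)
    T = obs.shape[1]

    mask = ~np.isnan(obs)                      # True where a sample exists
    prec = mask / variances[:, None]           # precision; 0 for missing samples
    w = prec.sum(axis=0)                       # total precision per time step
    b = (np.where(mask, obs, 0.0) * prec).sum(axis=0)  # precision-weighted sum

    D = np.diff(np.eye(T), n=2, axis=0)        # (T-2, T) second differences
    A = np.diag(w) + lam * D.T @ D             # normal equations: A x = b
    return np.linalg.solve(A, b)

# Usage: a noisy fast sensor plus an accurate lab value every 5th step.
rng = np.random.default_rng(0)
t = np.arange(50)
truth = np.sin(t / 8.0)
fast = truth + rng.normal(0.0, 0.3, size=t.size)
lab = np.full(t.size, np.nan)
lab[::5] = truth[::5] + rng.normal(0.0, 0.05, size=lab[::5].size)
estimate = fuse_and_smooth(np.vstack([fast, lab]), [0.3**2, 0.05**2], lam=5.0)
```

With `lam=0` and every source observed at every step, the result reduces to the usual inverse-variance weighted mean. Larger `lam` values trade fidelity to individual samples for a smoother trajectory.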
About the journal:
Industrial & Engineering Chemistry, with variations in title and format, has been published since 1909 by the American Chemical Society. Industrial & Engineering Chemistry Research is a weekly publication that reports industrial and academic research in the broad fields of applied chemistry and chemical engineering, with special focus on fundamentals, processes, and products.