Gergely Horváth, Vilaboy José Trujillo, József Réti, Zoltán Kozár, Tamás Varga, Alex Kummer
Soft-sensor development for product quality estimation with time delay and feature selection in industrial MDI production
Chemical Engineering Journal Advances, volume 22, Article 100751. Published 2025-04-11. DOI: 10.1016/j.ceja.2025.100751
https://www.sciencedirect.com/science/article/pii/S2666821125000481
Abstract
Methylenediphenyl diisocyanate (MDI) is the aromatic isocyanate produced in the largest quantities worldwide and is the raw material for numerous polyurethane products. The MDI reaction system is intricate: it involves multiple reactions, side reactions, and by-products that vary in quantity and quality, which makes analytical identification and monitoring difficult. As a result, the scientific literature currently offers no kinetic model precise enough to describe the synthesis of MDI or to predict its coloration. Our aim is therefore to develop soft sensors that use real industrial data to estimate the coloration of MDI mixtures in an explainable and interpretable manner.
In this study, we applied five feature selection approaches to derive an optimal feature set: MRMR, F-test, RReliefF, correlation-based methods, and the combination of their results. Correlation-based techniques were also used to determine the optimal time delay for each operational parameter; incorporating these delays significantly influenced model accuracy. We evaluated five machine learning models, with Bayesian hyperparameter optimization where applicable: Linear Regression, Regression Tree, Neural Network, Support Vector Machine, and Gaussian Process Regression. Of these, the Gaussian Process Regression models performed best. To interpret the results of the Gaussian Process Regression model, Partial Dependence Plots were generated and evaluated against industrial experience and knowledge. Finally, a sensitivity analysis was conducted to evaluate the robustness of the optimal solution and to assess how strongly the objective function responds to variations in each operational parameter.
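
The abstract does not spell out how the correlation-based time delays were computed. As an illustration only, the following minimal sketch assumes one common approach: for each operational parameter, shift the signal by every candidate lag and keep the lag with the largest absolute Pearson correlation against the quality measurement. The function names (`pearson`, `best_delay`), the synthetic signals, and the `max_lag` search range are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def best_delay(process_var, quality, max_lag):
    """Return (lag, corr) maximising |corr(process_var[t - lag], quality[t])|."""
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        # Pair the quality value at time t with the process value at time t - lag.
        shifted = process_var[: len(process_var) - lag]
        target = quality[lag:]
        corr = pearson(shifted, target)
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


# Synthetic data (assumption): the quality signal follows the process
# variable TRUE_LAG samples later, with a little measurement noise added.
random.seed(0)
TRUE_LAG = 7
N_SAMPLES = 500
u = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
y = [0.0] * TRUE_LAG + [2.0 * u[t - TRUE_LAG] for t in range(TRUE_LAG, N_SAMPLES)]
y = [value + random.gauss(0.0, 0.1) for value in y]

lag, corr = best_delay(u, y, max_lag=30)
print(f"estimated delay: {lag} samples, correlation: {corr:.3f}")
```

In a soft-sensor pipeline of the kind described, each operational parameter would be shifted by its own estimated delay before feature selection and model training, so that every model input is aligned with the quality value it actually influences. With real plant data the correlation peak is usually much flatter than in this synthetic case, so the selected lag is worth checking against known residence times.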