Toward the development of an ML-driven decision support system for wastewater treatment: A bacterial inactivation prediction approach in solar photochemical processes.
Pavel Pascacio, David J Vicente, Ilaria Berruti, Samira Nahim Granados, Isabel Oller, M Inmaculada Polo-López, Fernando Salazar
Journal of Environmental Management, volume 373, article 123537. Published 2024-12-23. DOI: 10.1016/j.jenvman.2024.123537 (https://doi.org/10.1016/j.jenvman.2024.123537)
Abstract
Designing efficient bacterial inactivation treatments for wastewater is challenging because of the many process parameters involved and the complex composition of wastewater. Although solar photochemical processes (PCPs) provide energy-saving benefits, a balance must be maintained between bacterial inactivation efficiency and experimental costs. Predictive decision tools for bacterial inactivation under various conditions would help optimize the resources spent on PCP design. This study evaluated four machine learning (ML) algorithms, namely Artificial Neural Network (ANN), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost), for predicting the inactivation behavior of Escherichia coli, Enterococcus spp., and Salmonella spp. Several oxidant types, bacterial concentrations, and aqueous matrices were evaluated in two scenarios simulating real-world conditions. The decision tree-based models (RF and XGBoost) outperformed SVM and ANN in accuracy. In Scenario I (prediction of intermediate experimental values over time), the XGBoost model was the most effective, achieving a Root Mean Square Error (RMSE) of 0.81, 0.76, and 0.55 and an R² of 0.84, 0.79, and 0.87 for the three bacteria, respectively. In Scenario II (prediction of full experimental values over time), the RF model performed best for Escherichia coli and Salmonella spp., with an RMSE of 0.88 for both and an R² of 0.80 and 0.71, respectively. The XGBoost model was only moderately effective for Enterococcus spp., with an RMSE of 1.31 and an R² of 0.50. Overall, the decision tree-based models showed potential for predicting inactivation across a wide range of PCP parameters without requiring additional trials.
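The two metrics quoted in the abstract, RMSE and R², can be illustrated with a minimal Python sketch. It is not the authors' code. The observed and predicted values are hypothetical placeholders, assumed here to be bacterial concentrations in log10 units (the abstract does not state the units), and the function names `rmse` and `r_squared` are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1 - ss_res / ss_tot


# Hypothetical inactivation curve (assumed log10 concentration over time);
# these numbers are made up for illustration and are not from the study.
observed = [6.0, 5.0, 4.0, 2.0, 0.0]
predicted = [6.0, 5.5, 3.5, 2.0, 0.5]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R^2  = {r_squared(observed, predicted):.2f}")
```

A lower RMSE means predictions lie closer to the measurements in the units of the target, while R² expresses the share of the observed variance that the model explains. This is why the abstract reports both for each bacterium.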
About the journal:
The Journal of Environmental Management publishes peer-reviewed, original research on all aspects of management and the managed use of the environment, both natural and man-made. Critical review articles are also welcome; submission of these is strongly encouraged.