A Machine Learning Framework for Enhanced Assessment of Sewer System Operation under Data Constraints and Skewed Distributions
Authors: Wan-Xin Yin, Yu-Qi Wang, Jia-Qiang Lv, Jia-Ji Chen, Shuai Liu, Zheng Pang, Ye Yuan, Hong-Xu Bao, Hong-Cheng Wang* and Ai-Jie Wang*
Journal: ACS ES&T Engineering, vol. 5, no. 1, pp. 126–136 (published 2024-09-25)
Impact factor: 7.4; JCR: Q1 (Engineering, Environmental)
DOI: 10.1021/acsestengg.4c00477
URL: https://pubs.acs.org/doi/10.1021/acsestengg.4c00477
Citations: 0
Abstract
In the realm of sewer management, precise machine learning simulations of physicobiochemical processes during sewage transport are essential yet are hindered by skewed distributions and data constraints. To address this issue, the present study introduces an innovative algorithm, the Automatic Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (AutoSMOGN), designed to mitigate the adverse effects of skewed data set distributions. The findings reveal that the integration of the AutoSMOGN algorithm with ML models significantly enhances the precision of gaseous H₂S concentration predictions. Of these approaches, ensemble learning models demonstrated superior accuracy in forecasting gaseous H₂S concentrations within sewer environments, achieving the highest coefficient of determination (R²) of 0.80. Furthermore, the study validates the effectiveness of the AutoSMOGN algorithm in addressing skewed distribution, as evidenced by its acceptable predictive performance on a full-sequence data set (R² of 0.52) and when applied to multiple variables, yielding R² values of 0.88 for total nitrogen and 0.66 for total organic carbon, respectively. These results underscore the potential of the AutoSMOGN algorithm to significantly contribute to the development of new control and optimization strategies, thereby enhancing the maintenance and operational efficacy of sewer systems.
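The abstract does not describe how AutoSMOGN works internally, so the following is only a rough illustration of the general idea its name points to: oversampling rare target values in a regression data set by interpolating between nearby rare samples, and falling back to Gaussian noise when the chosen neighbor is far away. It is a simplified sketch and not the authors' algorithm; the "automatic" parameter selection is not represented at all. Every name here (`oversample_rare_targets`, `rare_quantile`, `noise_scale`, `k`) and the rule "rare means above a target quantile" are assumptions made for illustration. A small `r2_score` helper shows how the coefficient of determination quoted above is computed.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def oversample_rare_targets(X, y, rare_quantile=0.9, n_new=50, k=3,
                            noise_scale=0.05, seed=0):
    """Append n_new synthetic samples drawn from the rare (high-target) region.

    Hypothetical simplification: samples with y at or above the given
    quantile are treated as rare. This is NOT the published AutoSMOGN.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    rare = np.flatnonzero(y >= np.quantile(y, rare_quantile))
    if rare.size < 2:
        raise ValueError("need at least two rare samples to oversample")
    Xr, yr = X[rare], y[rare]

    # Pairwise Euclidean distances among rare samples (self-distance excluded).
    dists = np.linalg.norm(Xr[:, None, :] - Xr[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    k_eff = min(k, rare.size - 1)

    X_new = np.empty((n_new, X.shape[1]))
    y_new = np.empty(n_new)
    for t in range(n_new):
        i = rng.integers(rare.size)
        neighbors = np.argsort(dists[i])[:k_eff]
        j = rng.choice(neighbors)
        # Neighbors closer than half the median neighbor distance are "safe".
        safe = np.median(dists[i, neighbors]) / 2.0
        if dists[i, j] < safe:
            # Safe neighbor: interpolate features and target along the segment.
            u = rng.random()
            X_new[t] = Xr[i] + u * (Xr[j] - Xr[i])
            y_new[t] = yr[i] + u * (yr[j] - yr[i])
        else:
            # Distant neighbor: perturb the seed sample with Gaussian noise.
            X_new[t] = Xr[i] + rng.normal(0.0, noise_scale * Xr.std(axis=0))
            y_new[t] = yr[i] + rng.normal(0.0, noise_scale * yr.std())

    return np.vstack([X, X_new]), np.concatenate([y, y_new])
```

In use, only the training split would be passed through such a function, and a model fitted on the augmented data would then be scored with `r2_score` on untouched held-out data, so that synthetic samples never leak into the evaluation.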
About the journal:
ACS ES&T Engineering publishes impactful research and review articles across all realms of environmental technology and engineering, employing a rigorous peer-review process. As a specialized journal, it aims to provide an international platform for research and innovation, inviting contributions on materials technologies, processes, data analytics, and engineering systems that can effectively manage, protect, and remediate air, water, and soil quality, as well as treat wastes and recover resources.
The journal encourages research that supports informed decision-making within complex engineered systems and is grounded in mechanistic science and analytics, describing intricate environmental engineering systems. It considers papers presenting novel advancements, spanning from laboratory discovery to field-based application. However, case or demonstration studies lacking significant scientific advancements and technological innovations are not within its scope.
Contributions containing experimental and/or theoretical methods, rooted in engineering principles and integrated with knowledge from other disciplines, are welcomed.