A novel hybrid variable cross layer-based machine learning model improves the accuracy and interpretation of energy intensity prediction of wastewater treatment plant.

IF 8 | Zone 2, Environmental Science and Ecology | Q1 Environmental Sciences | Journal of Environmental Management | Pub Date: 2024-11-13 | DOI: 10.1016/j.jenvman.2024.123209
Yucheng Li, Chen Cai, Erwu Liu, Xiaofeng Lin, Ying Zhang, Hongjing Chen, Zhongqing Wei, Xiangfeng Huang, Ru Guo, Kaiming Peng, Jia Liu
Journal of Environmental Management, volume 371, article 123209. Publication type: Journal Article.
Citations: 0

Abstract

Energy intensity (EI) prediction in wastewater treatment plants (WWTPs) suffers from inaccuracy and poor interpretability due to poor data quality, complex mechanisms, and various confounding variables. In this study, a novel hybrid variable cross layer-based machine learning (VCL-ML) model was devised, which generates new knowledge from monitoring indicators (e.g., COD) and then embeds both domain knowledge and monitoring indicators into the ML model. The hybrid VCL-ML model achieves a root-mean-square error (RMSE) of 0.021 kWh/m³, an 8.7% improvement over the conventional ML (Con-ML) model. Shapley additive explanations (SHAP) showed that domain knowledge features rank highly and carry interpretable meaning for the model, such as capacity utilization (CU), which measures the efficiency of resource use, and total nitrogen remaining rate (TN_rr), which indicates nitrogen retention in the system. Partial dependence interactions between domain knowledge features (e.g., sludge yield) and monitoring indicators (e.g., influent pH) could help interpret real plant behavior. A comparison of feature categories between the VCL-ML and Con-ML models showed that temporal information (e.g., month) and removal information (e.g., TN_rr) played an important role in the performance improvement. This result highlights the strong correlation of WWTP energy intensity with pollutant removal and temporal information, while the contribution of other, redundant features is weakened. The VCL-ML model improves the accuracy and interpretability of EI prediction for WWTPs and can be used in their optimal operation and sustainable management.
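
The abstract does not give formulas for the domain knowledge features or for the reported improvement, so the following is only a minimal sketch under stated assumptions. It assumes CU is treated flow divided by design capacity, TN_rr is effluent TN divided by influent TN, and the improvement is the relative reduction in RMSE against a baseline. All function names, dictionary keys, and sample values are hypothetical and are not taken from the paper.

```python
import math


def capacity_utilization(treated_flow_m3_d, design_capacity_m3_d):
    """ASSUMED definition: share of design capacity actually used."""
    return treated_flow_m3_d / design_capacity_m3_d


def tn_remaining_rate(effluent_tn_mg_l, influent_tn_mg_l):
    """ASSUMED definition: fraction of influent total nitrogen left in the effluent."""
    return effluent_tn_mg_l / influent_tn_mg_l


def add_domain_features(record):
    """Return a copy of a monitoring record with the derived features appended.

    The input keys are hypothetical names for monitoring indicators.
    """
    enriched = dict(record)
    enriched["CU"] = capacity_utilization(
        record["flow_m3_d"], record["design_capacity_m3_d"]
    )
    enriched["TN_rr"] = tn_remaining_rate(
        record["effluent_tn_mg_l"], record["influent_tn_mg_l"]
    )
    return enriched


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(rmse_baseline, rmse_new):
    """ASSUMED metric: relative RMSE reduction of the new model vs. the baseline."""
    return (rmse_baseline - rmse_new) / rmse_baseline


if __name__ == "__main__":
    # Toy monitoring record; the numbers are illustrative only.
    sample = {
        "flow_m3_d": 80000.0,
        "design_capacity_m3_d": 100000.0,
        "influent_tn_mg_l": 40.0,
        "effluent_tn_mg_l": 10.0,
    }
    print(add_domain_features(sample))

    # Toy EI observations and predictions in kWh/m³, also illustrative.
    observed = [0.30, 0.35, 0.40]
    predicted = [0.31, 0.33, 0.42]
    print(rmse(observed, predicted))
```

Under these assumptions, the derived columns would sit next to the raw monitoring indicators in the feature matrix, and `relative_improvement` would be applied to the RMSE values of the two models being compared. How the paper's variable cross layer actually combines the features is not described in the abstract and is not modeled here.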

Source journal
Journal of Environmental Management (category: Environmental Sciences)
CiteScore: 13.70
Self-citation rate: 5.70%
Articles published: 2477
Review time: 84 days
About the journal: The Journal of Environmental Management publishes peer-reviewed, original research on all aspects of management and the managed use of the environment, both natural and man-made. Critical review articles are also welcome; submission of these is strongly encouraged.