Graph Neural Network Integrating Self-Supervised Pretraining for Precise and Interpretable Prediction of Micropollutant Treatability by HO•-Based Advanced Oxidation Processes

ACS ES&T Engineering (IF 7.4, Q1, ENGINEERING, ENVIRONMENTAL). Pub Date: 2024-09-18. DOI: 10.1021/acsestengg.4c00389

Jingyi Zhu, Yuanxi Huang, Lingjun Bu, Yangtao Wu, Shiqing Zhou


Citations: 0

Abstract

Machine learning (ML) has become a crucial tool for accelerating research on advanced oxidation processes by predicting reaction parameters that indicate the treatability of micropollutants (MPs). However, insufficient data sets and an incomplete prediction mechanism remain obstacles to the precise prediction of MP treatability by the hydroxyl radical (HO^•), especially when k values approach the diffusion-controlled limit. Herein, we propose a novel graph neural network (GNN) model that integrates self-supervised pretraining on a large unlabeled data set (∼10 million) to predict the k_HO values of MPs. Our model outperforms commonly used and literature-established ML models on both the whole data sets and the diffusion-controlled-limit data sets. We demonstrate that, thanks to the pretraining process, the k-value-related chemical knowledge contained in the pretraining data set is fully exploited and that the learned knowledge can be transferred among data sets. Compared with molecular fingerprints, molecular graphs (MGs) capture more structural information beyond substituents, which facilitates k-value prediction near the diffusion-controlled limit. In particular, we observe that the mechanistic pathways of HO^•-initiated reactions are automatically classified and mapped out in the penultimate layer of our model. This phenomenon shows that the GNN model can be trained to extract mechanistic knowledge by analyzing kinetic parameters. These findings not only explain the robust model performance but also extend the k-value prediction model to mechanistic elucidation, leading to better decision making in water treatment.
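The abstract's point that molecular graphs carry structural information beyond a list of parts can be illustrated with a toy example. The sketch below is not the authors' model and uses no chemistry library. It compares two hand-written heavy-atom graphs for the isomers propan-1-ol and propan-2-ol using (a) a crude element-count descriptor, which is a deliberately simplistic stand-in and much weaker than the molecular fingerprints discussed in the paper, and (b) one round of Weisfeiler-Lehman-style label refinement, the neighbor-aggregation idea that message-passing GNNs build on. All names (`PROPAN_1_OL`, `composition_descriptor`, `message_passing_descriptor`) are hypothetical and chosen for illustration.

```python
from collections import Counter

# Heavy-atom graphs for two C3H8O isomers (hydrogens omitted).
# Each molecule is (atom symbols, bonds as pairs of atom indices).
PROPAN_1_OL = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])  # C-C-C-O
PROPAN_2_OL = (["C", "C", "C", "O"], [(0, 1), (1, 2), (1, 3)])  # O on middle C


def composition_descriptor(mol):
    """Count atoms by element; connectivity is ignored entirely."""
    atoms, _ = mol
    return Counter(atoms)


def message_passing_descriptor(mol, rounds=1):
    """Weisfeiler-Lehman-style refinement over the molecular graph.

    In each round, every atom's label is replaced by a pair of
    (its own label, the sorted labels of its bonded neighbors).
    """
    atoms, bonds = mol
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    labels = list(atoms)
    for _ in range(rounds):
        labels = [
            (labels[i], tuple(sorted(labels[j] for j in neighbors[i])))
            for i in range(len(atoms))
        ]
    # Order-independent readout: the multiset of refined atom labels.
    return Counter(labels)


same_composition = (
    composition_descriptor(PROPAN_1_OL) == composition_descriptor(PROPAN_2_OL)
)
same_graph_descriptor = (
    message_passing_descriptor(PROPAN_1_OL) == message_passing_descriptor(PROPAN_2_OL)
)
print(same_composition, same_graph_descriptor)
```

The two isomers share the same element counts but differ after a single refinement round, because the carbon bonded to oxygen has a different neighborhood in each molecule. A trained GNN replaces the symbolic label update with learned numeric functions, which this sketch does not attempt to reproduce.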




Source journal

ACS ES&T Engineering (ENGINEERING, ENVIRONMENTAL)

CiteScore

8.50

Self-citation rate

0.00%

Articles published

About the journal: ACS ES&T Engineering publishes impactful research and review articles across all realms of environmental technology and engineering, employing a rigorous peer-review process. As a specialized journal, it aims to provide an international platform for research and innovation, inviting contributions on materials technologies, processes, data analytics, and engineering systems that can effectively manage, protect, and remediate air, water, and soil quality, as well as treat wastes and recover resources. The journal encourages research that supports informed decision-making within complex engineered systems and is grounded in mechanistic science and analytics, describing intricate environmental engineering systems. It considers papers presenting novel advancements, spanning from laboratory discovery to field-based application. However, case or demonstration studies lacking significant scientific advancements and technological innovations are not within its scope. Contributions containing experimental and/or theoretical methods, rooted in engineering principles and integrated with knowledge from other disciplines, are welcomed.

Latest articles from this journal

Issue Editorial Masthead
Issue Publication Information
Recognizing Excellence in Environmental Engineering Research: The 2023 ACS ES&T Engineering’s Best Paper Awards
Review of Current and Future Indoor Air Purifying Technologies
The Removal and Recovery of Non-orthophosphate from Wastewater: Current Practices and Future Directions