Prioritization of monitoring compounds from SNTS identified organic micropollutants in contaminated groundwater using a machine learning optimized ToxPi model
Authors: Okon Dominic Ekpe, Haeran Moon, JongCheol Pyo, Jeong-Eun Oh
Journal: Water Research (Impact Factor 11.4; JCR Q1, Engineering, Environmental)
Publication date: 2024-11-20
DOI: https://doi.org/10.1016/j.watres.2024.122824
Citations: 0
Abstract
Advanced suspect and non-target screening (SNTS) approaches can identify a large number of potentially hazardous micropollutants in groundwater, underscoring the need to pinpoint priority pollutants among the detected chemicals. The present study therefore demonstrates a novel multi-criteria decision making (MCDM) framework that couples machine learning (ML) algorithms with the toxicological prioritization index tool (i.e., ml_ToxPi) to rank 251 chemicals of interest in groundwater for subsequent targeted analysis. The MCDM framework integrated chemical analysis data (i.e., peak area and detection frequency), toxicity profiles (i.e., bioactivity ratio, human exposure metadata, and carcinogenicity metadata), and environmental fate and transport information (i.e., octanol-water partition coefficient (log Kow), water solubility, biodegradation half-life, and soil adsorption coefficient (Koc)) to rank the identified pollutants. A random forest machine learning model was used to systematically determine the weighting factor of each variable according to its variable importance score (R² = 0.808 and 0.778 for the training and testing datasets, respectively, with RMSE = 0.042 in both cases). A total of 47 unique high-priority compounds (i.e., ml_ToxPi score ≥ 0.55) were identified across the investigated sampling regions. They comprised diverse groups of compounds classified according to their chemical uses, such as alkylated polycyclic aromatic hydrocarbons (alkyl-PAHs), organophosphate flame retardants (OPFRs), parent PAHs, personal care products (PCPs), pesticides, pharmaceuticals, phenols, plasticizers, transformation products (TPs), and other industrial-use chemicals. By incorporating relevant variables into the proposed ML-optimized ToxPi MCDM framework, the prioritization approach described here may be adopted in future SNTS assessments of environmental and biological media.
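The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the importance values, the three example variables, and the function names are hypothetical placeholders, and the sketch assumes the importance scores have already been obtained from a fitted random forest. It also assumes every variable is oriented so that a larger value means higher concern, and that each variable is min-max scaled before weighting; the abstract does not state either detail.

```python
from typing import Dict, List


def normalize_weights(importances: Dict[str, float]) -> Dict[str, float]:
    """Turn variable importance scores into weights that sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {name: value / total for name, value in importances.items()}


def min_max_scale(values: List[float]) -> List[float]:
    """Scale one variable to the range [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def toxpi_scores(data: Dict[str, List[float]],
                 weights: Dict[str, float]) -> List[float]:
    """Weighted sum of scaled variables, one score per compound."""
    n_compounds = len(next(iter(data.values())))
    scaled = {name: min_max_scale(column) for name, column in data.items()}
    return [
        sum(weights[name] * scaled[name][i] for name in data)
        for i in range(n_compounds)
    ]


# Hypothetical importance scores (placeholders, not values from the study).
importances = {
    "peak_area": 0.30,
    "detection_frequency": 0.20,
    "bioactivity_ratio": 0.50,
}

# Hypothetical measurements for three compounds (placeholders).
data = {
    "peak_area": [1.0e4, 5.0e5, 2.0e6],
    "detection_frequency": [0.1, 0.6, 0.9],
    "bioactivity_ratio": [0.01, 0.2, 1.5],
}

weights = normalize_weights(importances)
scores = toxpi_scores(data, weights)

# The abstract uses a score of 0.55 or higher as the high-priority cutoff.
high_priority = [i for i, score in enumerate(scores) if score >= 0.55]
print(scores, high_priority)
```

Because the weights sum to 1 and each scaled variable lies in [0, 1], every score also lies in [0, 1], which is what makes a fixed cutoff such as 0.55 meaningful across compounds.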
About the journal:
Water Research, along with its open access companion journal Water Research X, serves as a platform for publishing original research papers covering various aspects of the science and technology related to the anthropogenic water cycle, water quality, and its management worldwide. The audience targeted by the journal comprises biologists, chemical engineers, chemists, civil engineers, environmental engineers, limnologists, and microbiologists. The scope of the journal includes:
• Treatment processes for water and wastewaters (municipal, agricultural, industrial, and on-site treatment), including resource recovery and residuals management;
• Urban hydrology including sewer systems, stormwater management, and green infrastructure;
• Drinking water treatment and distribution;
• Potable and non-potable water reuse;
• Sanitation, public health, and risk assessment;
• Anaerobic digestion, solid and hazardous waste management, including source characterization and the effects and control of leachates and gaseous emissions;
• Contaminants (chemical, microbial, anthropogenic particles such as nanoparticles or microplastics) and related water quality sensing, monitoring, fate, and assessment;
• Anthropogenic impacts on inland, tidal, coastal and urban waters, focusing on surface and ground waters, and point and non-point sources of pollution;
• Environmental restoration, linked to surface water, groundwater and groundwater remediation;
• Analysis of the interfaces between sediments and water, and between water and atmosphere, focusing specifically on anthropogenic impacts;
• Mathematical modelling, systems analysis, machine learning, and beneficial use of big data related to the anthropogenic water cycle;
• Socio-economic, policy, and regulations studies.