Fazila Younas, Muhammad Fahad Sardar, Zahid Ullah, Jawad Ali, Xiaona Yu, Pengcheng Zhu, Weihua Guo, Khalid Mashay Al-Anazi, Mohammad Abul Farah, Zhaojie Cui
Title: Assessment of groundwater chemistry to predict arsenic contamination from a canal commanded area: applications of different machine learning models.
Journal: Environmental Geochemistry and Health, volume 47, issue 2, article 46 (impact factor 3.2; JCR Q3, Engineering, Environmental)
Publication date: 2025-01-09
DOI: https://doi.org/10.1007/s10653-024-02334-3
Abstract
Groundwater arsenic (As) contamination is a significant issue worldwide, including in China and Pakistan, and particularly in canal command areas. In this study, 131 groundwater samples were collected, and three machine learning models [Random Forest (RF), Logistic Regression (LR), and Artificial Neural Network (ANN)] were employed to predict As concentration. Descriptive statistics showed that all samples were within the WHO permissible limits for pH, Ca, Mg, turbidity, Cl, K, Na, SO4, NO3, and F, but exceeded the WHO limits for EC, HCO3, TDS, and As. RF reported the median decrease in Gini node impurity across all tree splits. TDS, EC, HCO3- and turbidity ranked at the upper end of the resulting variable-importance graph, indicating the significance of these factors in predicting As contamination of the water. Moreover, these factors were positively correlated with As contamination. The LR model indicated a good model fit. The ANN classified the data set into two classes: (1) inside the WHO limit and (2) outside the WHO limit. Total dissolved solids (TDS), turbidity, sodium (Na) and electrical conductivity (EC) were positively correlated with As concentration in the collected samples, whereas pH and K were negatively associated with it. Confusion matrices and ROC-AUC scores showed that the RF model outperformed LR and ANN in accuracy and sensitivity. The key variables influencing As concentration in the groundwater resources of the study area were identified as TDS, chloride (Cl), bicarbonate (HCO3-) and turbidity. The study provides a complete profile of the 131 water samples, which can be used to develop strategies for minimizing groundwater contamination in the Rohri canal command area. Moreover, steps can be taken to bring the discussed parameters within the WHO limits.
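The evaluation quantities named in the abstract (Gini impurity decrease, confusion matrix, sensitivity, ROC-AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the sample values, the TDS split threshold, the prediction scores and all function names are hypothetical, chosen only to show how each quantity is computed. The only external figure used is the WHO guideline value for arsenic in drinking water, 10 µg/L, which defines the two classes.

```python
from collections import Counter

WHO_AS_LIMIT_UG_L = 10.0  # WHO guideline value for arsenic in drinking water


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting samples on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy samples, not data from the study.
arsenic_ug_l = [2.0, 5.0, 8.0, 12.0, 25.0, 40.0]
tds_mg_l = [300, 450, 900, 1100, 1500, 2000]

# Class 1 = outside the WHO limit, class 0 = inside the WHO limit.
labels = [1 if a > WHO_AS_LIMIT_UG_L else 0 for a in arsenic_ug_l]

# Impurity decrease for one hypothetical split on TDS.
decrease = gini_decrease(tds_mg_l, labels, threshold=1000)

# Hypothetical classifier scores, thresholded at 0.5 to get hard predictions.
scores = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]
predictions = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion(labels, predictions)
accuracy = (tp + tn) / len(labels)
sensitivity = tp / (tp + fn)
auc = roc_auc(labels, scores)
```

In a random forest, a variable's importance is usually summarized by aggregating such impurity decreases over every split that uses the variable in every tree. Variables with larger aggregate decreases sit at the upper end of the importance graph, as the abstract reports for TDS, EC, HCO3- and turbidity.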
About the journal:
Environmental Geochemistry and Health publishes original research papers and review papers across the broad field of environmental geochemistry. Environmental geochemistry and health establishes and explains links between the natural or disturbed chemical composition of the earth’s surface and the health of plants, animals and people.
Beneficial elements regulate or promote enzymatic and hormonal activity whereas other elements may be toxic. Bedrock geochemistry controls the composition of soil and hence that of water and vegetation. Environmental issues, such as pollution, arising from the extraction and use of mineral resources, are discussed. The effects of contaminants introduced into the earth’s geochemical systems are examined. Geochemical surveys of soil, water and plants show how major and trace elements are distributed geographically. Associated epidemiological studies reveal the possibility of causal links between the natural or disturbed geochemical environment and disease. Experimental research illuminates the nature or consequences of natural or disturbed geochemical processes.
The journal particularly welcomes novel research linking environmental geochemistry and health issues on such topics as: heavy metals (including mercury), persistent organic pollutants (POPs), and mixed chemicals emitted through human activities, such as uncontrolled recycling of electronic-waste; waste recycling; surface-atmospheric interaction processes (natural and anthropogenic emissions, vertical transport, deposition, and physical-chemical interaction) of gases and aerosols; phytoremediation/restoration of contaminated sites; food contamination and safety; environmental effects of medicines; effects and toxicity of mixed pollutants; speciation of heavy metals/metalloids; effects of mining; disturbed geochemistry from human behavior, natural or man-made hazards; particle and nanoparticle toxicology; risk and the vulnerability of populations, etc.