Harnessing machine learning tools for water quality assessment in the Kebili shallow aquifers, Southwestern Tunisia

IF 1.4 | Zone 4, Earth Sciences | Q3 GEOCHEMISTRY & GEOPHYSICS | Acta Geochimica | Pub Date: 2024-04-30 | DOI: 10.1007/s11631-024-00689-z
Zohra Kraiem, Kamel Zouari, Rim Trabelsi
Acta Geochimica 43 (6), 1065–1086. Journal Article. https://link.springer.com/article/10.1007/s11631-024-00689-z

Abstract

This study adopted an integrated approach that combines multivariate statistical analysis with machine learning (ML) methods to evaluate groundwater quality in the shallow aquifers of the Djerid and Kebili district, Southern Tunisia, and to assess their suitability for irrigation and/or drinking. A comprehensive hydrochemical assessment of 52 samples using the entropy-weighted water quality index (EWQI) is also presented. Eleven water parameters were calculated to ascertain the potential use of these resources for irrigation and drinking. Multivariate analysis showed two main components, Dim.1 (variance = 62.3%) and Dim.2 (variance = 22%), attributed to bicarbonate, dissolution, evaporation, and the intrusion of drainage water. The calculated EWQI for the Djerid and Kebili waters (52 samples) varied between 7.5 and 152.62, a range of 145.12; the mean (79.12) was lower than the median (88.47). According to the EWQI, only 14 samples (26.92%) are unsuitable for irrigation because of their poor to extremely poor quality. Bivariate plots showed high correlations for EWQI ~ TH (r = 0.93) and EWQI ~ SAR (r = 0.87), indicating that water quality depended on those parameters. Different ML algorithms were successfully applied to water quality classification. The results indicated high prediction accuracy (SVM > LDA > ANN > kNN) and perfect classification for kNN, LDA and Naive Bayes. To develop the prediction models, the dataset was divided into training (80%) and testing (20%) sets, and model performance was evaluated with the RMSE, MSE, MAE and R² metrics. kNN (R² = 0.9359, MAE = 6.49, MSE = 79.00) and LDA (accuracy = 97.56%; kappa = 96.21%) achieved high accuracy. Moreover, linear regression indicated high correlation for both the training (R² = 0.9727) and testing data (R² = 0.9890). This confirmed the validity of the LDA algorithm in predicting water quality. Cross-validation showed high accuracy (92.31%), sensitivity (89.47%) and specificity (95%). These findings are fundamentally important for integrated water resource management in the larger context of sustainable development of the Kebili district.
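
The evaluation metrics named in the abstract (MSE, RMSE, MAE, R², accuracy, sensitivity, specificity, kappa) follow standard definitions, which the minimal Python sketch below illustrates. It is not the authors' code. The function names are hypothetical, and the confusion-matrix counts in the demo are illustrative values chosen only because they reproduce the reported cross-validation percentages; the abstract does not report the actual counts.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def binary_classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and Cohen's kappa
    from the four cells of a 2x2 confusion matrix."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "kappa": (accuracy - p_expected) / (1.0 - p_expected),
    }


if __name__ == "__main__":
    # ASSUMPTION: illustrative counts only, not taken from the paper.
    demo = binary_classification_metrics(tp=17, fn=2, tn=19, fp=1)
    for name in ("accuracy", "sensitivity", "specificity"):
        print(f"{name}: {100 * demo[name]:.2f}%")
```

With these illustrative counts the sketch gives 92.31% accuracy, 89.47% sensitivity and 95.00% specificity, matching the cross-validation figures quoted above. By the same definitions, the reported kNN MSE of 79.00 corresponds to an RMSE of about 8.89.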
