Optimizing coastal groundwater quality predictions: A novel data mining framework with cross-validation, bootstrapping, and entropy analysis.

IF 3.5 | CAS Zone 3 (Environmental Science and Ecology) | JCR Q2 (Environmental Sciences) | Journal of Contaminant Hydrology, vol. 269, 104480 | Pub Date: 2024-12-10 | DOI: 10.1016/j.jconhyd.2024.104480
Abu Reza Md Towfiqul Islam, Md Abdullah-Al Mamun, Mehedi Hasan, Mst Nazneen Aktar, Md Nashir Uddin, Md Abu Bakar Siddique, Mohaiminul Haider Chowdhury, Md Saiful Islam, A B M Mainul Bari, Abubakr M Idris, Venkatramanan Senapathi
Citations: 0

Abstract

Investigating the potential of novel data mining algorithms (DMAs) for modeling groundwater quality in coastal areas is important for groundwater resource management, especially in the coastal region of Bangladesh, where groundwater is highly contaminated. This work investigated the applicability of three DMAs, Gaussian Process Regression (GPR), Bayesian Ridge Regression (BRR) and Artificial Neural Network (ANN), for predicting groundwater quality in coastal areas. Optuna-based hyperparameter optimization is proposed to improve model accuracy, with Optuna-GPR and Optuna-BRR serving as benchmark models. Cross-validation (CV) and bootstrapping (B) were combined with these algorithms to build six predictive models. The entropy-based coastal groundwater quality index (ECWQI) was converted into a normalized index (ECWQIn), which was divided into five classes from very poor to excellent. The self-organizing map (SOM), spatial autocorrelation and a fuzzy logic model were used to identify spatial groundwater quality patterns based on 12 physicochemical variables collected from 67 groundwater wells. The SOM analysis identified four distinct spatial patterns: EC-TDS-Cl⁻, Mg-pH, Ca²⁺-K⁺-NO₃⁻, and HCO₃⁻-SO₄²⁻-Na⁺-F⁻. The results showed that the ANN (CV) and ANN (B) models both performed better than the other Optuna-based models during the test phase, with (RMSE = 0.041, MAE = 0.026, R² = 0.971, RAE = 0.15 = 21 and CC = 0.986) and (RMSE = 0.041, MAE = 0.025, R² = 0.969, RAE = 0.119 and CC = 0.975), respectively. SO₄²⁻, Cl⁻ and F⁻ played an important role in the prediction accuracy. F⁻ and SO₄²⁻ showed higher spatial autocorrelation, which affected groundwater quality degradation. In addition, the ANN (CV) and ANN (B) models showed a Gaussian distribution of model errors (small standard error, <1%), indicating the stability of the models. These results indicate the efficiency of the ANN model in predicting groundwater quality in coastal areas, which would help regional water managers with real-time monitoring and sustainable management of groundwater resources.
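
The abstract reports five test-phase metrics (RMSE, MAE, R², RAE, CC) for models built with cross-validation and bootstrapping, but it does not give their formulas or the resampling settings. The following is a minimal Python sketch under stated assumptions: the metric definitions are the common textbook ones (RAE as total absolute error relative to a mean predictor, CC as the Pearson correlation), the bootstrap test set is taken as the out-of-bag samples, and the function names, fold count and round count are illustrative rather than taken from the paper.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, R2, RAE and Pearson CC (textbook definitions, assumed)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)    # spread of observations
    abs_tot = sum(abs(t - mean_t) for t in y_true)     # error of a mean predictor
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "RMSE": math.sqrt(sq_err / n),
        "MAE": abs_err / n,
        "R2": 1.0 - sq_err / ss_tot,
        "RAE": abs_err / abs_tot,
        "CC": cov / math.sqrt(ss_tot * ss_pred),
    }


def kfold_splits(n, k, rng):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def bootstrap_splits(n, rounds, rng):
    """Yield (train, test); train is drawn with replacement, test is out-of-bag."""
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        chosen = set(train)
        test = [j for j in range(n) if j not in chosen]
        yield train, test


if __name__ == "__main__":
    rng = random.Random(0)
    n_wells = 67  # number of groundwater wells reported in the abstract
    for train, test in kfold_splits(n_wells, k=5, rng=rng):
        print("CV fold:", len(train), "train /", len(test), "test")
    for train, test in bootstrap_splits(n_wells, rounds=3, rng=rng):
        print("Bootstrap round:", len(train), "train /", len(test), "out-of-bag")
    print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 5]))
```

In k-fold cross-validation each well is held out exactly once, whereas a bootstrap round trains on a resample of the same size and tests on whichever wells were not drawn, so the two schemes give differently sized and differently composed test sets for the same model.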

Source journal
Journal of Contaminant Hydrology (Environmental Sciences; Geosciences, Multidisciplinary)
CiteScore: 6.80
Self-citation rate: 2.80%
Articles published: 129
Review time: 68 days
Journal description: The Journal of Contaminant Hydrology is an international journal publishing scientific articles pertaining to the contamination of subsurface water resources. Emphasis is placed on investigations of the physical, chemical, and biological processes influencing the behavior and fate of organic and inorganic contaminants in the unsaturated (vadose) and saturated (groundwater) zones, as well as at groundwater-surface water interfaces. The ecological impacts of contaminants transported both from and to aquifers are of interest. Articles on contamination of surface water only, without a link to groundwater, are out of scope. Broad latitude is allowed in identifying contaminants of interest, which include legacy and emerging pollutants, nutrients, nanoparticles, pathogenic microorganisms (e.g., bacteria, viruses, protozoa), microplastics, and various constituents associated with energy production (e.g., methane, carbon dioxide, hydrogen sulfide). The journal's scope embraces a wide range of topics including: experimental investigations of contaminant sorption, diffusion, transformation, volatilization and transport in the surface and subsurface; characterization of soil and aquifer properties only as they influence contaminant behavior; development and testing of mathematical models of contaminant behavior; innovative techniques for restoration of contaminated sites; development of new tools or techniques for monitoring the extent of soil and groundwater contamination; transformation of contaminants in the hyporheic zone; effects of contaminants traversing the hyporheic zone on surface water and groundwater ecosystems; subsurface carbon sequestration and/or turnover; and migration of fluids associated with energy production into groundwater.