Improving groundwater quality predictions in semi-arid regions using ensemble learning models
Maedeh Mahmoudi, Amin Mahdavi-Meymand, Ammar AlDallal, Mohammad Zounemat-Kermani
Environmental Science and Pollution Research (impact factor 5.8), published 2025-01-04
DOI: https://doi.org/10.1007/s11356-024-35874-3
Citations: 0
Abstract
Groundwater is one of the primary sources of freshwater in arid and semi-arid climates, and monitoring its quality is an essential component of environmental management. This study compares the performance of nine regular and ensemble machine learning (ML) methods in predicting two water quality parameters, total dissolved solids (TDS) and pH, in an area with a semi-arid climate. The study area is an aquifer in the Sirjan plain, Kerman, Iran. The developed models include the standard multilayer perceptron neural network (MLPNN), classification and regression trees (CART), and Chi-square automatic interaction detection (CHAID), together with their bagging (BG) and boosting (BT) ensemble versions. The analysis revealed that the standard ML models yield comparable results in predicting TDS. Among them, the MLPNN, with a standard root mean square error (SRMSE) of 0.085, predicted TDS more accurately than the CART and CHAID models. Predicting pH poses a greater challenge for the models. Ensemble techniques significantly enhanced the accuracy of the regular models: on average, bagging and boosting improved accuracy by 22.68%, a statistically significant enhancement. Boosting, with an average SRMSE of 0.0602, is more accurate than bagging. Based on the results, CHAID-BT (SRMSE of 0.0790) and CHAID-BG (SRMSE of 0.0330) rank as the most accurate models for predicting TDS and pH, respectively. The benefit of ensemble techniques is more pronounced for TDS prediction. In practice, ensemble techniques can be considered a high-accuracy alternative for sustainable water resources management in semi-arid regions, helping to address challenges such as water shortages, climate change, and water pollution.
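To make the bagging and boosting structures mentioned in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Several pieces are assumptions made for illustration only: the abstract does not define SRMSE, so `srmse` here divides the RMSE by the mean of the observations; a one-split regression stump stands in for CART; boosting is shown in its least-squares (residual-fitting) form; and the data are synthetic, not the Sirjan plain measurements.

```python
import math
import random
from statistics import mean


def srmse(observed, predicted):
    """RMSE divided by the mean observation (an ASSUMED normalization)."""
    rmse = math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))
    return rmse / mean(observed)


def fit_stump(xs, ys):
    """Fit a one-split regression tree (a depth-1 stand-in for CART)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x is identical: fall back to a constant
        constant = mean(ys)
        return lambda x: constant
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Bagging: train each stump on a bootstrap resample, then average."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def fit_boosting(xs, ys, n_rounds=25, learning_rate=0.5):
    """Least-squares boosting: each stump fits the current residuals."""
    base = mean(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        residuals = [r - learning_rate * stump(x)
                     for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Synthetic stand-in data (hypothetical; NOT the study's measurements).
rng = random.Random(42)
xs = [i / 10 for i in range(1, 41)]
ys = [500 + 40 * x * x + rng.gauss(0, 20) for x in xs]

for name, model in [("single stump", fit_stump(xs, ys)),
                    ("bagging", fit_bagging(xs, ys)),
                    ("boosting", fit_boosting(xs, ys))]:
    print(name, round(srmse(ys, [model(x) for x in xs]), 4))
```

The printed values are training errors on toy data, so they only illustrate the mechanics; they say nothing about the accuracies reported in the study, which would require held-out evaluation on the actual groundwater records.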
About the journal:
Environmental Science and Pollution Research (ESPR) serves the international community in all areas of Environmental Science and related subjects with emphasis on chemical compounds. This includes:
- Terrestrial Biology and Ecology
- Aquatic Biology and Ecology
- Atmospheric Chemistry
- Environmental Microbiology/Biobased Energy Sources
- Phytoremediation and Ecosystem Restoration
- Environmental Analyses and Monitoring
- Assessment of Risks and Interactions of Pollutants in the Environment
- Conservation Biology and Sustainable Agriculture
- Impact of Chemicals/Pollutants on Human and Animal Health
It reports from a broad interdisciplinary outlook.