Application of bagging and boosting ensemble machine learning techniques for groundwater potential mapping in a drought-prone agriculture region of eastern India

Environmental Sciences Europe (IF 6.0; Zone 3, Environmental Science and Ecology; JCR Q1, Environmental Sciences) Pub Date: 2024-09-02 DOI: 10.1186/s12302-024-00981-y
Krishnagopal Halder, Amit Kumar Srivastava, Anitabha Ghosh, Ranajit Nabik, Subrata Pan, Uday Chatterjee, Dipak Bisai, Subodh Chandra Pal, Wenzhi Zeng, Frank Ewert, Thomas Gaiser, Chaitanya Baliram Pande, Abu Reza Md. Towfiqul Islam, Edris Alam, Md Kamrul Islam

Abstract

Groundwater is a primary source of drinking water for billions of people worldwide. It plays a crucial role in irrigation, domestic, and industrial use and contributes significantly to drought resilience in many regions. However, excessive groundwater discharge has left many areas vulnerable to potable water shortages. Assessing groundwater potential zones (GWPZ) is therefore essential for implementing sustainable management practices that keep groundwater available for present and future generations. This study delineates areas with high groundwater potential in the Bankura district of West Bengal using four machine learning methods: Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), and a Voting Ensemble (VE). The models were trained on 161 data points (70% of the dataset) to identify patterns associated with the presence and absence of groundwater in the region. Among the methods, RF and XGBoost proved to be the most effective in mapping groundwater potential, suggesting their applicability in other regions with similar hydrogeological conditions. RF performed well, with a precision of 0.919, recall of 0.971, F1-score of 0.944, and accuracy of 0.943, indicating a strong capability to predict groundwater zones with few false positives and false negatives. AdaBoost matched RF on all four metrics (precision: 0.919, recall: 0.971, F1-score: 0.944, accuracy: 0.943). XGBoost slightly outperformed both, with higher precision (0.944), F1-score (0.958), and accuracy (0.957) at the same recall (0.971). The VE approach matched XGBoost's metrics (precision: 0.944, recall: 0.971, F1-score: 0.958, accuracy: 0.957). This suggests that combining the strengths of individual models can improve predictions: here the ensemble improved on RF and AdaBoost and equalled XGBoost. Groundwater potential varied significantly across the Bankura district, with areas of very low potential accounting for 41.81% and areas of very high potential for 24.35%. Prediction uncertainty ranged from 0.0 to 0.75 across the study area, reflecting the variability in groundwater availability and the need for targeted management strategies.
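
To make the reported metrics concrete, the following minimal sketch computes precision, recall, F1-score, and accuracy from confusion-matrix counts. The abstract does not report a confusion matrix: the counts used here are hypothetical, chosen only because they reproduce the published figures to three decimals under the assumption of a 70-point test set (the remaining 30% of the data). The function name `classification_metrics` is likewise an illustration, not part of the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": round(precision, 3),
        "recall": round(recall, 3),
        "f1": round(f1, 3),
        "accuracy": round(accuracy, 3),
    }


# Hypothetical counts (NOT from the paper) consistent with the RF/AdaBoost figures.
rf_like = classification_metrics(tp=34, fp=3, fn=1, tn=32)

# Hypothetical counts (NOT from the paper) consistent with the XGBoost/VE figures.
xgb_like = classification_metrics(tp=34, fp=2, fn=1, tn=33)

print(rf_like)
print(xgb_like)
```

Under these assumed counts, the two groups of models differ by a single false positive, which illustrates how small the reported gap is on a test set of this size.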

In summary, this study highlights the critical need for assessing and managing groundwater resources effectively using advanced machine learning techniques. The findings provide a foundation for better groundwater management practices, ensuring sustainable use and conservation in Bankura district and beyond.
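
The Voting Ensemble named in the abstract combines the outputs of the individual models. The abstract does not say whether hard (majority) or soft (probability-averaging) voting was used, so the sketch below shows both in plain Python. The function names, the 0.5 threshold, and all input values are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote per sample.

    predictions: list of per-model label lists, all of equal length.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        labels = [model_preds[i] for model_preds in predictions]
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def soft_vote(probabilities, threshold=0.5):
    """Average each model's positive-class probability, then apply a threshold.

    probabilities: list of per-model probability lists, all of equal length.
    Returns (labels, mean_probabilities).
    """
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    mean_prob = [
        sum(model_probs[i] for model_probs in probabilities) / n_models
        for i in range(n_samples)
    ]
    labels = [1 if p >= threshold else 0 for p in mean_prob]
    return labels, mean_prob


# Made-up outputs of three models on three locations (1 = groundwater present).
hard_labels = hard_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])

# Made-up positive-class probabilities of three models on two locations.
soft_labels, soft_probs = soft_vote([[0.9, 0.2], [0.6, 0.4], [0.3, 0.3]])

print(hard_labels)
print(soft_labels)
```

With an odd number of models and binary labels, hard voting never ties; soft voting additionally yields a continuous score per location, which is the kind of output a potential map is typically classified from.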

Source journal
Environmental Sciences Europe (Environmental Science-Pollution)
CiteScore: 11.20
Self-citation rate: 1.70%
Articles published: 110
Review time: 13 weeks
Journal description: ESEU is an international journal, focusing primarily on Europe, with a broad scope covering all aspects of environmental sciences, including the main topic regulation. ESEU will discuss the entanglement between environmental sciences and regulation because, in recent years, there have been misunderstandings and even disagreement between stakeholders in these two areas. ESEU will help to improve the comprehension of issues between environmental sciences and regulation. ESEU will be an outlet from the German-speaking (DACH) countries to Europe and an inlet from Europe to the DACH countries regarding environmental sciences and regulation. Moreover, ESEU will facilitate the exchange of ideas and interaction between Europe and the DACH countries regarding environmental regulatory issues. Although Europe is at the center of ESEU, the journal will not exclude the rest of the world, because regulatory issues pertaining to environmental sciences can be fully seen only from a global perspective.