Application of bagging and boosting ensemble machine learning techniques for groundwater potential mapping in a drought-prone agriculture region of eastern India
Krishnagopal Halder, Amit Kumar Srivastava, Anitabha Ghosh, Ranajit Nabik, Subrata Pan, Uday Chatterjee, Dipak Bisai, Subodh Chandra Pal, Wenzhi Zeng, Frank Ewert, Thomas Gaiser, Chaitanya Baliram Pande, Abu Reza Md. Towfiqul Islam, Edris Alam, Md Kamrul Islam
Environmental Sciences Europe, vol. 36, no. 1. Published 2024-09-02. DOI: 10.1186/s12302-024-00981-y. Impact factor: 6.0 (JCR Q1, Environmental Sciences). Full text: https://link.springer.com/article/10.1186/s12302-024-00981-y
Citations: 0
Abstract
Groundwater is a primary source of drinking water for billions of people worldwide. It plays a crucial role in irrigation, domestic, and industrial use, and contributes significantly to drought resilience in many regions. However, excessive groundwater discharge has left many areas vulnerable to potable water shortages. Assessing groundwater potential zones (GWPZ) is therefore essential for implementing sustainable management practices that keep groundwater available for present and future generations. This study delineates areas with high groundwater potential in the Bankura district of West Bengal using four machine learning methods: Random Forest (RF), Adaptive Boosting (AdaBoost), Extreme Gradient Boosting (XGBoost), and Voting Ensemble (VE). The models were trained on 161 data points, 70% of the dataset, to identify significant correlations between the presence and absence of groundwater in the region. Among the methods, RF and XGBoost proved to be the most effective in mapping groundwater potential, suggesting their applicability in other regions with similar hydrogeological conditions.
RF achieved a precision of 0.919, recall of 0.971, F1-score of 0.944, and accuracy of 0.943, indicating a strong capability to predict groundwater zones with few false positives and false negatives. AdaBoost matched RF on every metric (precision: 0.919, recall: 0.971, F1-score: 0.944, accuracy: 0.943). XGBoost slightly outperformed both, with higher precision (0.944), F1-score (0.958), and accuracy (0.957) at the same recall (0.971). The VE approach mirrored XGBoost's metrics (precision: 0.944, recall: 0.971, F1-score: 0.958, accuracy: 0.957), indicating that combining the individual models matched the best single model and improved on RF and AdaBoost. Groundwater potentiality varied significantly across the Bankura district, with areas of very low potentiality accounting for 41.81% and areas of very high potentiality for 24.35%. The uncertainty in predictions ranged from 0.0 to 0.75 across the study area, reflecting the variability in groundwater availability and the need for targeted management strategies.
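The four reported metrics follow from the standard confusion-matrix definitions. The sketch below is illustrative only: the abstract does not give the confusion matrices, so the counts used here (a hypothetical 70-point test set) are assumptions chosen because they reproduce the reported figures after rounding.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": round(precision, 3),
        "recall": round(recall, 3),
        "f1": round(f1, 3),
        "accuracy": round(accuracy, 3),
    }


# ASSUMED counts (not from the paper): consistent with the RF/AdaBoost figures.
rf_like = classification_metrics(tp=34, fp=3, fn=1, tn=32)

# ASSUMED counts (not from the paper): consistent with the XGBoost/VE figures.
xgb_like = classification_metrics(tp=34, fp=2, fn=1, tn=33)

print(rf_like)
print(xgb_like)
```

Under these assumed counts, the two groups of models differ by a single false positive, which is why recall is identical while precision, F1-score, and accuracy shift slightly.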
In summary, this study highlights the critical need for assessing and managing groundwater resources effectively using advanced machine learning techniques. The findings provide a foundation for better groundwater management practices, ensuring sustainable use and conservation in Bankura district and beyond.
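The abstract does not say how the Voting Ensemble combines its base models. As one possible reading, the sketch below shows hard (majority) voting over per-model class labels; the function name `majority_vote` and the example predictions are hypothetical, and the authors may have used a different scheme such as soft voting on predicted probabilities.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by majority vote.

    predictions_per_model: list of equal-length lists, one per model,
    where each entry is a class label (1 = groundwater present, 0 = absent).
    """
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        # Count how many models voted for each label at sample i.
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical predictions from three base models on five locations.
rf_preds = [1, 0, 1, 1, 0]
ada_preds = [1, 0, 0, 1, 0]
xgb_preds = [1, 1, 1, 1, 0]

ensemble_preds = majority_vote([rf_preds, ada_preds, xgb_preds])
print(ensemble_preds)
```

With an odd number of binary classifiers there are no ties, so the combined label is always well defined. In practice a library implementation such as scikit-learn's `VotingClassifier` covers both hard and soft voting.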
About the journal:
ESEU is an international journal, focusing primarily on Europe, with a broad scope covering all aspects of environmental sciences, with regulation as its main topic.
ESEU will discuss the entanglement between environmental sciences and regulation because, in recent years, there have been misunderstandings and even disagreement between stakeholders in these two areas. ESEU will help to improve the comprehension of issues between environmental sciences and regulation.
ESEU will be an outlet from the German-speaking (DACH) countries to Europe and an inlet from Europe to the DACH countries regarding environmental sciences and regulation.
Moreover, ESEU will facilitate the exchange of ideas and interaction between Europe and the DACH countries regarding environmental regulatory issues.
Although Europe is at the center of ESEU, the journal will not exclude the rest of the world, because regulatory issues pertaining to environmental sciences can be fully seen only from a global perspective.