Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning.

IF 3.2 · Zone 3: Environmental Science and Ecology · JCR Q3: ENGINEERING, ENVIRONMENTAL · Environmental Geochemistry and Health · Pub Date: 2024-10-29 · DOI: 10.1007/s10653-024-02201-1
Xuan Li, Guohua Liang, Lei Wang, Yuesuo Yang, Yuanyin Li, Zhongguo Li, Bin He, Guoli Wang
Citation count: 0

Abstract

Groundwater nitrate contamination poses a potential threat to human health and environmental safety worldwide. This study proposes an interpretable stacking ensemble learning (SEL) framework that improves and interprets spatial predictions of groundwater nitrate by integrating a two-level heterogeneous SEL model with SHapley Additive exPlanations (SHAP). In the SEL model, five commonly used machine learning models serve as base models (gradient boosting decision tree, extreme gradient boosting, random forest, extremely randomized trees, and k-nearest neighbor), and their outputs are used as input data for the meta-model. When applied to the Eden Valley in the UK, an agriculturally intensive area, the SEL model outperformed the individual models in predictive performance and generalization ability. The model indicates a mean groundwater nitrate level of 2.22 mg/L-N, with 2.46% of sandstone aquifers exceeding the drinking-water standard of 11.3 mg/L-N. Alarmingly, 8.74% of areas with high groundwater nitrate remain outside the designated nitrate vulnerable zones. Moreover, SHAP identified transmissivity, baseflow index, hydraulic conductivity, the percentage of arable land, and the soil C:N ratio as the top five driving factors of groundwater nitrate. With nitrate threatening groundwater globally, this study presents a high-accuracy, interpretable, and flexible modeling framework that enhances understanding of the mechanisms behind groundwater nitrate contamination. The results suggest that the interpretable SEL framework holds promise for providing evidence for environmental management, water resource protection, and sustainable development, particularly in data-scarce areas.

Source journal
Environmental Geochemistry and Health (Environmental Science: Engineering, Environmental)
CiteScore: 8.00
Self-citation rate: 4.80%
Articles published: 279
Review time: 4.2 months
Journal description: Environmental Geochemistry and Health publishes original research papers and review papers across the broad field of environmental geochemistry. Environmental geochemistry and health establishes and explains links between the natural or disturbed chemical composition of the earth’s surface and the health of plants, animals and people. Beneficial elements regulate or promote enzymatic and hormonal activity whereas other elements may be toxic. Bedrock geochemistry controls the composition of soil and hence that of water and vegetation. Environmental issues, such as pollution, arising from the extraction and use of mineral resources, are discussed. The effects of contaminants introduced into the earth’s geochemical systems are examined. Geochemical surveys of soil, water and plants show how major and trace elements are distributed geographically. Associated epidemiological studies reveal the possibility of causal links between the natural or disturbed geochemical environment and disease. Experimental research illuminates the nature or consequences of natural or disturbed geochemical processes. The journal particularly welcomes novel research linking environmental geochemistry and health issues on such topics as: heavy metals (including mercury), persistent organic pollutants (POPs), and mixed chemicals emitted through human activities, such as uncontrolled recycling of electronic-waste; waste recycling; surface-atmospheric interaction processes (natural and anthropogenic emissions, vertical transport, deposition, and physical-chemical interaction) of gases and aerosols; phytoremediation/restoration of contaminated sites; food contamination and safety; environmental effects of medicines; effects and toxicity of mixed pollutants; speciation of heavy metals/metalloids; effects of mining; disturbed geochemistry from human behavior, natural or man-made hazards; particle and nanoparticle toxicology; risk and the vulnerability of populations, etc.
Latest articles in this journal
Correction to: Environmental and human health risk of potentially toxic metals in freshwater and brackish water Nile tilapia (Oreochromis niloticus) aquaculture.
Correction: Ecological, environmental risks and sources of arsenic and other elements in soils of Tuotuo River region, Qinghai-Tibet Plateau.
Identifying the spatial pattern and driving factors of nitrate in groundwater using a novel framework of interpretable stacking ensemble learning.
Mercury speciation in environmental samples associated with artisanal small-scale gold mines using a novel solid-phase extraction approach to sample collection and preservation.
Correction: Synergistic mitigation of cadmium stress in rice (Oryza sativa L.) through combined selenium, calcium, and magnesium supplementation.