Insights into the prediction uncertainty of machine-learning-based digital soil mapping through a local attribution approach

IF 5.8 | Zone 2, Agricultural and Forestry Sciences | Q1 Soil Science | Soil | Pub Date: 2024-09-30 | DOI: 10.5194/soil-10-679-2024
Jeremy Rohmer, Stephane Belbeze, Dominique Guyonnet
Citations: 0

Insights into the prediction uncertainty of machine-learning-based digital soil mapping through a local attribution approach
Abstract. Machine learning (ML) models have become key ingredients for digital soil mapping. To improve the interpretability of their predictions, diagnostic tools such as the widely used local attribution approach known as SHapley Additive exPlanations (SHAP) have been developed. However, the analysis of ML model predictions is only one part of the problem, and there is an interest in obtaining deeper insights into the drivers of the prediction uncertainty as well, i.e. explaining why an ML model is confident given the set of chosen covariate values in addition to why the ML model delivered some particular results. In this study, we show how to apply SHAP to local prediction uncertainty estimates for a case of urban soil pollution – namely, the presence of petroleum hydrocarbons in soil in Toulouse (France), which pose a health risk via vapour intrusion into buildings, direct soil ingestion, and groundwater contamination. Our results show that the drivers of the prediction best estimates are not necessarily the drivers of confidence in these predictions, and we identify those leading to a reduction in uncertainty. Our study suggests that decisions regarding data collection and covariate characterisation as well as communication of the results should be made accordingly.
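The idea of attributing both the best estimate and its uncertainty to covariates can be illustrated with a minimal sketch. It is not the authors' implementation: the three-member toy ensemble, the covariate values, the zero baseline, and the use of the ensemble standard deviation as the uncertainty measure are all assumptions made for illustration. Shapley values are computed exactly by enumerating coalitions, which is only feasible for a handful of covariates.

```python
from itertools import combinations
from math import factorial
from statistics import mean, pstdev

# Hypothetical toy ensemble (NOT the study's model): each member maps three
# covariates x[0], x[1], x[2] to a predicted concentration.
MEMBERS = [
    lambda x: 2.0 * x[0] + 0.5 * x[1],
    lambda x: 2.0 * x[0] + 0.5 * x[1] + 1.0 * x[2],
    lambda x: 2.0 * x[0] + 0.5 * x[1] - 1.0 * x[2],
]


def best_estimate(x):
    """Prediction best estimate: mean over the ensemble members."""
    return mean([m(x) for m in MEMBERS])


def uncertainty(x):
    """Assumed uncertainty measure: spread (std) across ensemble members."""
    return pstdev([m(x) for m in MEMBERS])


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent covariates take baseline values."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude covariate i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


x = [1.0, 2.0, 3.0]          # hypothetical covariate values at one location
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point

phi_pred = shapley_values(best_estimate, x, baseline)
phi_unc = shapley_values(uncertainty, x, baseline)

print("attribution of best estimate:", phi_pred)
print("attribution of uncertainty:  ", phi_unc)
```

In this toy setting the best estimate is attributed to the first two covariates, while the uncertainty is attributed entirely to the third, which mirrors the abstract's point that the drivers of a prediction need not be the drivers of confidence in it.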
Source journal
Soil
Agricultural and Biological Sciences: Soil Science
CiteScore: 10.80
Self-citation rate: 2.90%
Articles published: 44
Review time: 30 weeks
About the journal: SOIL is an international scientific journal dedicated to the publication and discussion of high-quality research in the field of soil system sciences. SOIL is at the interface between the atmosphere, lithosphere, hydrosphere, and biosphere. SOIL publishes scientific research that contributes to understanding the soil system and its interaction with humans and the entire Earth system. The scope of the journal includes all topics that fall within the study of soil science as a discipline, with an emphasis on studies that integrate soil science with other sciences (hydrology, agronomy, socio-economics, health sciences, atmospheric sciences, etc.).
Latest articles from this journal
Cr(VI) reduction, electricity production, and microbial resistance variation in paddy soil under microbial fuel cell operation
Insights into the prediction uncertainty of machine-learning-based digital soil mapping through a local attribution approach
Cultivation reduces quantities of mineral-organic associations in the form of amorphous coprecipitates
Benchmarking soil multifunctionality
Depth extrapolation of field-scale soil moisture time series derived with cosmic-ray neutron sensing (CRNS) using the soil moisture analytical relationship (SMAR) model