An illustration of model agnostic explainability methods applied to environmental data

IF 1.5 · Q4 Environmental Sciences · Environmetrics · Pub Date: 2022-10-25 · DOI: 10.1002/env.2772
Christopher K. Wikle, Abhirup Datta, Bhava Vyasa Hari, Edward L. Boone, Indranil Sahoo, Indulekha Kavila, Stefano Castruccio, Susan J. Simmons, Wesley S. Burr, Won Chang
Citations: 3

An illustration of model agnostic explainability methods applied to environmental data

Historically, two primary criticisms statisticians have of machine learning and deep neural models are their lack of uncertainty quantification and their inability to do inference (i.e., to explain what inputs are important). Explainable AI has developed in the last few years as a sub-discipline of computer science and machine learning to mitigate these concerns (as well as concerns of fairness and transparency in deep modeling). In this article, our focus is on explaining which inputs are important in models for predicting environmental data. In particular, we focus on three general methods for explainability that are model agnostic and thus applicable across a breadth of models without internal explainability: “feature shuffling”, “interpretable local surrogates”, and “occlusion analysis”. We describe particular implementations of each of these and illustrate their use with a variety of models, all applied to the problem of long-lead forecasting of monthly soil moisture in the North American corn belt given sea surface temperature anomalies in the Pacific Ocean.
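The first of the three methods named in the abstract, "feature shuffling", is commonly known as permutation feature importance: permute one input column at a time and measure how much the model's predictive score degrades. The sketch below is a minimal, generic illustration of that idea in Python, not the authors' implementation; the function names, the toy model, and the MSE metric are all assumptions for illustration.

```python
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=10, seed=0):
    """Feature-shuffling importance: increase in the error metric when
    a single column of X is randomly permuted (averaged over repeats)."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, model(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the X[:, j] <-> y link
            scores.append(metric(y, model(Xp)))
        # larger value = shuffling this feature hurts more = more important
        importances[j] = np.mean(scores) - baseline
    return importances

# Toy example: the response depends only on column 0.
X = np.random.default_rng(1).normal(size=(200, 3))
y = 2.0 * X[:, 0]
model = lambda X: 2.0 * X[:, 0]               # stand-in for a fitted model
mse = lambda y, yhat: np.mean((y - yhat) ** 2)
imp = permutation_importance(model, X, y, mse)
```

On this toy problem, shuffling column 0 inflates the MSE substantially while columns 1 and 2 are unaffected, which is exactly the signal a model-agnostic importance method is after; only predictions from the model are needed, never its internals.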

Source journal: Environmetrics (Environmental Sciences)
CiteScore: 2.90
Self-citation rate: 17.60%
Articles per year: 67
Review time: 18-36 weeks
About the journal: Environmetrics, the official journal of The International Environmetrics Society (TIES), an Association of the International Statistical Institute, is devoted to the dissemination of high-quality quantitative research in the environmental sciences. The journal welcomes pertinent and innovative submissions from quantitative disciplines developing new statistical and mathematical techniques, methods, and theories that solve modern environmental problems. Articles must proffer substantive, new statistical or mathematical advances to answer important scientific questions in the environmental sciences, or must develop novel or enhanced statistical methodology with clear applications to environmental science. New methods should be illustrated with recent environmental data.