Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality

Journal of Artificial Intelligence Research, pp. 833-855 | Pub Date: 2022-11-18 | DOI: 10.1613/jair.1.14019 | IF 4.5 | Region 3 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Radwa El Shawi, M. Al-Mallah


Citations: 1

Abstract



Machine learning models are used in many fields and disciplines, some of which, such as healthcare, require a high level of accountability and transparency. With the General Data Protection Regulation (GDPR), the plausibility and verifiability of predictions made by machine learning models have become essential. A widely used category of explanation techniques attempts to explain a model's predictions by quantifying the importance score of each input feature. However, summarizing such scores to provide human-interpretable explanations is challenging. Another category of explanation techniques focuses on learning a domain representation in terms of high-level human-understandable concepts and then using them to explain predictions. These explanations are hampered by the way the concepts are constructed, which is not intrinsically interpretable. To this end, we propose Concept-based Local Explanations with Feedback (CLEF), a novel local model-agnostic explanation framework for learning a set of high-level transparent concept definitions in high-dimensional tabular data that uses clinician-labeled concepts rather than raw features. CLEF maps the raw input features to high-level intuitive concepts and then decomposes the evidence for the prediction of the instance being explained into concepts. In addition, the proposed framework generates counterfactual explanations, suggesting the minimum changes in the instance's concept-based explanation that will lead to a different prediction. We demonstrate, with simulated user feedback on predicting the risk of mortality, that such direct feedback is more effective than other techniques that rely on hand-labelled or automatically extracted concepts in learning concepts that align with ground-truth concept definitions.
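The abstract names three steps: mapping raw features to concepts, attributing a prediction to those concepts, and finding a minimal counterfactual change. The toy sketch below illustrates that general idea only. It is not the CLEF algorithm, which the abstract does not specify. The concept rules, the `black_box` scoring function, the representative "on"/"off" feature values, and the single-concept ablation used for attribution are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical concept definitions (not from the paper): each concept has a
# test on raw features plus representative feature values used to switch it
# on or off when probing the model.
CONCEPTS = {
    "elderly": {"test": lambda x: x["age"] >= 65,
                "on": {"age": 75}, "off": {"age": 50}},
    "hypertensive": {"test": lambda x: x["systolic_bp"] >= 140,
                     "on": {"systolic_bp": 160}, "off": {"systolic_bp": 120}},
    "low_fitness": {"test": lambda x: x["mets"] < 6,
                    "on": {"mets": 4}, "off": {"mets": 10}},
}


def black_box(x):
    """Toy stand-in for an opaque risk model; returns a risk score in percent."""
    score = 10
    score += 30 if x["age"] >= 65 else 0
    score += 20 if x["systolic_bp"] >= 140 else 0
    score += 30 if x["mets"] < 6 else 0
    return score


def to_concepts(x):
    """Map raw features to a dict of concept activations."""
    return {name: c["test"](x) for name, c in CONCEPTS.items()}


def flip(x, names):
    """Copy x with each named concept toggled via its representative values."""
    y = dict(x)
    for name in names:
        c = CONCEPTS[name]
        y.update(c["off"] if c["test"](x) else c["on"])
    return y


def concept_contributions(x):
    """Score attributable to each concept's current state versus its opposite."""
    base = black_box(x)
    return {name: base - black_box(flip(x, [name])) for name in CONCEPTS}


def minimal_counterfactual(x, threshold=50):
    """Smallest set of concept flips that changes the thresholded prediction."""
    base_label = black_box(x) >= threshold
    names = list(CONCEPTS)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if (black_box(flip(x, subset)) >= threshold) != base_label:
                return subset
    return None


patient = {"age": 70, "systolic_bp": 150, "mets": 8}
print(to_concepts(patient))
print(concept_contributions(patient))
print(minimal_counterfactual(patient))
```

For this toy patient a single concept flip is enough to move the score below the assumed 50% threshold. A real system would need learned or clinician-validated concept definitions and a more careful attribution method than one-at-a-time ablation.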


Source journal

Journal of Artificial Intelligence Research (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 9.60

Self-citation rate: 4.00%

Review time: 4 months

About the journal: JAIR (ISSN 1076-9757) covers all areas of artificial intelligence (AI), publishing refereed research articles, survey articles, and technical notes. Established in 1993 as one of the first electronic scientific journals, JAIR is indexed by INSPEC, Science Citation Index, and MathSciNet. JAIR reviews papers within approximately three months of submission and publishes accepted articles on the internet immediately upon receiving the final versions. JAIR articles are published for free distribution on the internet by the AI Access Foundation, and for purchase in bound volumes by AAAI Press.

Latest articles from the journal

Collective Belief Revision
Competitive Equilibria with a Constant Number of Chores
Improving Resource Allocations by Sharing in Pairs
A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks
Asymptotics of K-Fold Cross Validation