Evaluating perceptual and semantic interpretability of saliency methods: A case study of melanoma

Applied AI Letters, Pub Date: 2022-09-13, DOI: 10.1002/ail2.77

Harshit Bokadia, Scott Cheng-Hsin Yang, Zhaobin Li, Tomas Folke, Patrick Shafto

Citations: 3

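Abstract

In order to be useful, XAI explanations have to be faithful to the AI system they seek to elucidate and also interpretable to the people who engage with them. Multiple algorithmic methods exist for assessing faithfulness, but not for interpretability, which is typically assessed only through expensive user studies. Here we propose two complementary metrics to algorithmically evaluate the interpretability of saliency map explanations. The first metric assesses perceptual interpretability by quantifying the visual coherence of the saliency map. The second metric assesses semantic interpretability by capturing the degree of overlap between the saliency map and textbook features, that is, the features human experts use to make a classification. We use a melanoma dataset and a deep neural network classifier as a case study to explore how our two interpretability metrics relate to each other and to a faithfulness metric. Across six commonly used saliency methods, we find that none achieves high scores on all three metrics for all test images, but that different methods perform well in different regions of the data distribution. This variation between methods can be leveraged to consistently achieve high interpretability and faithfulness by using our metrics to inform saliency mask selection on a case-by-case basis. Our interpretability metrics provide a new way to evaluate saliency-based explanations and allow for the adaptive combination of saliency-based explanation methods.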
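To make the ideas of overlap scoring and case-by-case selection concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not define the metrics, so this sketch assumes intersection over union as the overlap measure, a top-fraction threshold to binarize the saliency map, and a "best worst-case score" rule for choosing a method per image. All function names, metric keys, and method names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column) coordinate


def top_fraction_mask(saliency: List[List[float]], fraction: float = 0.25) -> Set[Pixel]:
    """Binarize a saliency map by keeping the most salient `fraction` of pixels.

    Assumption: thresholding by rank is one possible choice, not the paper's.
    """
    pixels = [(value, (r, c))
              for r, row in enumerate(saliency)
              for c, value in enumerate(row)]
    pixels.sort(key=lambda item: item[0], reverse=True)
    keep = max(1, round(fraction * len(pixels)))
    return {coord for _, coord in pixels[:keep]}


def overlap_score(saliency_mask: Set[Pixel], feature_mask: Set[Pixel]) -> float:
    """Intersection over union of a saliency mask and an expert-feature mask.

    Assumption: IoU stands in for the unspecified semantic overlap measure.
    """
    union = saliency_mask | feature_mask
    if not union:
        return 0.0
    return len(saliency_mask & feature_mask) / len(union)


def select_method(scores: Dict[str, Dict[str, float]]) -> str:
    """For one image, pick the method whose weakest metric is highest.

    Assumption: this max-min rule is illustrative; any rule that combines
    the three scores could be substituted.
    """
    return max(scores, key=lambda name: min(scores[name].values()))
```

A possible usage, with made-up scores for two hypothetical methods on a single image:

```python
per_image_scores = {
    "method_a": {"perceptual": 0.9, "semantic": 0.2, "faithfulness": 0.8},
    "method_b": {"perceptual": 0.6, "semantic": 0.5, "faithfulness": 0.7},
}
best = select_method(per_image_scores)
```

The max-min rule favors a method that is acceptable on every metric over one that excels on some and fails on another, which matches the abstract's goal of achieving high interpretability and faithfulness together.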
