Clarity in complexity: how aggregating explanations resolves the disagreement problem
Authors: Oana Mitruț, Gabriela Moise, Alin Moldoveanu, Florica Moldoveanu, Marius Leordeanu, Livia Petrescu
Journal: Artificial Intelligence Review, volume 57, issue 12 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 10.7)
Publication date: 2024-10-19
DOI: 10.1007/s10462-024-10952-7
Article: https://link.springer.com/article/10.1007/s10462-024-10952-7
PDF: https://link.springer.com/content/pdf/10.1007/s10462-024-10952-7.pdf
Citations: 0
Abstract
The Rashômon Effect, as applied in Explainable Machine Learning, refers both to the disagreement between the explanations provided by different attribution explainers and to the dissimilarity across multiple explanations generated by a single explainer for one instance of the dataset (differences in feature importances and in their associated signs and ranks). This is an undesirable outcome, especially in sensitive domains such as healthcare or finance. We propose a method inspired by textual case-based reasoning for aligning explanations from various explainers in order to resolve the disagreement and dissimilarity problems. We iteratively generated 100 explanations for each instance from six popular datasets using three prevalent feature attribution explainers, LIME, Anchors and SHAP (in its Tree SHAP and Kernel SHAP variants), and subsequently applied a global cluster-based aggregation strategy that quantifies alignment and reveals similarities and associations between explanations. We evaluated our method on a binary classification task by weighting the \(k\)-NN algorithm with agreed feature overlap explanation weights and comparing it to a non-weighted \(k\)-NN predictor. We also compared the weighted \(k\)-NN algorithm using aggregated feature overlap explanation weights to the weighted \(k\)-NN algorithm using weights produced by a single explanation method (LIME, SHAP or Anchors). Our global alignment method benefited most from hybridization with feature importance scores (information gain), which was essential for obtaining a more accurate estimate of disagreement, for enabling explainers to reach a consensus across multiple explanations, and for supporting effective model learning through improved classification performance.
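The two ideas in the abstract, measuring how far explanations disagree and feeding agreed feature weights into a \(k\)-NN classifier, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frequency-based `overlap_weights` function is a simplified stand-in for the paper's global cluster-based aggregation, and every function name, attribution vector and data point below is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def top_k_features(attributions, k):
    """Indices of the k features with the largest absolute attribution."""
    order = sorted(range(len(attributions)),
                   key=lambda i: abs(attributions[i]), reverse=True)
    return set(order[:k])


def feature_agreement(attr_a, attr_b, k):
    """Fraction of top-k features shared by two explanations (1.0 = full agreement)."""
    return len(top_k_features(attr_a, k) & top_k_features(attr_b, k)) / k


def overlap_weights(explanations, k):
    """Assumed aggregation rule: weight each feature by how often it appears
    in the top-k set across all explanations, normalized to sum to 1."""
    counts = Counter()
    for attr in explanations:
        counts.update(top_k_features(attr, k))
    total = sum(counts.values())
    return [counts[i] / total for i in range(len(explanations[0]))]


def weighted_knn_predict(train_X, train_y, query, weights, n_neighbors=3):
    """Majority vote among the nearest neighbours under a feature-weighted
    Euclidean distance; weights that are all 1.0 give plain k-NN."""
    def dist(row):
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(weights, row, query)))
    nearest = sorted(range(len(train_X)),
                     key=lambda i: dist(train_X[i]))[:n_neighbors]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Hypothetical attributions for one instance with three features.
lime_attr = [0.6, -0.3, 0.02]
shap_attr = [0.5, 0.01, -0.4]
anchors_attr = [1.0, 1.0, 0.0]  # 1.0 marks a feature used in the anchor rule

agreement = feature_agreement(lime_attr, shap_attr, k=2)
weights = overlap_weights([lime_attr, shap_attr, anchors_attr], k=2)

# Toy data: features 0 and 1 carry the class signal, feature 2 is noise.
train_X = [[0.0, 0.0, 0.0], [0.1, 0.1, 4.0], [1.0, 1.0, 2.0], [0.9, 1.0, 1.0]]
train_y = [0, 0, 1, 1]
query = [0.0, 0.0, 2.0]

plain_pred = weighted_knn_predict(train_X, train_y, query, [1.0, 1.0, 1.0])
weighted_pred = weighted_knn_predict(train_X, train_y, query, weights)
```

On this toy data the LIME and SHAP vectors share only one of their top-2 features, so `agreement` is 0.5. The unweighted predictor is pulled toward class 1 by the noisy third feature, while the overlap-weighted predictor returns class 0, which matches the query's first two features. Real experiments would replace the hand-written vectors with attributions produced by the explainers themselves.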
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.