Counterfactual Shapley Values for Explaining Reinforcement Learning
Yiwei Shi, Qi Zhang, Kevin McAreavey, Weiru Liu
arXiv - CS - Artificial Intelligence, 2024-08-05, arXiv:2408.02529
Citation count: 0
Abstract
This paper introduces a novel approach, Counterfactual Shapley Values (CSV),
which enhances explainability in reinforcement learning (RL) by integrating
counterfactual analysis with Shapley values. The approach aims to quantify and
compare the contributions of different state dimensions to various action
choices. To more accurately analyze these impacts, we introduce new
characteristic value functions, the "Counterfactual Difference Characteristic
Value" and the "Average Counterfactual Difference Characteristic Value." These
functions help calculate the Shapley values to evaluate the differences in
contributions between optimal and non-optimal actions. Experiments across
several RL domains, such as GridWorld, FrozenLake, and Taxi, demonstrate the
effectiveness of the CSV method. The results show that this method not only
improves transparency in complex RL systems but also quantifies the differences
across various decisions.
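
To make the idea concrete, the following is a minimal sketch of how Shapley values over state dimensions could be computed from a counterfactual-difference style characteristic function. It is not the paper's implementation: the abstract does not give the exact definitions, so the form v(S) = Q(s_S, a*) - Q(s_S, a') is an assumption, as are the baseline-masking scheme, the toy Q-function `toy_q`, and all function names, which are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List, Sequence


def shapley_values(n: int, v: Callable[[FrozenSet[int]], float]) -> List[float]:
    """Exact Shapley values for n players under characteristic function v."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that excludes player i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of dimension i to coalition s.
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi


def toy_q(state: Sequence[float], action: int) -> float:
    """Hypothetical Q-function over a 2-dimensional state and two actions."""
    x, y = state
    return 2.0 * x + y if action == 0 else x + 3.0 * y


def counterfactual_difference(state, baseline, a_star, a_alt, q=toy_q):
    """Assumed characteristic function: v(S) = Q(s_S, a_star) - Q(s_S, a_alt).

    s_S keeps the dimensions in S from `state` and takes every other
    dimension from `baseline` (an assumed masking scheme).
    """
    def v(subset: FrozenSet[int]) -> float:
        masked = [state[i] if i in subset else baseline[i]
                  for i in range(len(state))]
        return q(masked, a_star) - q(masked, a_alt)
    return v


state, baseline = [3.0, 1.0], [0.0, 0.0]
v = counterfactual_difference(state, baseline, a_star=0, a_alt=1)
phi = shapley_values(len(state), v)
print(phi)  # [3.0, -2.0]
```

In this toy setting the first state dimension pushes the agent toward action 0 (contribution 3.0) while the second pushes toward action 1 (contribution -2.0), and the two contributions sum to the full Q-value gap of 1.0 between the actions. Exact enumeration visits every subset of dimensions, so it is only practical for small state spaces; larger ones would need sampling-based approximations.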