{"title":"Semifactual Explanations for Reinforcement Learning","authors":"Jasmina Gajcin, Jovan Jeromela, Ivana Dusparic","doi":"arxiv-2409.05435","DOIUrl":null,"url":null,"abstract":"Reinforcement Learning (RL) is a learning paradigm in which the agent learns\nfrom its environment through trial and error. Deep reinforcement learning (DRL)\nalgorithms represent the agent's policies using neural networks, making their\ndecisions difficult to interpret. Explaining the behaviour of DRL agents is\nnecessary to advance user trust, increase engagement, and facilitate\nintegration with real-life tasks. Semifactual explanations aim to explain an\noutcome by providing \"even if\" scenarios, such as \"even if the car were moving\ntwice as slowly, it would still have to swerve to avoid crashing\". Semifactuals\nhelp users understand the effects of different factors on the outcome and\nsupport the optimisation of resources. While extensively studied in psychology\nand even utilised in supervised learning, semifactuals have not been used to\nexplain the decisions of RL systems. In this work, we develop a first approach\nto generating semifactual explanations for RL agents. We start by defining five\nproperties of desirable semifactual explanations in RL and then introducing\nSGRL-Rewind and SGRL-Advance, the first algorithms for generating semifactual\nexplanations in RL. We evaluate the algorithms in two standard RL environments\nand find that they generate semifactuals that are easier to reach, represent\nthe agent's policy better, and are more diverse compared to baselines. Lastly,\nwe conduct and analyse a user study to assess the participant's perception of\nsemifactual explanations of the agent's actions.","PeriodicalId":501479,"journal":{"name":"arXiv - CS - Artificial Intelligence","volume":"9 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05435","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Reinforcement Learning (RL) is a learning paradigm in which the agent learns
from its environment through trial and error. Deep reinforcement learning (DRL)
algorithms represent the agent's policies using neural networks, making their
decisions difficult to interpret. Explaining the behaviour of DRL agents is necessary to foster user trust, increase engagement, and facilitate integration with real-life tasks. Semifactual explanations aim to explain an
outcome by providing "even if" scenarios, such as "even if the car were moving
twice as slowly, it would still have to swerve to avoid crashing". Semifactuals
help users understand the effects of different factors on the outcome and
support the optimisation of resources. While extensively studied in psychology
and even utilised in supervised learning, semifactuals have not been used to
explain the decisions of RL systems. In this work, we develop a first approach to generating semifactual explanations for RL agents. We begin by defining five properties of desirable semifactual explanations in RL, and then introduce SGRL-Rewind and SGRL-Advance, the first algorithms for generating semifactual
explanations in RL. We evaluate the algorithms in two standard RL environments
and find that they generate semifactuals that are easier to reach, better represent the agent's policy, and are more diverse than those produced by baselines. Lastly, we conduct and analyse a user study assessing participants' perception of semifactual explanations of the agent's actions.
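The abstract does not specify how SGRL-Rewind and SGRL-Advance work, so the following is only a minimal illustrative sketch of the underlying "even if" idea: perturb one feature of a state and check how far the perturbation can go while the agent's chosen action stays the same. The function and parameter names (`semifactual_along_feature`, `policy`, `feature`, `scales`) are hypothetical and not the paper's API.

```python
import numpy as np

def semifactual_along_feature(policy, state, feature, scales):
    """Illustrative sketch: find the most extreme scaling of one state
    feature for which the policy's greedy action is unchanged, yielding
    a statement such as "even if feature f were k times its value, the
    agent would still choose action a". Not the paper's SGRL algorithms.
    """
    base_action = policy(state)
    best = None
    for k in scales:
        perturbed = state.copy()
        perturbed[feature] *= k
        if policy(perturbed) == base_action:
            best = (k, perturbed)  # action invariant up to this scale
        else:
            break  # first scale that flips the action; stop searching
    return base_action, best

# Toy usage with a hand-coded policy: swerve (1) if speed > 0.5, else keep lane (0).
policy = lambda s: int(s[0] > 0.5)
state = np.array([2.0])  # current speed
action, sf = semifactual_along_feature(
    policy, state, feature=0, scales=np.linspace(1.0, 0.1, 10)
)
# sf[0] == 0.3, i.e. "even if the car were moving at 30% of its speed,
# it would still swerve" -- mirroring the abstract's example.
```

Real semifactual generation would additionally need to account for the reachability, plausibility, and diversity properties the paper defines, which a single-feature scan like this does not capture.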