Be careful what you explain: Benefits and costs of explainable AI in a simulated medical task
Tobias Rieger, Dietrich Manzey, Benigna Meussling, Linda Onnasch, Eileen Roesler
Computers in Human Behavior: Artificial Humans, Volume 1, Issue 2, Article 100021, August 2023. DOI: 10.1016/j.chbah.2023.100021
Abstract
We investigated the impact of explainability instructions about system limitations on trust behavior and trust attitude when using an artificial intelligence (AI) support agent to perform a simulated medical task. In an online experiment (N = 128), participants performed a visual estimation task in a simulated medical setting (i.e., estimating the percentage of bacteria in a visual stimulus). All participants were supported by an AI that gave perfect recommendations for all but one color of bacteria (i.e., an error-prone color with 50% reliability). We manipulated between subjects whether participants knew about the error-prone color (XAI condition) or not (nonXAI condition). The analyses revealed that participants showed higher trust behavior (i.e., lower deviation from the AI recommendation) in the non-error-prone trials in the XAI condition. Moreover, participants showed lower trust behavior for the error-prone color in the XAI condition than in the nonXAI condition. However, this behavioral adaptation applied only to the subset of error-prone trials in which the AI gave correct recommendations, and not to the actually erroneous trials. Thus, designing explainable AI systems can also bring about inadequate behavioral adaptations: explainability was associated with benefits (i.e., more adequate behavior in non-error-prone trials) but also with costs (stronger changes to the AI recommendations in correct error-prone trials).