Care to Explain? AI Explanation Types Differentially Impact Chest Radiograph Diagnostic Performance and Physician Trust in AI.

Radiology (2024;313(2):e233261) · Impact Factor 12.1 · CAS Tier 1 (Medicine) · JCR Q1, Radiology, Nuclear Medicine & Medical Imaging · Pub Date: 2024-11-01 · DOI: 10.1148/radiol.233261 · Citations: 0
Drew Prinster, Amama Mahmood, Suchi Saria, Jean Jeudy, Cheng Ting Lin, Paul H Yi, Chien-Ming Huang
{"title":"Care to Explain? AI Explanation Types Differentially Impact Chest Radiograph Diagnostic Performance and Physician Trust in AI.","authors":"Drew Prinster, Amama Mahmood, Suchi Saria, Jean Jeudy, Cheng Ting Lin, Paul H Yi, Chien-Ming Huang","doi":"10.1148/radiol.233261","DOIUrl":null,"url":null,"abstract":"<p><p>Background It is unclear whether artificial intelligence (AI) explanations help or hurt radiologists and other physicians in AI-assisted radiologic diagnostic decision-making. Purpose To test whether the type of AI explanation and the correctness and confidence level of AI advice impact physician diagnostic performance, perception of AI advice usefulness, and trust in AI advice for chest radiograph diagnosis. Materials and Methods A multicenter, prospective randomized study was conducted from April 2022 to September 2022. Two types of AI explanations prevalent in medical imaging-local (feature-based) explanations and global (prototype-based) explanations-were a between-participant factor, while AI correctness and confidence were within-participant factors. Radiologists (task experts) and internal or emergency medicine physicians (task nonexperts) received a chest radiograph to read; then, simulated AI advice was presented. Generalized linear mixed-effects models were used to analyze the effects of the experimental variables on diagnostic accuracy, efficiency, physician perception of AI usefulness, and \"simple trust\" (ie, speed of alignment with or divergence from AI advice); the control variables included knowledge of AI, demographic characteristics, and task expertise. Holm-Sidak corrections were used to adjust for multiple comparisons. Results Data from 220 physicians (median age, 30 years [IQR, 28-32.75 years]; 146 male participants) were analyzed. Compared with global AI explanations, local AI explanations yielded better physician diagnostic accuracy when the AI advice was correct (β = 0.86; <i>P</i> value adjusted for multiple comparisons [<i>P</i><sub>adj</sub>] < .001) and increased diagnostic efficiency overall by reducing the time spent considering AI advice (β = -0.19; <i>P</i><sub>adj</sub> = .01). While there were interaction effects of explanation type, AI confidence level, and physician task expertise on diagnostic accuracy (β = -1.05; <i>P</i><sub>adj</sub> = .04), there was no evidence that AI explanation type or AI confidence level significantly affected subjective measures (physician diagnostic confidence and perception of AI usefulness). Finally, radiologists and nonradiologists placed greater simple trust in local AI explanations than in global explanations, regardless of the correctness of the AI advice (β = 1.32; <i>P</i><sub>adj</sub> = .048). Conclusion The type of AI explanation impacted physician diagnostic performance and trust in AI, even when physicians themselves were not aware of such effects. 
© RSNA, 2024 <i>Supplemental material is available for this article</i>.</p>","PeriodicalId":20896,"journal":{"name":"Radiology","volume":"313 2","pages":"e233261"},"PeriodicalIF":12.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1148/radiol.233261","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0

Abstract

Background It is unclear whether artificial intelligence (AI) explanations help or hurt radiologists and other physicians in AI-assisted radiologic diagnostic decision-making.

Purpose To test whether the type of AI explanation and the correctness and confidence level of AI advice impact physician diagnostic performance, perception of AI advice usefulness, and trust in AI advice for chest radiograph diagnosis.

Materials and Methods A multicenter, prospective randomized study was conducted from April 2022 to September 2022. Two types of AI explanations prevalent in medical imaging, local (feature-based) explanations and global (prototype-based) explanations, were a between-participant factor, while AI correctness and confidence were within-participant factors. Radiologists (task experts) and internal or emergency medicine physicians (task nonexperts) read a chest radiograph; simulated AI advice was then presented. Generalized linear mixed-effects models were used to analyze the effects of the experimental variables on diagnostic accuracy, efficiency, physician perception of AI usefulness, and "simple trust" (ie, the speed of alignment with or divergence from AI advice); control variables included knowledge of AI, demographic characteristics, and task expertise. Holm-Sidak corrections were used to adjust for multiple comparisons.

Results Data from 220 physicians (median age, 30 years [IQR, 28-32.75 years]; 146 male participants) were analyzed. Compared with global AI explanations, local AI explanations yielded better physician diagnostic accuracy when the AI advice was correct (β = 0.86; P value adjusted for multiple comparisons [Padj] < .001) and increased diagnostic efficiency overall by reducing the time spent considering AI advice (β = -0.19; Padj = .01). Although there were interaction effects of explanation type, AI confidence level, and physician task expertise on diagnostic accuracy (β = -1.05; Padj = .04), there was no evidence that AI explanation type or AI confidence level significantly affected the subjective measures (physician diagnostic confidence and perception of AI usefulness). Finally, radiologists and nonradiologists placed greater simple trust in local AI explanations than in global explanations, regardless of the correctness of the AI advice (β = 1.32; Padj = .048).

Conclusion The type of AI explanation impacted physician diagnostic performance and trust in AI, even when physicians themselves were not aware of such effects. © RSNA, 2024. Supplemental material is available for this article.
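As a concrete illustration of the analysis described in Materials and Methods, the following is a minimal sketch, not the authors' code: it fits a mixed-effects model with a random intercept per physician and applies a Holm-Sidak correction to the fixed-effect P values. The data are synthetic and the column names (physician, local_expl, ai_correct, log_time) are assumptions for illustration. The study's binary accuracy outcome would call for a logistic generalized linear mixed model (eg, lme4::glmer in R); this sketch instead uses statsmodels' linear mixed model on a continuous log response time, analogous to the efficiency outcome.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_physicians, n_cases = 20, 8

# Synthetic long-format data: one row per physician-case reading.
df = pd.DataFrame({
    "physician": np.repeat(np.arange(n_physicians), n_cases),
    # Explanation type is between-participant: constant within a physician.
    "local_expl": np.repeat(rng.integers(0, 2, n_physicians), n_cases),
    # AI correctness is within-participant: varies across a physician's cases.
    "ai_correct": np.tile(rng.integers(0, 2, n_cases), n_physicians),
})
# Log decision time, generated so local explanations are slightly faster.
df["log_time"] = (3.0 - 0.2 * df["local_expl"] + 0.1 * df["ai_correct"]
                  + rng.normal(0.0, 0.3, len(df)))

# Mixed-effects model: fixed effects for explanation type x AI correctness,
# random intercept per physician.
fit = smf.mixedlm("log_time ~ local_expl * ai_correct", df,
                  groups=df["physician"]).fit()
print(fit.summary())

# Holm-Sidak adjustment across the fixed-effect P values, as in the study.
pvals = fit.pvalues[["local_expl", "ai_correct", "local_expl:ai_correct"]]
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm-sidak")
print(pd.DataFrame({"p_raw": pvals.values, "p_adj": p_adj, "reject": reject},
                   index=pvals.index))
```

In this toy setup, the negative coefficient on local_expl plays the role of the reported β = -0.19 efficiency effect, and multipletests returns the Holm-Sidak-adjusted P values alongside the rejection decisions, corresponding to the Padj values quoted in the Results.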

Source journal
Radiology (Medicine - Nuclear Medicine)
CiteScore: 35.20
Self-citation rate: 3.00%
Annual article count: 596
Review turnaround: 3.6 months
Journal description: Published regularly since 1923 by the Radiological Society of North America (RSNA), Radiology has long been recognized as the authoritative reference for the most current, clinically relevant, and highest-quality research in the field of radiology. Each month the journal publishes approximately 240 pages of peer-reviewed original research, authoritative reviews, well-balanced commentary on significant articles, and expert opinion on new techniques and technologies. Radiology publishes cutting-edge and impactful imaging research articles in radiology and medical imaging in order to help improve human health.