Bias in medical AI: Implications for clinical decision-making.

PLOS Digital Health. Pub Date: 2024-11-07. eCollection Date: 2024-11-01. DOI: 10.1371/journal.pdig.0000651
James L Cross, Michael A Choma, John A Onofrey
{"title":"Bias in medical AI: Implications for clinical decision-making.","authors":"James L Cross, Michael A Choma, John A Onofrey","doi":"10.1371/journal.pdig.0000651","DOIUrl":null,"url":null,"abstract":"<p><p>Biases in medical artificial intelligence (AI) arise and compound throughout the AI lifecycle. These biases can have significant clinical consequences, especially in applications that involve clinical decision-making. Left unaddressed, biased medical AI can lead to substandard clinical decisions and the perpetuation and exacerbation of longstanding healthcare disparities. We discuss potential biases that can arise at different stages in the AI development pipeline and how they can affect AI algorithms and clinical decision-making. Bias can occur in data features and labels, model development and evaluation, deployment, and publication. Insufficient sample sizes for certain patient groups can result in suboptimal performance, algorithm underestimation, and clinically unmeaningful predictions. Missing patient findings can also produce biased model behavior, including capturable but nonrandomly missing data, such as diagnosis codes, and data that is not usually or not easily captured, such as social determinants of health. Expertly annotated labels used to train supervised learning models may reflect implicit cognitive biases or substandard care practices. Overreliance on performance metrics during model development may obscure bias and diminish a model's clinical utility. When applied to data outside the training cohort, model performance can deteriorate from previous validation and can do so differentially across subgroups. How end users interact with deployed solutions can introduce bias. Finally, where models are developed and published, and by whom, impacts the trajectories and priorities of future medical AI development. Solutions to mitigate bias must be implemented with care, which include the collection of large and diverse data sets, statistical debiasing methods, thorough model evaluation, emphasis on model interpretability, and standardized bias reporting and transparency requirements. Prior to real-world implementation in clinical settings, rigorous validation through clinical trials is critical to demonstrate unbiased application. Addressing biases across model development stages is crucial for ensuring all patients benefit equitably from the future of medical AI.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 11","pages":"e0000651"},"PeriodicalIF":0.0000,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11542778/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLOS digital health","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1371/journal.pdig.0000651","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/11/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Biases in medical artificial intelligence (AI) arise and compound throughout the AI lifecycle. These biases can have significant clinical consequences, especially in applications that involve clinical decision-making. Left unaddressed, biased medical AI can lead to substandard clinical decisions and the perpetuation and exacerbation of longstanding healthcare disparities. We discuss potential biases that can arise at different stages in the AI development pipeline and how they can affect AI algorithms and clinical decision-making. Bias can occur in data features and labels, model development and evaluation, deployment, and publication. Insufficient sample sizes for certain patient groups can result in suboptimal performance, algorithm underestimation, and clinically unmeaningful predictions. Missing patient findings can also produce biased model behavior, including capturable but nonrandomly missing data, such as diagnosis codes, and data that is not usually or not easily captured, such as social determinants of health. Expertly annotated labels used to train supervised learning models may reflect implicit cognitive biases or substandard care practices. Overreliance on performance metrics during model development may obscure bias and diminish a model's clinical utility. When applied to data outside the training cohort, model performance can deteriorate from previous validation and can do so differentially across subgroups. How end users interact with deployed solutions can introduce bias. Finally, where models are developed and published, and by whom, impacts the trajectories and priorities of future medical AI development. Solutions to mitigate bias must be implemented with care; these include the collection of large and diverse data sets, statistical debiasing methods, thorough model evaluation, emphasis on model interpretability, and standardized bias reporting and transparency requirements. Prior to real-world implementation in clinical settings, rigorous validation through clinical trials is critical to demonstrate unbiased application. Addressing biases across model development stages is crucial for ensuring all patients benefit equitably from the future of medical AI.
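The abstract's point that overall performance metrics can obscure subgroup disparities lends itself to a short illustration. The Python sketch below is not from the paper; it uses synthetic data and an assumed scikit-learn workflow to fit a single model on a cohort in which one group is underrepresented, then reports AUROC overall and per group. The overall number can look acceptable while the minority group's performance is markedly worse.

```python
# Minimal sketch (illustrative assumptions only): subgroup-stratified evaluation
# can reveal performance gaps that an aggregate metric hides.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulate a cohort where group "B" is underrepresented (10% of patients)
# and its outcome depends on a feature the pooled model under-weights.
n = 5000
group = rng.choice(["A", "B"], size=n, p=[0.9, 0.1])
x1 = rng.normal(size=n)  # informative for group A
x2 = rng.normal(size=n)  # informative for group B
logit = np.where(group == "A", 2.0 * x1, 2.0 * x2)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X = np.column_stack([x1, x2])
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0, stratify=group
)

# A single pooled model, evaluated overall and within each subgroup.
model = LogisticRegression().fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

print(f"overall AUROC: {roc_auc_score(y_te, scores):.3f}")
for g in ["A", "B"]:
    mask = g_te == g
    print(f"group {g} AUROC (n={mask.sum()}): "
          f"{roc_auc_score(y_te[mask], scores[mask]):.3f}")
```

The same stratified reporting can be extended to calibration, sensitivity at a fixed decision threshold, or other clinically relevant metrics before a model is considered for deployment.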
