An intentional approach to managing bias in general purpose embedding models

IF 23.8; Zone 1 (Medicine); Q1 (Medical Informatics); Lancet Digital Health; Pub Date: 2024-01-24; DOI: 10.1016/S2589-7500(23)00227-3
Wei-Hung Weng MD PhD, Andrew Sellergen BS, Atilla P Kiraly PhD, Alexander D’Amour PhD, Jungyeon Park BA, Rory Pilgrim BE LLB, Stephen Pfohl PhD, Charles Lau MD, Vivek Natarajan MS, Shekoofeh Azizi PhD, Alan Karthikesalingam MD PhD, Heather Cole-Lewis PhD, Yossi Matias PhD, Greg S Corrado PhD, Dale R Webster PhD, Shravya Shetty MS, Shruthi Prabhakara PhD, Krish Eswaran PhD, Leo A G Celi MD MPH, Yun Liu PhD

Abstract

Advances in machine learning for health care have brought concerns about bias from the research community; specifically, the introduction, perpetuation, or exacerbation of care disparities. Reinforcing these concerns is the finding that medical images often reveal signals about sensitive attributes in ways that are hard to pinpoint by both algorithms and people. This finding raises a question about how to best design general purpose pretrained embeddings (GPPEs, defined as embeddings meant to support a broad array of use cases) for building downstream models that are free from particular types of bias. The downstream model should be carefully evaluated for bias, and audited and improved as appropriate. However, in our view, well intentioned attempts to prevent the upstream components—GPPEs—from learning sensitive attributes can have unintended consequences on the downstream models. Despite producing a veneer of technical neutrality, the resultant end-to-end system might still be biased or poorly performing. We present reasons, by building on previously published data, to support the reasoning that GPPEs should ideally contain as much information as the original data contain, and highlight the perils of trying to remove sensitive attributes from a GPPE. We also emphasise that downstream prediction models trained for specific tasks and settings, whether developed using GPPEs or not, should be carefully designed and evaluated to avoid bias that makes models vulnerable to issues such as distributional shift. These evaluations should be done by a diverse team, including social scientists, on a diverse cohort representing the full breadth of the patient population for which the final model is intended.

Source journal
CiteScore: 41.20
Self-citation rate: 1.60%
Articles published: 232
Review time: 13 weeks
Journal description: The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health. The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health. We publish a range of content types including Articles, Review, Comment, and Correspondence, contributing to promoting digital technologies in health practice worldwide.