Reweighting UK Biobank corrects for pervasive selection bias due to volunteering.

International Journal of Epidemiology (IF 6.4; Zone 2, Medicine; JCR Q1, Public, Environmental & Occupational Health). Pub Date: 2024-04-11. DOI: 10.1093/ije/dyae054
Sjoerd van Alten, Benjamin W Domingue, Jessica Faul, Titus Galama, Andries T Marees
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11076923/pdf/

Abstract

Background: Biobanks typically rely on volunteer-based sampling. This results in large samples (power) at the cost of representativeness (bias). The problem of volunteer bias is debated. Here, we (i) show that volunteering biases associations in UK Biobank (UKB) and (ii) estimate inverse probability (IP) weights that correct for volunteer bias in UKB.

Methods: Drawing on UK Census data, we constructed a subsample representative of UKB's target population, which consists of all individuals invited to participate. Based on demographic variables shared between the UK Census and UKB, we estimated IP weights (IPWs) for each UKB participant. We compared 21 weighted and unweighted bivariate associations between these demographic variables to assess volunteer bias.
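
To make the weighting step concrete, here is a minimal sketch of cell-based inverse probability weights. It is an illustration only: the abstract does not state which model the authors used to estimate their IPWs, and the function name `cell_ip_weights`, the age and education cells, and all counts below are hypothetical toy values, not UKB or UK Census data. Each volunteer's weight is the share of their demographic cell in the reference sample divided by the share of that cell among volunteers.

```python
from collections import Counter


def cell_ip_weights(reference_cells, volunteer_cells):
    """Return one weight per volunteer: reference share / volunteer share of the cell.

    Hypothetical helper for illustration; not the authors' estimator.
    """
    ref_counts = Counter(reference_cells)
    vol_counts = Counter(volunteer_cells)
    n_ref = len(reference_cells)
    n_vol = len(volunteer_cells)
    weights = []
    for cell in volunteer_cells:
        # Written as a single ratio of integer products to avoid rounding noise.
        # vol_counts[cell] >= 1 here because the cell comes from the volunteers.
        weights.append((ref_counts[cell] * n_vol) / (vol_counts[cell] * n_ref))
    return weights


# Toy data (assumed): cells are (age band, education level).
reference = (
    [("40-54", "low")] * 30 + [("40-54", "high")] * 20
    + [("55-69", "low")] * 30 + [("55-69", "high")] * 20
)
volunteers = (
    [("40-54", "low")] * 10 + [("40-54", "high")] * 40
    + [("55-69", "low")] * 10 + [("55-69", "high")] * 40
)

weights = cell_ip_weights(reference, volunteers)

# Weighted share of the "low" education group among volunteers.
weighted_low_share = (
    sum(w for w, cell in zip(weights, volunteers) if cell[1] == "low") / sum(weights)
)
print(weights[0], weights[10], weighted_low_share)
```

In this toy setup the under-represented cells are weighted up and the over-represented cells are weighted down, so the weighted volunteer sample matches the reference sample's cell shares. With many or continuous covariates, a regression model of participation would typically replace the cell counts.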

Results: Volunteer bias in all associations, as naively estimated in UKB, was substantial, in some cases so severe that unweighted estimates had the opposite sign of the association in the target population. For example, older individuals in UKB reported being in better health, in contrast to evidence from the UK Census. Using IPWs in weighted regressions reduced volunteer bias by 87% on average. Volunteer-based sampling reduced the effective sample size of UKB substantially, to 32% of its original size.
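
The idea that unequal weights shrink the effective sample size can be sketched with Kish's approximation, n_eff = (sum of weights)^2 / (sum of squared weights). This is an assumption for illustration: the abstract does not say which definition of effective sample size the authors used, and the weights below are hypothetical toy values that are not meant to reproduce the 32% figure.

```python
def kish_effective_sample_size(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("weights must contain at least one non-zero value")
    return total * total / total_sq


# Toy weights (assumed): 20 participants weighted up, 80 weighted down.
toy_weights = [3.0] * 20 + [0.5] * 80

n_eff = kish_effective_sample_size(toy_weights)
ratio = n_eff / len(toy_weights)
print(n_eff, ratio)
```

With equal weights the formula returns the nominal sample size; the more the weights vary, the smaller the effective sample size becomes relative to the number of participants.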

Conclusions: Estimates from large-scale biobanks may be misleading due to volunteer bias. We recommend IP weighting to correct for such bias. To aid in the construction of the next generation of biobanks, we provide suggestions on how to best ensure representativeness in a volunteer-based design. For UKB, IPWs have been made available.

Source journal
International Journal of Epidemiology (Medicine: Public, Environmental & Occupational Health)
CiteScore: 13.60
Self-citation rate: 2.60%
Articles published: 226
Review time: 3 months
About the journal: The International Journal of Epidemiology covers advances and emerging trends in epidemiology worldwide. It fosters communication among researchers, educators, and practitioners involved in the study, teaching, and application of epidemiology for both communicable and non-communicable diseases, and it includes research on health services and medical care. The journal also presents new methodologies in epidemiology and statistics for professionals working in social and preventive medicine. It is published six times a year.