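Multiple imputation of missing data under missing at random: including a collider as an auxiliary variable in the imputation model can induce bias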

E. Curnow, K. Tilling, J. Heron, R. Cornish, J. Carpenter
DOI: 10.1101/2023.06.16.23291497 (https://doi.org/10.1101/2023.06.16.23291497)
Journal: Frontiers in epidemiology
Publication date: 2023-06-18
Publication type: Journal Article
Citations: 1

Abstract

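Epidemiological studies often have missing data, which are commonly handled by multiple imputation (MI). In MI, in addition to the variables required for the substantive analysis, imputation models often include other variables ("auxiliary variables"). Auxiliary variables that predict the partially observed variables can reduce the standard error (SE) of the MI estimator and, if they also predict the probability that data are missing, reduce bias due to data being missing not at random. However, guidance for choosing auxiliary variables is lacking. We examine the consequences of a poorly chosen auxiliary variable: if it shares a common cause with the partially observed variable and with the probability that this variable is missing (i.e. it is a "collider"), its inclusion can induce bias in the MI estimator and may increase SE. We quantify, both algebraically and by simulation, the magnitude of bias and SE when either the exposure or the outcome is incomplete. When the substantive analysis outcome is partially observed, the bias can be substantial relative to the magnitude of the exposure coefficient. In settings in which complete records analysis is valid, the bias is smaller when the exposure is partially observed. However, bias can be larger if the outcome also causes missingness in the exposure. When using MI, it is important to examine, through a combination of data exploration and consideration of plausible causal diagrams and missingness mechanisms, whether potential auxiliary variables are colliders.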
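
The collider mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation: the variable names (`u`, `w`, `y`, `z`), the unit-variance normal errors, and the sample size are all assumptions chosen for illustration. Here `u` stands for a common cause of the partially observed variable `y` and the auxiliary variable `z`, and `w` stands for a common cause of `z` and of missingness. The sketch shows only the first step of the argument: `y` and `w` are unrelated marginally, but become associated once `z` is adjusted for, which is what an imputation model that includes `z` does.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def collider_demo(n=100_000, seed=0):
    """Return the slope of y on w, unadjusted and adjusted for z.

    Hypothetical data-generating model (an assumption, not from the paper):
      u -> y, u -> z, w -> z, with u and w independent.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]  # common cause of y and z
    w = [rng.gauss(0, 1) for _ in range(n)]  # common cause of z and missingness
    y = [ui + rng.gauss(0, 1) for ui in u]   # partially observed variable
    z = [ui + wi + rng.gauss(0, 1) for ui, wi in zip(u, w)]  # collider

    # Unadjusted slope of y on w: near zero, because y and w are independent.
    marginal = cov(y, w) / cov(w, w)

    # Slope of y on w adjusting for z (closed-form two-regressor OLS).
    sww, szz, swz = cov(w, w), cov(z, z), cov(w, z)
    syw, syz = cov(y, w), cov(y, z)
    adjusted = (syw * szz - syz * swz) / (sww * szz - swz ** 2)
    return marginal, adjusted


marginal, adjusted = collider_demo()
print(f"slope of y on w, unadjusted:     {marginal:.3f}")
print(f"slope of y on w, adjusted for z: {adjusted:.3f}")
```

Under these assumed unit variances the population value of the adjusted slope is -0.5, while the unadjusted slope is 0. Because `w` also drives missingness in this hypothetical setup, conditioning on `z` links `y` to its own missingness, so an imputation model fitted to the complete records need not hold for the incomplete ones.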