Gaps in the usage and reporting of multiple imputation for incomplete data: findings from a scoping review of observational studies addressing causal questions.

BMC Medical Research Methodology (IF 3.9; Zone 3, Medicine; JCR Q1, Health Care Sciences & Services) Pub Date: 2024-09-04 DOI: 10.1186/s12874-024-02302-6
Rheanna M Mainzer, Margarita Moreno-Betancur, Cattram D Nguyen, Julie A Simpson, John B Carlin, Katherine J Lee

Abstract

Background: Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions ("missing completely at random", "missing at random" [MAR], "missing not at random") are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation.

Methods: We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically.

Results: Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis.
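
The proportions reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the counts and percentages are copied from the Results paragraph, while the dictionary labels and the helper name `percent` are illustrative assumptions, as is the choice to round to the nearest whole percent.

```python
# Counts copied from the Results paragraph: (numerator, denominator, reported %).
# The labels are paraphrases added for illustration only.
reported = {
    "analysis sample derived by exclusion": (108, 130, 83),
    "multivariable missingness in analysis sample": (114, 130, 88),
    "stated missing data assumptions": (44, 130, 34),
    "justified the stated assumptions": (11, 44, 25),
    "conducted a secondary analysis": (69, 130, 53),
    "secondary analysis lacked clear justification": (68, 69, 99),
}


def percent(numerator, denominator):
    """Return the proportion as a whole-number percentage (assumed rounding rule)."""
    return round(100 * numerator / denominator)


# Collect any item whose recomputed percentage differs from the reported one.
mismatches = {
    label: (percent(n, d), p)
    for label, (n, d, p) in reported.items()
    if percent(n, d) != p
}

for label, (n, d, p) in reported.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}% (reported {p}%)")
```

Under this rounding assumption, an empty `mismatches` dictionary means every reported percentage agrees with its underlying count. Note that the 25% and 99% figures use the subgroup denominators (44 and 69), not the full 130 studies.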

Conclusion: Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.
