Multiple Imputation for Longitudinal Data: A Tutorial.

Statistics in Medicine. Pub Date: 2025-02-10. DOI: 10.1002/sim.10274. IF 1.8; Zone 4 (Medicine); JCR Q3 (Mathematical & Computational Biology).
Rushani Wijesuriya, Margarita Moreno-Betancur, John B Carlin, Ian R White, Matteo Quartagno, Katherine J Lee
Volume 44, Issue 3-4, e10274. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11755704/pdf/
Citations: 0

Abstract

Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thus an analytic approach that accounts for this clustering by individual is required. While almost all research suffers from missing data, this can be particularly problematic in longitudinal studies as participation often becomes harder to maintain over time. Multiple imputation (MI) is widely used to handle missing data in such studies. When using MI, it is important that the imputation model is compatible with the proposed analysis model. In a longitudinal analysis, this implies that the clustering considered in the analysis model should be reflected in the imputation process. Several MI approaches have been proposed to impute incomplete longitudinal data, such as treating repeated measurements of the same variable as distinct variables or using generalized linear mixed imputation models. However, the uptake of these methods has been limited, as they require additional data manipulation and use of advanced imputation procedures. In this tutorial, we review the available MI approaches that can be used for handling incomplete longitudinal data, including where individuals are clustered within higher-level clusters. We illustrate implementation with replicable R and Stata code using a case study from the Childhood to Adolescence Transition Study.
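The "additional data manipulation" mentioned above can be illustrated for the first approach, in which repeated measurements of one variable are treated as distinct variables. The sketch below is only an illustration under stated assumptions, not the tutorial's own R or Stata code: the function `long_to_wide`, the column names `id`, `wave`, and `score`, and the toy records are all hypothetical. It pivots long-format records (one row per individual per wave) into wide format (one row per individual, one column per wave), so that a standard single-level imputation routine could then treat each wave's column as a separate variable.

```python
from collections import defaultdict


def long_to_wide(rows, id_key, wave_key, value_key):
    """Pivot long-format records into wide format.

    rows: list of dicts, one per individual per wave (hypothetical layout).
    Returns one dict per individual with a column per wave; a wave with no
    record, or with a recorded None, appears as None (missing).
    """
    waves = sorted({r[wave_key] for r in rows})
    by_id = defaultdict(dict)
    for r in rows:
        by_id[r[id_key]][r[wave_key]] = r[value_key]

    wide = []
    for pid in sorted(by_id):
        record = {id_key: pid}
        for w in waves:
            # Absent waves become explicit missing values in the wide row.
            record[f"{value_key}_w{w}"] = by_id[pid].get(w)
        wide.append(record)
    return wide


# Hypothetical toy data: individual 1 has a missing value at wave 2,
# individual 2 has no wave-2 record at all (e.g. skipped that wave).
long_rows = [
    {"id": 1, "wave": 1, "score": 10.0},
    {"id": 1, "wave": 2, "score": None},
    {"id": 1, "wave": 3, "score": 12.5},
    {"id": 2, "wave": 1, "score": 8.0},
    {"id": 2, "wave": 3, "score": 9.5},
]

wide_rows = long_to_wide(long_rows, "id", "wave", "score")
for row in wide_rows:
    print(row)
```

After imputing the wide columns, the data would typically be reshaped back to long format before fitting the longitudinal analysis model. The alternative approach named in the abstract, mixed-model-based imputation, works on the long format directly and is not shown here.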

Source journal
Statistics in Medicine (Medicine: Public, Environmental & Occupational Health)
CiteScore: 3.40
Self-citation rate: 10.00%
Articles published: 334
Review time: 2-4 weeks
Journal description: The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalizations of established methodology are directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.