Using Multilevel Models and Generalized Estimating Equation Models to Account for Clustering in Neurology Clinical Research
Peter C Austin, Moira K Kapral, Manav V Vyas, Jiming Fang, Amy Ying Xin Yu
Neurology 103(9), e209947. Published 2024-11-12 (Epub 2024-10-11). DOI: https://doi.org/10.1212/WNL.0000000000209947
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11469681/pdf/
Abstract
In clinical and health services research, clustered data (also known as data with a multilevel or hierarchical structure) are frequently encountered. For example, patients may be clustered or nested within hospitals. Understanding when data have a multilevel structure is important because clustering of individuals can induce a homogeneity in outcomes within clusters, so that, even after adjusting for measured covariates, outcomes for 2 individuals in the same cluster are more likely to be similar than outcomes for 2 individuals from different clusters. Using conventional statistical regression models to analyze clustered data can result in incorrect conclusions being drawn. In particular, estimated CIs may be artificially narrow, and significance levels may be artificially low. As a result, one may conclude that there is a statistically significant association when there is none. To avoid this problem, investigators should ensure that their analyses use techniques that account for clustering of data. Generalized linear models estimated using generalized estimating equation (GEE) methods and multilevel regression models (also known as hierarchical regression models, mixed-effects models, or random-effects models) are two such techniques. We provide an introduction to clustered or multilevel data and describe how GEE models or multilevel models can be used for the analysis of multilevel data.
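The claim that conventional analyses yield artificially narrow CIs for clustered data can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes the simplest possible setting (an intercept-only linear model, i.e. estimating an overall mean), simulates patients nested within hypothetical hospitals that each contribute a shared random effect, and compares the conventional standard error with the cluster-robust (sandwich) standard error that a GEE with an independence working correlation would report in this setting. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def simulate_clustered(n_clusters, cluster_size, sd_between, sd_within, seed=0):
    """Return a list of clusters; each cluster is a list of outcomes that
    share one random cluster-level effect (e.g. a hospital effect)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, sd_between)
        data.append([cluster_effect + rng.gauss(0.0, sd_within)
                     for _ in range(cluster_size)])
    return data


def naive_and_robust_se(clusters):
    """Standard error of the overall mean, computed two ways."""
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    mean = sum(values) / n
    # Conventional SE: treats all n observations as independent.
    sample_var = sum((y - mean) ** 2 for y in values) / (n - 1)
    naive_se = math.sqrt(sample_var / n)
    # Cluster-robust (sandwich) SE: residuals are summed within each cluster
    # before squaring, so within-cluster correlation is not ignored.
    cluster_sums = [sum(y - mean for y in cluster) for cluster in clusters]
    robust_se = math.sqrt(sum(s ** 2 for s in cluster_sums)) / n
    return mean, naive_se, robust_se


# Illustrative values only: 40 hospitals, 25 patients each, and equal
# between- and within-hospital standard deviations.
clusters = simulate_clustered(n_clusters=40, cluster_size=25,
                              sd_between=1.0, sd_within=1.0, seed=1)
mean, naive_se, robust_se = naive_and_robust_se(clusters)
print(f"mean={mean:.3f}  naive SE={naive_se:.3f}  cluster-robust SE={robust_se:.3f}")
```

With these assumed settings the cluster-robust standard error should come out several times larger than the naive one, so a CI built from the naive value would be far too narrow. In a real analysis one would normally fit a GEE or multilevel model with a statistics package rather than hand-code the variance; this sketch only shows why the adjustment matters.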
About the journal:
Neurology, the official journal of the American Academy of Neurology, aspires to be the premier peer-reviewed journal for clinical neurology research. Its mission is to publish exceptional peer-reviewed original research articles, editorials, and reviews to improve patient care, education, clinical research, and professionalism in neurology.
As the leading clinical neurology journal worldwide, Neurology targets physicians specializing in nervous system diseases and conditions. It aims to advance the field by presenting new basic and clinical research that influences neurological practice. The journal is a leading source of cutting-edge, peer-reviewed information for the neurology community worldwide. Editorial content includes Research, Clinical/Scientific Notes, Views, Historical Neurology, NeuroImages, Humanities, Letters, and position papers from the American Academy of Neurology. The online version is considered the definitive version, encompassing all available content.
Neurology is indexed in prestigious databases such as MEDLINE/PubMed, Embase, Scopus, Biological Abstracts®, PsycINFO®, Current Contents®, Web of Science®, CrossRef, and Google Scholar.