Biostatistics: Latest Articles

Correction to: A transformation perspective on marginal and conditional models.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 597 DOI: 10.1093/biostatistics/kxad017 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017110/pdf/
Citations: 0
Multi-trait analysis of gene-by-environment interactions in large-scale genetic studies.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 504-520 DOI: 10.1093/biostatistics/kxad004
Lan Luo, Devan V Mehrotra, Judong Shen, Zheng-Zheng Tang

Identifying genotype-by-environment interaction (GEI) is challenging because the GEI analysis generally has low power. Large-scale consortium-based studies are ultimately needed to achieve adequate power for identifying GEI. We introduce Multi-Trait Analysis of Gene-Environment Interactions (MTAGEI), a powerful, robust, and computationally efficient framework to test gene-environment interactions on multiple traits in large data sets, such as the UK Biobank (UKB). To facilitate the meta-analysis of GEI studies in a consortium, MTAGEI efficiently generates summary statistics of genetic associations for multiple traits under different environmental conditions and integrates the summary statistics for GEI analysis. MTAGEI enhances the power of GEI analysis by aggregating GEI signals across multiple traits and variants that would otherwise be difficult to detect individually. MTAGEI achieves robustness by combining complementary tests under a wide spectrum of genetic architectures. We demonstrate the advantages of MTAGEI over existing single-trait-based GEI tests through extensive simulation studies and the analysis of the whole exome sequencing data from the UKB.

Citations: 0
Modeling biomarker variability in joint analysis of longitudinal and time-to-event data.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 577-596 DOI: 10.1093/biostatistics/kxad009 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017116/pdf/
Chunyu Wang, Jiaming Shen, Christiana Charalambous, Jianxin Pan

The role of visit-to-visit variability of a biomarker in predicting related disease has been recognized in medical science. Existing measures of biological variability are criticized for being entangled with random variability resulting from measurement error or for being unreliable due to limited measurements per individual. In this article, we propose a new measure to quantify the biological variability of a biomarker by evaluating the fluctuation of each individual-specific trajectory behind longitudinal measurements. Given a mixed-effects model for longitudinal data with the mean function over time specified by cubic splines, our proposed variability measure can be mathematically expressed as a quadratic form of random effects. A Cox model is assumed for time-to-event data by incorporating the defined variability as well as the current level of the underlying longitudinal trajectory as covariates, which, together with the longitudinal model, constitutes the joint modeling framework in this article. Asymptotic properties of maximum likelihood estimators are established for the present joint model. Estimation is implemented via an Expectation-Maximization (EM) algorithm, with a fully exponential Laplace approximation used in the E-step to reduce the computational burden due to the increase of the random effects dimension. Simulation studies are conducted to reveal the advantage of the proposed method over the two-stage method, as well as a simpler joint modeling approach which does not take into account biomarker variability. Finally, we apply our model to investigate the effect of systolic blood pressure variability on cardiovascular events in the Medical Research Council elderly trial, which is also the motivating example for this article.
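
The abstract does not state the definition of the variability measure, so the following is only a sketch of how a quadratic form in the random effects can arise. The notation is assumed: $B(t)$ is a cubic spline basis, $\beta$ the fixed effects, $b_i$ the random effects of individual $i$, and $[0, \tau]$ a follow-up window. Measuring the curvature of the individual-specific deviation is an illustrative choice and not necessarily the authors' measure:

```latex
\begin{align*}
m_i(t) &= B(t)^\top (\beta + b_i), \\
\int_0^\tau \left\{ B''(t)^\top b_i \right\}^2 dt
  &= b_i^\top \left\{ \int_0^\tau B''(t)\, B''(t)^\top \, dt \right\} b_i
   = b_i^\top \Omega\, b_i .
\end{align*}
```

Under these assumptions $\Omega$ depends only on the spline basis, so it can be computed once and the measure for each individual reduces to a quadratic form in $b_i$.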

Citations: 0
Identifying covariate-related subnetworks for whole-brain connectome analysis.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 541-558 DOI: 10.1093/biostatistics/kxad007 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017127/pdf/
Shuo Chen, Yuan Zhang, Qiong Wu, Chuan Bi, Peter Kochunov, L Elliot Hong

Whole-brain connectome data characterize the connections among distributed neural populations as a set of edges in a large network, and neuroscience research aims to systematically investigate associations between the brain connectome and clinical or experimental conditions as covariates. A covariate is often related to a number of edges connecting multiple brain areas in an organized structure. However, in practice, neither the covariate-related edges nor the structure is known. Therefore, the understanding of underlying neural mechanisms relies on statistical methods that are capable of simultaneously identifying covariate-related connections and recognizing their network topological structures. The task can be challenging because of false-positive noise and the almost infinite possibilities of edges combining into subnetworks. To address these challenges, we propose a new statistical approach to handle multivariate edge variables as outcomes and output covariate-related subnetworks. We first study the graph properties of covariate-related subnetworks from a graph and combinatorics perspective and accordingly bridge the inference for individual connectome edges and covariate-related subnetworks. Next, we develop efficient algorithms to extract covariate-related subnetworks from the whole-brain connectome data with an $\ell_0$ norm penalty. We validate the proposed methods based on an extensive simulation study, and we benchmark our performance against existing methods. Using our proposed method, we analyze two separate resting-state functional magnetic resonance imaging data sets for schizophrenia research and obtain highly replicable disease-related subnetworks.

Citations: 0
A transformation perspective on marginal and conditional models.
IF 1.8 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 402-428 DOI: 10.1093/biostatistics/kxac048 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11212492/pdf/
Luisa Barbanti, Torsten Hothorn

Clustered observations are ubiquitous in controlled and observational studies and arise naturally in multicenter trials or longitudinal surveys. We present a novel model for the analysis of clustered observations where the marginal distributions are described by a linear transformation model and the correlations by a joint multivariate normal distribution. The joint model provides an analytic formula for the marginal distribution. Owing to the richness of transformation models, the techniques are applicable to any type of response variable, including bounded, skewed, binary, ordinal, or survival responses. We demonstrate how the common normal assumption for reaction times can be relaxed in the sleep deprivation benchmark data set and report marginal odds ratios for the notoriously difficult toenail data. We furthermore discuss the analysis of two clinical trials aiming at the estimation of marginal treatment effects. In the first trial, pain was repeatedly assessed on a bounded visual analog scale and marginal proportional-odds models are presented. The second trial reported disease-free survival in rectal cancer patients, where the marginal hazard ratio from Weibull and Cox models is of special interest. An empirical evaluation compares the performance of the novel approach to generalized estimating equations for binary responses and to conditional mixed-effects models for continuous responses. An implementation is available in the tram add-on package to the R system and was benchmarked against established models in the literature.

Citations: 0
Multiple imputation of more than one environmental exposure with nondifferential measurement error.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 306-322 DOI: 10.1093/biostatistics/kxad011 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017114/pdf/
Yuanzhi Yu, Roderick J Little, Matthew Perzanowski, Qixuan Chen

Measurement error is common in environmental epidemiologic studies, but methods for correcting measurement error in regression models with multiple environmental exposures as covariates have not been well investigated. We consider a multiple imputation approach, combining external or internal calibration samples that contain information on both true and error-prone exposures with the main study data of multiple exposures measured with error. We propose a constrained chained equations multiple imputation (CEMI) algorithm that places constraints on the imputation model parameters in the chained equations imputation based on the assumptions of strong nondifferential measurement error. We also extend the constrained CEMI method to accommodate nondetects in the error-prone exposures in the main study data. We estimate the variance of the regression coefficients using the bootstrap with two imputations of each bootstrapped sample. The constrained CEMI method is shown by simulations to outperform existing methods, namely the method that ignores measurement error, classical calibration, and regression prediction, yielding estimated regression coefficients with smaller bias and confidence intervals with coverage close to the nominal level. We apply the proposed method to the Neighborhood Asthma and Allergy Study to investigate the associations between the concentrations of multiple indoor allergens and the fractional exhaled nitric oxide level among asthmatic children in New York City. The constrained CEMI method can be implemented by imposing constraints on the imputation matrix using the mice and bootImpute packages in R.

Citations: 0
Characterizing quantile-varying covariate effects under the accelerated failure time model.
IF 1.8 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 449-467 DOI: 10.1093/biostatistics/kxac052 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11484523/pdf/
Harrison T Reeder, Kyu Ha Lee, Sebastien Haneuse

An important task in survival analysis is choosing a structure for the relationship between covariates of interest and the time-to-event outcome. For example, the accelerated failure time (AFT) model structures each covariate effect as a constant multiplicative shift in the outcome distribution across all survival quantiles. Though parsimonious, this structure cannot detect or capture effects that differ across quantiles of the distribution, a limitation that is analogous to only permitting proportional hazards in the Cox model. To address this, we propose a general framework for quantile-varying multiplicative effects under the AFT model. Specifically, we embed flexible regression structures within the AFT model and derive a novel formula for interpretable effects on the quantile scale. A regression standardization scheme based on the g-formula is proposed to enable the estimation of both covariate-conditional and marginal effects for an exposure of interest. We implement a user-friendly Bayesian approach for the estimation and quantification of uncertainty while accounting for left truncation and complex censoring. We emphasize the intuitive interpretation of this model through numerical and graphical tools and illustrate its performance through simulation and application to a study of Alzheimer's disease and dementia.
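
The constant multiplicative shift described above can be checked numerically. The sketch below is not the authors' method. It assumes a log-normal AFT model, $\log T = \mu + \beta x + \sigma\varepsilon$ with standard normal $\varepsilon$, and hypothetical values for `beta`, `mu`, and `sigma`. It computes the ratio of survival-time quantiles between $x = 1$ and $x = 0$ at several quantile levels.

```python
import math
from statistics import NormalDist


def aft_quantile(p, x, beta, mu=0.0, sigma=1.0):
    """p-th quantile of T when log T = mu + beta * x + sigma * eps, eps ~ N(0, 1)."""
    z = NormalDist().inv_cdf(p)  # p-th quantile of the standard normal error
    return math.exp(mu + beta * x + sigma * z)


def quantile_ratios(beta, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Ratio of quantiles for x = 1 versus x = 0 at each probability level."""
    return [aft_quantile(p, 1, beta) / aft_quantile(p, 0, beta) for p in probs]


# Hypothetical coefficient chosen only for illustration.
ratios = quantile_ratios(beta=0.5)
```

Under these assumptions every entry of `ratios` equals `exp(beta)` up to rounding, which is the restriction that a quantile-varying framework relaxes.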

Citations: 0
Cohort-based smoothing methods for age-specific contact rates.
IF 2.1 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 Pages: 521-540 DOI: 10.1093/biostatistics/kxad005
Yannick Vandendijck, Oswaldo Gressani, Christel Faes, Carlo G Camarda, Niel Hens

The use of social contact rates is widespread in infectious disease modeling since it has been shown that they are key driving forces of important epidemiological parameters. Quantification of contact patterns is crucial to parameterize dynamic transmission models and to provide insights on the (basic) reproduction number. Information on social interactions can be obtained from population-based contact surveys, such as the European Commission project POLYMOD. Estimation of age-specific contact rates from these studies is often done using a piecewise constant approach or bivariate smoothing techniques. For the latter, typically, smoothness is introduced in the dimensions of the respondent's and contact's age (i.e., the rows and columns of the social contact matrix). We propose a smoothing constrained approach that takes into account the reciprocal nature of contacts and introduces smoothness over the diagonal (including all subdiagonals) of the social contact matrix. This modeling approach is justified assuming that when people age their contact behavior changes smoothly. We call this smoothing from a cohort perspective. Two approaches that allow for smoothing over social contact matrix diagonals are proposed, namely (i) reordering of the diagonal components of the contact matrix and (ii) reordering of the penalty matrix ensuring smoothness over the contact matrix diagonals. Parameter estimation is done in the likelihood framework by using constrained penalized iterative reweighted least squares. A simulation study underlines the benefits of cohort-based smoothing. Finally, the proposed methods are illustrated on the Belgian POLYMOD data of 2006. Code to reproduce the results of the article can be downloaded from this GitHub repository: https://github.com/oswaldogressani/Cohort_smoothing.
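
To make the cohort perspective concrete, the sketch below measures roughness along the diagonals of a square contact matrix, where moving one step down a diagonal corresponds to respondent and contact both ageing by one year. It is an illustration only, not the code from the repository: the second-order difference penalty, the function names, and the toy matrices are all assumptions.

```python
def diagonal(matrix, offset):
    """Entries matrix[i][i + offset] of a square matrix, in order of increasing i."""
    n = len(matrix)
    return [matrix[i][i + offset] for i in range(n) if 0 <= i + offset < n]


def second_differences(values):
    """Second-order differences v[t+2] - 2*v[t+1] + v[t] (empty if fewer than 3 values)."""
    return [values[t + 2] - 2 * values[t + 1] + values[t]
            for t in range(len(values) - 2)]


def cohort_roughness(matrix):
    """Sum of squared second differences taken along every diagonal (assumed penalty)."""
    n = len(matrix)
    total = 0.0
    for offset in range(-(n - 1), n):
        total += sum(d * d for d in second_differences(diagonal(matrix, offset)))
    return total


n = 6
jumpy = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]  # arbitrary toy values

# Symmetric toy matrix: the level jumps between diagonals (age gaps),
# but changes linearly along each diagonal as both ages increase.
smooth_by_cohort = [[jumpy[abs(j - i)] + 0.1 * (i + j) for j in range(n)]
                    for i in range(n)]

# Symmetric toy matrix with erratic values along its diagonals.
rough_by_cohort = [[jumpy[(i * j) % n] for j in range(n)] for i in range(n)]
```

Here `cohort_roughness(smooth_by_cohort)` is zero up to rounding even though the matrix is far from smooth across rows and columns, while `cohort_roughness(rough_by_cohort)` is large. A penalty of this kind therefore targets a different notion of smoothness than row-and-column smoothing does.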

Citations: 0
Systematically missing data in causally interpretable meta-analysis.
IF 2.1 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 DOI: 10.1093/biostatistics/kxad006
Jon A Steingrimsson, David H Barker, Ruofan Bie, Issa J Dahabreh

Causally interpretable meta-analysis combines information from a collection of randomized controlled trials to estimate treatment effects in a target population in which experimentation may not be possible but from which covariate information can be obtained. In such analyses, a key practical challenge is the presence of systematically missing data when some trials have collected data on one or more baseline covariates, but other trials have not, such that the covariate information is missing for all participants in the latter. In this article, we provide identification results for potential (counterfactual) outcome means and average treatment effects in the target population when covariate data are systematically missing from some of the trials in the meta-analysis. We propose three estimators for the average treatment effect in the target population, examine their asymptotic properties, and show that they have good finite-sample performance in simulation studies. We use the estimators to analyze data from two large lung cancer screening trials and target population data from the National Health and Nutrition Examination Survey (NHANES). To accommodate the complex survey design of the NHANES, we modify the methods to incorporate survey sampling weights and allow for clustering.
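To make the general idea of transporting trial results to a target population concrete, here is a minimal outcome-model sketch. It is not one of the three estimators proposed in the article and it does not handle systematically missing covariates. It assumes fully observed covariates, trials pooled into a single data set, and a linear outcome model per treatment arm. The function name `transport_ate` and all argument names are hypothetical, and the optional `weights` argument stands in for survey sampling weights in the target sample.

```python
import numpy as np

def transport_ate(X_trial, A_trial, Y_trial, X_target, weights=None):
    """Outcome-model estimate of the average treatment effect in a target
    population: fit a linear outcome regression in each trial arm, predict
    both potential outcomes for every target individual, then average the
    difference (optionally with survey weights)."""
    target_design = np.column_stack([np.ones(len(X_target)), X_target])

    def predict_arm(arm):
        mask = A_trial == arm
        design = np.column_stack([np.ones(mask.sum()), X_trial[mask]])
        beta, *_ = np.linalg.lstsq(design, Y_trial[mask], rcond=None)
        return target_design @ beta

    diff = predict_arm(1) - predict_arm(0)
    return float(np.average(diff, weights=weights))

# Usage on noise-free synthetic data where the effect depends on X,
# so the target-population effect differs from the trial-population effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
A = np.arange(200) % 2
Y = 1 + 2 * X[:, 0] + A * (1 + X[:, 0])
X_target = rng.normal(loc=3.0, size=(500, 1))
print(transport_ate(X, A, Y, X_target))
```

Because the treatment effect in this toy example is 1 + X, the estimate tracks the covariate distribution of the target sample rather than that of the trials, which is the sense in which the analysis is interpretable for the target population.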

Biostatistics, pp. 289-305. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017122/pdf/
Citations: 0
A Bayesian approach to estimating COVID-19 incidence and infection fatality rates.
IF 2.1 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-04-15 DOI: 10.1093/biostatistics/kxad003
Justin J Slater, Aiyush Bansal, Harlan Campbell, Jeffrey S Rosenthal, Paul Gustafson, Patrick E Brown

Naive estimates of incidence and infection fatality rates (IFR) of coronavirus disease 2019 suffer from a variety of biases, many of which relate to preferential testing. This has motivated epidemiologists from around the globe to conduct serosurveys that measure the immunity of individuals by testing for the presence of SARS-CoV-2 antibodies in the blood. These quantitative measures (titer values) are then used as a proxy for previous or current infection. However, statistical methods that use these data to their full potential have yet to be developed. Previous researchers have discretized these continuous values, discarding potentially useful information. In this article, we demonstrate how multivariate mixture models can be used in combination with post-stratification to estimate cumulative incidence and IFR in an approximate Bayesian framework without discretization. In doing so, we account for uncertainty from both the estimated number of infections and incomplete deaths data to provide estimates of IFR. This method is demonstrated using data from the Action to Beat Coronavirus serosurvey in Canada.
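The post-stratification step mentioned in the abstract can be sketched with point estimates alone. The example below is only an illustration under strong simplifications: it takes stratum-level incidence estimates as given (in the article these come from mixture models within an approximate Bayesian framework), ignores all uncertainty, and uses made-up strata and counts. The function names `poststratified_incidence` and `infection_fatality_rate` are hypothetical.

```python
def poststratified_incidence(stratum_incidence, population_counts):
    """Population-level cumulative incidence as the population-weighted
    average of stratum-level incidence estimates."""
    total = sum(population_counts.values())
    infected = sum(stratum_incidence[s] * n for s, n in population_counts.items())
    return infected / total

def infection_fatality_rate(deaths, incidence, population):
    """IFR as deaths divided by the estimated number of infections."""
    return deaths / (incidence * population)

# Usage with illustrative (not real) numbers.
incidence_by_age = {"18-39": 0.10, "40-64": 0.05, "65+": 0.02}
population_by_age = {"18-39": 400, "40-64": 400, "65+": 200}
incidence = poststratified_incidence(incidence_by_age, population_by_age)
print(incidence)
print(infection_fatality_rate(32, incidence, 1000))
```

Weighting by known population counts, rather than by the sample composition, is what corrects for strata that are over- or under-represented among survey respondents.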

Biostatistics, pp. 354-384. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11017123/pdf/
Citations: 0