Jennifer F Bobb, Stephen J Mooney, Maricela Cruz, Anne Vernez Moudon, Adam Drewnowski, David Arterburn, Andrea J Cook
Distributed lag models (DLMs) estimate the health effects of exposure over multiple time lags prior to the outcome and are widely used in time series studies. Applying DLMs to retrospective cohort studies is challenging due to inconsistent lengths of exposure history across participants, which is common when using electronic health record databases. A standard approach is to define subcohorts of individuals with some minimum exposure history, but this limits power and may amplify selection bias. We propose alternative full-cohort methods that use all available data while simultaneously enabling examination of the longest time lag estimable in the cohort. Through simulation studies, we find that restricting to a subcohort can lead to biased estimates of exposure effects due to confounding by correlated exposures at more distant lags. By contrast, full-cohort methods that incorporate multiple imputation of complete exposure histories can avoid this bias to efficiently estimate lagged and cumulative effects. Applying full-cohort DLMs to a study examining the association between residential density (a proxy for walkability) over 12 years and body weight, we find evidence of an immediate effect in the prior 1-2 years. We also observed an association at the maximal lag considered (12 years prior), which we posit reflects an earlier (≥ 12 years) or incrementally increasing prior effect over time. DLMs can be efficiently incorporated within retrospective cohort studies to identify critical windows of exposure.
"Distributed lag models for retrospective cohort data with application to a study of built environment and body weight." Biometrics 81(1), 2025. DOI: 10.1093/biomtc/ujae166. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11760659/pdf/
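To make the lag structure concrete: a distributed lag model regresses the outcome on exposure at lags 0 through L simultaneously, so correlated exposure histories can confound estimates when distant lags are omitted. Below is a minimal Python sketch of an unconstrained DLM fit by ordinary least squares on simulated autocorrelated exposure histories; all names and parameter values are hypothetical, and this is not the authors' imputation-based estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n subjects, each with L+1 yearly exposure lags.
n, L = 500, 4
true_theta = np.array([0.5, 0.3, 0.1, 0.0, 0.0])  # effect concentrated at recent lags

# Simulate AR(1)-correlated exposure histories (lags are correlated,
# which is the source of confounding when distant lags are dropped).
X = rng.normal(size=(n, L + 1))
for j in range(1, L + 1):
    X[:, j] = 0.7 * X[:, j - 1] + np.sqrt(1 - 0.7**2) * X[:, j]

y = X @ true_theta + rng.normal(scale=0.5, size=n)

# Unconstrained distributed lag fit: one coefficient per lag.
theta_hat, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)
lag_effects = theta_hat[1:]
cumulative_effect = lag_effects.sum()
```

With shorter exposure histories, the columns for distant lags would be missing for some subjects, which is the gap the full-cohort multiple-imputation approach is designed to fill.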
Motivated by the need for computationally tractable spatial methods in neuroimaging studies, we develop a distributed and integrated framework for estimation and inference of Gaussian process model parameters with ultra-high-dimensional likelihoods. We propose a shift in viewpoint from whole to local data perspectives that is rooted in distributed model building and integrated estimation and inference. The framework's backbone is a computationally and statistically efficient integration procedure that simultaneously incorporates dependence within and between spatial resolutions in a recursively partitioned spatial domain. Statistical and computational properties of our distributed approach are investigated theoretically and in simulations. The proposed approach is used to extract new insights into autism spectrum disorder from the autism brain imaging data exchange.
Emily C Hector, Brian J Reich, Ani Eloyan. "Distributed model building and recursive integration for big spatial data modeling." Biometrics 81(1), 2025. DOI: 10.1093/biomtc/ujae159.
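The core tractability idea, evaluating likelihoods on pieces of a partitioned spatial domain, can be sketched generically. Below is a block-independent composite log-likelihood for a zero-mean Gaussian process: each block's likelihood is exact and cross-block dependence is simply dropped. This is a deliberate simplification for illustration; the paper's recursive integration procedure additionally accounts for dependence within and between resolutions. The squared-exponential kernel and all parameter values are assumptions of this sketch.

```python
import numpy as np

def block_gp_loglik(coords, y, blocks, var=1.0, length=1.0, nugget=1e-6):
    """Block-independent composite log-likelihood for a zero-mean Gaussian
    process with a squared-exponential kernel. `blocks` is a list of index
    arrays partitioning the observations; cross-block covariance is ignored."""
    ll = 0.0
    for idx in blocks:
        d = np.linalg.norm(coords[idx, None, :] - coords[None, idx, :], axis=-1)
        K = var * np.exp(-0.5 * (d / length) ** 2) + nugget * np.eye(len(idx))
        _, logdet = np.linalg.slogdet(K)
        yb = y[idx]
        ll -= 0.5 * (logdet + yb @ np.linalg.solve(K, yb) + len(idx) * np.log(2 * np.pi))
    return ll

# With a single observation, the composite likelihood is the exact
# N(0, var + nugget) log-density.
one = block_gp_loglik(np.zeros((1, 2)), np.array([2.0]), [np.array([0])])
```

Partitioning replaces one O(n³) Cholesky factorization with many small ones, which is what makes ultra-high-dimensional likelihoods tractable.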
Camila Olarte Parra, Rhian M Daniel, David Wright, Jonathan W Bartlett
The ICH E9 addendum on estimands in clinical trials provides a framework for precisely defining the treatment effect that is to be estimated, but says little about estimation methods. Here, we report analyses of a clinical trial in type 2 diabetes, targeting the effects of randomized treatment and handling rescue treatment and discontinuation of randomized treatment using the so-called hypothetical strategy. We show how the resulting estimands can be estimated using mixed models for repeated measures, multiple imputation, inverse probability of treatment weighting, the G-formula, and G-estimation. We describe their assumptions and practical details of their implementation using packages in R. We report the results of these analyses, broadly finding similar estimates and standard errors across the estimators. We discuss various considerations relevant when choosing an estimation approach, including computational time, how to handle missing data, whether to include post intercurrent event data in the analysis, whether and how to adjust for additional time-varying confounders, and whether and how to model different types of intercurrent event data separately.
"Estimating hypothetical estimands with causal inference and missing data estimators in a diabetes trial case study." Biometrics 81(1), 2025. DOI: 10.1093/biomtc/ujae167.
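Among the estimators compared, the G-formula (standardization) is the simplest to sketch: fit an outcome model, then contrast the average predictions made for the whole sample under each treatment level. Below is a hypothetical single-time-point version in Python; the trial analysis itself involves repeated measures and intercurrent events, and all names and data-generating values here are illustrative.

```python
import numpy as np

# Hypothetical trial data: baseline covariate x, randomized treatment a,
# continuous outcome y, with a true treatment effect of 1.5.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * x + 1.5 * a + rng.normal(size=n)

# G-formula (standardization): fit an outcome model, then average the
# predictions made for every participant under a=1 and under a=0.
Z = np.column_stack([np.ones(n), x, a])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
pred1 = np.column_stack([np.ones(n), x, np.ones(n)]) @ beta
pred0 = np.column_stack([np.ones(n), x, np.zeros(n)]) @ beta
effect = pred1.mean() - pred0.mean()
```

For a linear outcome model the standardized contrast coincides with the treatment coefficient; the two diverge under non-identity links, which is why the g-computation step matters there.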
We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and utilize it to synthesize various microbiome association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating P-values and also a more recent general method of combining confidence distributions, but makes generalizations to handle dependent tests. The proposed framework ensures rigorous statistical guarantees, and we provide a comprehensive study and compare it with various existing dependent combination methods. Notably, we demonstrate that the widely used Cauchy combination method for dependent tests, referred to as the vanilla Cauchy combination in this article, can be viewed as a special case within our framework. Moreover, the proposed framework provides a way to address the problem when the distributional assumptions underlying the vanilla Cauchy combination are violated. Our numerical results demonstrate that ignoring the dependence among the to-be-combined components can lead to severe size distortion. Compared with existing P-value combination methods, including the vanilla Cauchy combination, the proposed framework is flexible, handles the dependence accurately, and uses the available information efficiently to construct tests with accurate size and enhanced power. The development is applied to microbiome association studies, where we aggregate information from multiple existing tests using the same dataset. The combined tests harness the strengths of each individual test across a wide range of alternative spaces, enabling more efficient and meaningful discoveries of vital microbiome associations.
Xiufan Yu, Linjun Zhang, Arun Srinivasan, Min-Ge Xie, Lingzhou Xue. "A unified combination framework for dependent tests with applications to microbiome association studies." Biometrics 81(1), 2025. DOI: 10.1093/biomtc/ujaf001. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783248/pdf/
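For reference, the vanilla Cauchy combination discussed above has a simple closed form: transform each p-value to a standard Cauchy variate, average the variates (optionally with weights), and map the average back to a p-value. A minimal sketch, with the function name chosen here for illustration:

```python
import numpy as np

def cauchy_combination(pvals, weights=None):
    """Vanilla Cauchy combination of possibly dependent p-values.
    Each p-value is mapped to tan((0.5 - p) * pi), a standard Cauchy
    variate; a convex combination of Cauchy variates is again Cauchy."""
    p = np.asarray(pvals, dtype=float)
    w = np.full(p.shape, 1.0 / p.size) if weights is None else np.asarray(weights, dtype=float)
    t = np.sum(w * np.tan((0.5 - p) * np.pi))
    return 0.5 - np.arctan(t) / np.pi  # upper-tail probability of a standard Cauchy
```

Its appeal is that the size is approximately valid under broad dependence when the constituent p-values are small; the framework above targets settings where the distributional assumptions behind this shortcut fail.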
The abundance of different cell types can vary substantially among patients with different phenotypes, and even among patients who share the same phenotype. Recent scientific advancements provide mounting evidence that other clinical variables, such as age, gender, and lifestyle habits, can also influence the abundance of certain cell types. However, current methods for integrating single-cell-level omics data with clinical variables are inadequate. In this study, we propose a regularized Bayesian Dirichlet-multinomial regression framework to investigate the relationship between single-cell RNA sequencing data and patient-level clinical data. Additionally, the model employs a novel hierarchical tree structure to identify such relationships at different cell-type levels. Our model successfully uncovers significant associations between specific cell types and clinical variables across three distinct diseases: pulmonary fibrosis, COVID-19, and non-small cell lung cancer. This integrative analysis provides biological insights and could potentially inform clinical interventions for various diseases.
Yanghong Guo, Lei Yu, Lei Guo, Lin Xu, Qiwei Li. "A regularized Bayesian Dirichlet-multinomial regression model for integrating single-cell-level omics and patient-level clinical study data." Biometrics 81(1), 2025. DOI: 10.1093/biomtc/ujaf005. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783250/pdf/
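The building block of such a model is the Dirichlet-multinomial likelihood, which allows cell-type counts to be overdispersed relative to a plain multinomial. Below is the standard log-likelihood for a single observation, written with only the Python standard library; in the regression framework the concentration parameters would be linked to the clinical covariates, a step not shown here.

```python
from math import lgamma

def dm_loglik(counts, alpha):
    """Log-likelihood of one Dirichlet-multinomial observation.

    counts: integer cell-type counts for one subject.
    alpha:  positive concentration parameters, one per cell type.
    """
    n, a0 = sum(counts), sum(alpha)
    # Multinomial coefficient plus the ratio of Dirichlet normalizers.
    ll = lgamma(n + 1) + lgamma(a0) - lgamma(n + a0)
    for x, a in zip(counts, alpha):
        ll += lgamma(x + a) - lgamma(a) - lgamma(x + 1)
    return ll
```

With two categories and alpha = (1, 1) this reduces to the uniform beta-binomial, a convenient check: every split of n counts is equally likely.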
Michael R Schwob, Mevin B Hooten, Vagheesh Narasimhan
Mechanistic statistical models are commonly used to study the flow of biological processes. For example, in landscape genetics, the aim is to infer spatial mechanisms that govern gene flow in populations. Existing statistical approaches in landscape genetics do not account for temporal dependence in the data and may be computationally prohibitive. We infer mechanisms with a Bayesian hierarchical dyadic model that scales well with large data sets and that accounts for spatial and temporal dependence. We construct a fully connected network comprising spatio-temporal data for the dyadic model and use normalized composite likelihoods to account for the dependence structure in space and time. We develop a dyadic model to account for physical mechanisms commonly found in physical-statistical models and apply our methods to ancient human DNA data to infer the mechanisms that affected human movement in Bronze Age Europe.
"Composite dyadic models for spatio-temporal data." Biometrics 80(4), 2024. DOI: 10.1093/biomtc/ujae107.
A stepped wedge design is a unidirectional crossover design in which clusters are randomized to distinct treatment sequences. While model-based analysis of stepped wedge designs is standard practice for evaluating treatment effects while accounting for clustering and adjusting for covariates, the properties of these analyses under misspecification have not been systematically explored. In this article, we focus on model-based methods, including linear mixed models and generalized estimating equations with an independence, simple exchangeable, or nested exchangeable working correlation structure. We study when a potentially misspecified working model can offer consistent estimation of the marginal treatment effect estimands, which are defined nonparametrically with potential outcomes and may be functions of calendar time and/or exposure time. We prove a central result that consistency for nonparametric estimands usually requires a correctly specified treatment effect structure, but generally not the remaining aspects of the working model (functional form of covariates, random effects, and error distribution), and valid inference is obtained via the sandwich variance estimator. Furthermore, an additional g-computation step is required to achieve model-robust inference under non-identity link functions or for ratio estimands. The theoretical results are illustrated via several simulation experiments and re-analysis of a completed stepped wedge cluster randomized trial.
Bingkai Wang, Xueqi Wang, Fan Li. "How to achieve model-robust inference in stepped wedge trials with model-based methods?" Biometrics 80(4), 2024. DOI: 10.1093/biomtc/ujae123. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11536888/pdf/
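The sandwich variance estimator referenced above can be sketched in its simplest instance: ordinary least squares with a working-independence correlation, where cluster-level score contributions form the "meat" between two "bread" matrices. The stepped-wedge-flavored data below are entirely hypothetical, and this is a generic cluster-robust sketch, not the paper's full procedure.

```python
import numpy as np

def sandwich_ols(X, y, cluster):
    """OLS point estimates with a cluster-robust (sandwich) variance:
    bread = (X'X)^{-1}, meat = sum over clusters of outer products of
    the cluster-level scores X_g' r_g."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ (X.T @ y)
    resid = y - X @ beta
    p = X.shape[1]
    meat = np.zeros((p, p))
    for g in np.unique(cluster):
        idx = cluster == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    return beta, bread @ meat @ bread

# Hypothetical clustered data: cluster-level treatment, random intercepts.
rng = np.random.default_rng(2)
G, m = 50, 20
cluster = np.repeat(np.arange(G), m)
treat = np.repeat(rng.integers(0, 2, size=G), m).astype(float)
y = 1.0 + 2.0 * treat + np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
X = np.column_stack([np.ones(G * m), treat])
beta, V = sandwich_ols(X, y, cluster)
```

The point estimate is the working-independence fit; only the variance changes, which is exactly the sense in which inference remains valid under a misspecified correlation structure.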
The identification of surrogate markers is motivated by their potential to support earlier decisions about a treatment effect. However, few methods have been developed to actually use a surrogate marker to test for a treatment effect in a future study. Most existing methods consider combining surrogate marker and primary outcome information to test for a treatment effect, rely on fully parametric methods where strict parametric assumptions are made about the relationship between the surrogate and the outcome, and/or assume the surrogate marker is measured at only a single time point. Recent work has proposed a nonparametric test for a treatment effect using only surrogate marker information measured at a single time point by borrowing information learned from a prior study where both the surrogate and primary outcome were measured. In this paper, we utilize this nonparametric test and propose group sequential procedures that allow for early stopping of treatment effect testing in a setting where the surrogate marker is measured repeatedly over time. We derive the properties of the correlated surrogate-based nonparametric test statistics at multiple time points and compute stopping boundaries that allow for early stopping for a significant treatment effect, or for futility. We examine the performance of our proposed test using a simulation study and illustrate the method using data from two distinct AIDS clinical trials.
Layla Parast, Jay Bartroff. "Group sequential testing of a treatment effect using a surrogate marker." Biometrics 80(4), 2024. DOI: 10.1093/biomtc/ujae108. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11459368/pdf/
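Stopping boundaries of the kind derived in the paper must account for the correlation between test statistics at successive looks. As a generic illustration (not the surrogate-based statistics themselves), a constant Pocock-type boundary for equally spaced looks can be computed by Monte Carlo under the canonical independent-increments joint distribution; the function name and simulation settings are this sketch's assumptions.

```python
import numpy as np

def pocock_boundary(K, alpha=0.05, n_sim=200_000, seed=3):
    """Monte Carlo constant (Pocock-type) two-sided boundary for K equally
    spaced looks: find c with P(max_k |Z_k| > c) = alpha, where the Z_k
    follow the canonical independent-increments joint distribution."""
    rng = np.random.default_rng(seed)
    increments = rng.normal(size=(n_sim, K))
    # Z_k = S_k / sqrt(k): cumulative information standardized at each look.
    z = np.cumsum(increments, axis=1) / np.sqrt(np.arange(1, K + 1))
    return np.quantile(np.abs(z).max(axis=1), 1 - alpha)

c1 = pocock_boundary(1)  # no interim look: the fixed-sample critical value
c2 = pocock_boundary(2)  # one interim look: a larger per-look threshold
```

With K = 1 this recovers the usual critical value of about 1.96; spreading the 5% type I error over two looks raises the per-look threshold to roughly 2.18, matching Pocock's tabulated constant.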
Aaron J Molstad, Yanwei Cai, Alexander P Reiner, Charles Kooperberg, Wei Sun, Li Hsu
Ancestry-specific proteome-wide association studies (PWAS) based on genetically predicted protein expression can reveal complex disease etiology specific to certain ancestral groups. These studies require ancestry-specific models for protein expression as a function of SNP genotypes. In order to improve protein expression prediction in ancestral populations historically underrepresented in genomic studies, we propose a new penalized maximum likelihood estimator for fitting ancestry-specific joint protein quantitative trait loci models. Our estimator borrows information across ancestral groups, while simultaneously allowing for heterogeneous error variances and regression coefficients. We propose an alternative parameterization of our model that makes the objective function convex and the penalty scale invariant. To improve computational efficiency, we propose an approximate version of our method and study its theoretical properties. Our method provides a substantial improvement in protein expression prediction accuracy in individuals of African ancestry, and in a downstream PWAS analysis, leads to the discovery of multiple associations between protein expression and blood lipid traits in the African ancestry population.
"Heterogeneity-aware integrative regression for ancestry-specific association studies." Biometrics 80(4), 2024. DOI: 10.1093/biomtc/ujae109. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11492996/pdf/
This paper presents a robust alternative to the maximum likelihood estimator (MLE) for the polytomous logistic regression model, known as the family of minimum Rényi pseudodistance (RP) estimators. The proposed minimum RP estimators are parametrized by a tuning parameter α ≥ 0 and include the MLE as the special case α = 0. These estimators, along with a family of RP-based Wald-type tests, are shown to exhibit superior performance in the presence of misclassification errors. The paper includes an extensive simulation study and a real data example to illustrate the robustness of the proposed statistics.
Elena Castilla. "A new robust approach for the polytomous logistic regression model based on Rényi's pseudodistances." Biometrics 80(4), 2024. DOI: 10.1093/biomtc/ujae125.