
Biostatistics: Latest Articles

Identification and estimation of mediational effects of longitudinal modified treatment policies.
IF 2 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf031
Brian Gilbert, Katherine Hoffman, Nicholas Williams, Kara Rudolph, Edward J Schenck, Iván Díaz

We demonstrate a comprehensive semiparametric approach to causal mediation analysis, addressing the complexities inherent in settings with longitudinal and continuous treatments, confounders, and mediators. Our methodology utilizes a nonparametric structural equation model and a cross-fitted sequential regression technique based on doubly robust pseudo-outcomes, yielding an efficient, asymptotically normal estimator without relying on restrictive parametric modeling assumptions. We are motivated by a recent scientific controversy regarding the effects of invasive mechanical ventilation on the survival of COVID-19 patients, considering acute kidney injury as a mediating factor. We highlight the possibility of "inconsistent mediation," in which the direct and indirect effects of the exposure operate in opposite directions. We discuss the significance of mediation analysis for scientific understanding and its potential utility in treatment decisions.
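The notion of "inconsistent mediation" can be illustrated with a small simulation. The following is only a sketch under strong simplifying assumptions that are not taken from the paper: a single time point, a binary exposure, linear models with no confounding, and a product-of-coefficients decomposition in place of the authors' semiparametric estimator. The function name and all coefficient values are hypothetical.

```python
import numpy as np

def simulate_inconsistent_mediation(n=200_000, seed=0):
    """Simulate exposure A, mediator M, outcome Y and decompose the effect of A.

    Hypothetical data-generating model (an assumption for illustration only):
        M = 0.8 * A + noise
        Y = 0.5 * A - 0.4 * M + noise
    so the true direct effect is +0.5 and the true indirect effect is
    0.8 * (-0.4) = -0.32: the two pathways point in opposite directions.
    """
    rng = np.random.default_rng(seed)
    a = rng.binomial(1, 0.5, size=n).astype(float)
    m = 0.8 * a + rng.normal(size=n)
    y = 0.5 * a - 0.4 * m + rng.normal(size=n)

    # Regress M on A: the slope estimates the A -> M path.
    design_m = np.column_stack([np.ones(n), a])
    a_to_m = np.linalg.lstsq(design_m, m, rcond=None)[0][1]

    # Regress Y on A and M: slopes estimate the direct path and the M -> Y path.
    design_y = np.column_stack([np.ones(n), a, m])
    _, direct, m_to_y = np.linalg.lstsq(design_y, y, rcond=None)[0]

    indirect = a_to_m * m_to_y
    return direct, indirect, direct + indirect
```

In this toy setting the total effect (about 0.18 by construction) is much smaller than the direct effect because the indirect pathway partly cancels it, which is why a total-effect analysis alone can be misleading.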

Citations: 0
Within-trial data borrowing for sequential multiple assignment randomized trials.
IF 1.8 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf003
Ales Kotalik, David M Vock, Nancy E Sherwood, Brian P Hobbs, Joseph S Koopmeiners

The Sequential Multiple Assignment Randomized Trial (SMART) is a complex trial design that involves randomizing a single participant multiple times in a sequential manner. This results in the branching nature of a SMART, which represents several distinct groups defined by different combinations of treatments, response statuses, etc. A SMART can then answer various scientific questions of interest, eg, the optimal dynamic treatment regime (DTR) for treating a chronic illness, what intervention to offer first, and what intervention to offer to nonresponders (or suboptimal responders). However, the analysis of a SMART can suffer from low precision, as the potentially widely branching structure can lead to reduced sample sizes in some groups of interest. In this paper, we propose a novel analysis method for a SMART in which dynamic borrowing is used to borrow strength across groups with similar expected outcomes, thus providing increased precision for the estimation of the expected outcomes of DTRs. We apply our method to a SMART evaluating various weight loss strategies using a binary endpoint of clinically significant weight loss and show by simulation that our method can improve the precision of the estimated expected outcome of a DTR, aid in the identification of the optimal DTR, and produce a clustering analysis of DTRs embedded in a SMART.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11963638/pdf/
Citations: 0
While-alive regression analysis of composite survival endpoints.
IF 2 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf047
Xi Fang, Hajime Uno, Fan Li

Composite endpoints are frequently used in clinical trials to enhance the event rate and improve the statistical power. In the presence of a terminal event, the while-alive cumulative frequency measure offers a useful alternative to define composite survival outcomes, by relating the average event rate to the survival time. Although non-parametric methods have been proposed for two-sample comparisons, limited attention has been given to regression methods that directly address time-varying association effects in while-alive measures. We address this gap by developing a regression framework for exposure-weighted while-alive measures for composite survival outcomes that include a terminal component event. Our regression approach uses splines to model time-varying association between covariates and a generalized while-alive loss rate of all component events, and can be applied to both independent and clustered data. We derive the asymptotic properties of the regression estimator under both independent data and cluster-correlated data settings, and study the operating characteristics of our methods through simulations. Finally, we apply our regression method to analyze data from two randomized clinical trials. The proposed methods are implemented in the WAreg R package.
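As a rough illustration of what "relating the average event rate to the survival time" means, the sketch below computes an empirical while-alive event rate: the number of events observed up to a horizon tau, divided by the total time subjects were alive up to tau. It assumes fully observed data with no censoring, unit weights for every event, and no covariates, none of which is taken from the paper; it is not the WAreg implementation, and the function name and arguments are hypothetical.

```python
def while_alive_event_rate(event_times, death_times, tau, count_death=True):
    """Empirical while-alive event rate up to horizon tau (no censoring assumed).

    event_times : list of lists, nonterminal event times for each subject
    death_times : list of floats, time of the terminal event for each subject
    tau         : float, time horizon
    count_death : if True, a death occurring by tau counts as one component event
    """
    total_events = 0
    total_alive_time = 0.0
    for events, death in zip(event_times, death_times):
        alive_until = min(death, tau)  # time alive within the window [0, tau]
        total_events += sum(1 for t in events if t <= alive_until)
        if count_death and death <= tau:
            total_events += 1
        total_alive_time += alive_until
    return total_events / total_alive_time


# Three hypothetical subjects followed over the window [0, 2].
events = [[1.0, 2.5], [0.5], []]
deaths = [3.0, 1.0, 4.0]
rate = while_alive_event_rate(events, deaths, tau=2.0)
```

Dividing by time alive rather than by the full window length means that a subject who dies early does not dilute the rate with follow-up time during which no event could occur.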

Citations: 0
A joint normal-ordinal (probit) model for ordinal and continuous longitudinal data.
IF 1.8 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxae014
Margaux Delporte, Geert Molenberghs, Steffen Fieuws, Geert Verbeke

In biomedical studies, continuous and ordinal longitudinal variables are frequently encountered. In many of these studies it is of interest to estimate the effect of one of these longitudinal variables on the other. Time-dependent covariates have, however, several limitations; for example, they cannot be included when the data are not collected at fixed intervals. These issues can be circumvented by implementing joint models, where two or more longitudinal variables are treated as a response and modeled with a correlated random effect. Next, by conditioning on these response(s), we can study the effect of one or more longitudinal variables on another. We propose a normal-ordinal (probit) joint model. First, we derive closed-form formulas to estimate the model-based correlations between the responses on their original scale. In addition, we derive the marginal model, where the interpretation is no longer conditional on the random effects. As a consequence, we can make predictions for a subvector of one response conditional on the other response and potentially a subvector of the history of the response. Next, we extend the approach to a high-dimensional case with more than two ordinal and/or continuous longitudinal variables. The methodology is applied to a case study where, among others, a longitudinal ordinal response is predicted with a longitudinal continuous variable.

Citations: 0
Direct estimation and inference of higher-level correlations from lower-level measurements with applications in gene-pathway and proteomics studies.
IF 1.8 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxae027
Yue Wang, Haoran Shi

This paper tackles the challenge of estimating correlations between higher-level biological variables (e.g. proteins and gene pathways) when only lower-level measurements are directly observed (e.g. peptides and individual genes). Existing methods typically aggregate lower-level data into higher-level variables and then estimate correlations based on the aggregated data. However, different data aggregation methods can yield varying correlation estimates as they target different higher-level quantities. Our solution is a latent factor model that directly estimates these higher-level correlations from lower-level data without the need for data aggregation. We further introduce a shrinkage estimator to ensure the positive definiteness and improve the accuracy of the estimated correlation matrix. Furthermore, we establish the asymptotic normality of our estimator, enabling efficient computation of P-values for the identification of significant correlations. The effectiveness of our approach is demonstrated through comprehensive simulations and the analysis of proteomics and gene expression datasets. We develop the R package highcor for implementing our method.

Citations: 0
Semiparametric mixture regression for asynchronous longitudinal data using multivariate functional principal component analysis.
IF 1.8 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf008
Ruihan Lu, Yehua Li, Weixin Yao

The transitional phase of menopause induces significant hormonal fluctuations, exerting a profound influence on the long-term well-being of women. In an extensive longitudinal investigation of women's health during mid-life and beyond, known as the Study of Women's Health Across the Nation (SWAN), hormonal biomarkers are repeatedly assessed, following an asynchronous schedule compared to other error-prone covariates, such as physical and cardiovascular measurements. We conduct a subgroup analysis of the SWAN data employing a semiparametric mixture regression model, which allows us to explore how the relationship between hormonal responses and other time-varying or time-invariant covariates varies across subgroups. To address the challenges posed by asynchronous scheduling and measurement errors, we model the time-varying covariate trajectories as functional data with reduced-rank Karhunen-Loève expansions, where splines are employed to capture the mean and eigenfunctions. Treating the latent subgroup membership and the functional principal component (FPC) scores as missing data, we propose an Expectation-Maximization algorithm to effectively fit the joint model, combining the mixture regression for the hormonal response and the FPC model for the asynchronous, time-varying covariates. In addition, we explore data-driven methods to determine the optimal number of subgroups within the population. Through our comprehensive analysis of the SWAN data, we unveil a crucial subgroup structure within the aging female population, shedding light on important distinctions and patterns among women undergoing menopause.
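For readers unfamiliar with the term, a reduced-rank Karhunen-Loève expansion for a single covariate trajectory is usually written as below. This is the generic textbook form, shown for one covariate only; the notation is an assumption for illustration and need not match the paper's multivariate formulation.

```latex
% Generic reduced-rank Karhunen-Loeve model (notation assumed, univariate case)
X_i(t) = \mu(t) + \sum_{k=1}^{K} \xi_{ik}\, \psi_k(t),
\qquad
W_{ij} = X_i(t_{ij}) + \varepsilon_{ij},
```

Here $\mu(t)$ is the mean function, $\psi_1, \dots, \psi_K$ are orthonormal eigenfunctions, $\xi_{ik}$ are uncorrelated mean-zero FPC scores for subject $i$, and $W_{ij}$ is the error-prone measurement taken at the subject-specific time $t_{ij}$. Because the trajectory $X_i(t)$ is defined at every $t$, it can be evaluated at the times when the response was measured, which is how such a model can accommodate asynchronous schedules.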

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929387/pdf/
Citations: 0
Bayesian estimation of covariate assisted principal regression for brain functional connectivity.
IF 1.8 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxae023
Hyung G Park

This paper presents a Bayesian reformulation of covariate-assisted principal regression for covariance matrix outcomes to identify low-dimensional components in the covariance associated with covariates. By introducing a geometric approach to the covariance matrices and leveraging Euclidean geometry, we estimate dimension reduction parameters and model covariance heterogeneity based on covariates. This method enables joint estimation and uncertainty quantification of relevant model parameters associated with heteroscedasticity. We demonstrate our approach through simulation studies and apply it to analyze associations between covariates and brain functional connectivity using data from the Human Connectome Project.
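As background, covariate-assisted principal regression is commonly written as a log-linear model for the variance of a projected signal. The display below is the usual single-component form from the earlier, non-Bayesian literature; the notation is assumed here, and the Bayesian reformulation summarized above may parametrize the model differently.

```latex
% Single-component covariate-assisted principal regression (notation assumed)
\log\!\left( \gamma^{\top} \Sigma_i \gamma \right) = \beta_0 + x_i^{\top} \beta_1,
\qquad i = 1, \dots, n,
```

where $\Sigma_i$ is the covariance matrix of subject $i$, $\gamma$ is a projection direction shared across subjects, and $x_i$ collects that subject's covariates. The "low-dimensional components" are the directions $\gamma$ along which the variance changes with $x_i$.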

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823071/pdf/
Citations: 0
A novel high-dimensional model for identifying regional DNA methylation QTLs.
IF 2 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf032
Kaiqiong Zhao, Archer Y Yang, Karim Oualkacha, Yixiao Zeng, Kathleen Klein, Marie Hudson, Inés Colmegna, Sasha Bernatsky, Celia M T Greenwood

Varying coefficient models offer the flexibility to learn the dynamic changes of regression coefficients. Despite their good interpretability and diverse applications, in high-dimensional settings, existing estimation methods for such models have important limitations. For example, we routinely encounter the need for variable selection when faced with a large collection of covariates with nonlinear/varying effects on outcomes, and no ideal solutions exist. One illustration of this situation could be identifying a subset of genetic variants with local influence on methylation levels in a regulatory region. To address this problem, we propose a composite sparse penalty that encourages both sparsity and smoothness for the varying coefficients. We present an efficient proximal gradient descent algorithm that scales to high-dimensional predictor spaces, providing sparse solutions for the varying coefficients. A comprehensive simulation study has been conducted to evaluate the performance of our approach in terms of estimation, prediction and selection accuracy. We show that the inclusion of smoothness control yields much better results than sparsity-only approaches. An adaptive version of the penalty offers additional performance gains. We further demonstrate the utility of our method in identifying regional mQTLs from asymptomatic samples in the CARTaGENE cohort. The methodology is implemented in the R package sparseSOMNiBUS, available on GitHub.
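The general shape of a proximal gradient method with a sparsity penalty can be sketched as follows. This is a deliberately simplified stand-in, not the sparseSOMNiBUS algorithm: it uses a plain lasso penalty on constant coefficients with squared-error loss, whereas the paper's composite penalty also controls the smoothness of varying coefficients. The function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(X, y, lam, n_iter=500):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient.

    Each iteration takes a gradient step on the smooth squared-error term and
    then applies the soft-thresholding proximal step for the l1 penalty.
    """
    n, p = X.shape
    # Step size 1/L, where L = sigma_max(X)^2 / n bounds the gradient's
    # Lipschitz constant for the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

The soft-thresholding step is what produces exact zeros, so variable selection happens as part of the optimization. A composite penalty would replace `soft_threshold` with the proximal operator of the combined sparsity-and-smoothness term.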

变系数模型提供了学习回归系数动态变化的灵活性。尽管它们具有良好的可解释性和多种应用,但在高维环境中,现有的此类模型估计方法具有重要的局限性。例如,当我们面对大量对结果具有非线性/变化影响的协变量时,我们经常遇到变量选择的需要,并且不存在理想的解决方案。这种情况的一个例子可能是确定对调控区域甲基化水平有局部影响的遗传变异子集。为了解决这个问题,我们提出了一种复合稀疏惩罚,它既鼓励稀疏性,又鼓励变化系数的平滑性。我们提出了一种有效的近端梯度下降算法,该算法可扩展到高维预测空间,为变化系数提供稀疏解。进行了全面的仿真研究,以评估我们的方法在估计,预测和选择精度方面的性能。我们表明,包含平滑控制比仅稀疏性方法产生更好的结果。惩罚的自适应版本提供了额外的性能提升。我们进一步证明了我们的方法在从CARTaGENE队列的无症状样本中识别区域性mqtl的实用性。该方法在R包sparseSOMNiBUS中实现,可以在GitHub上获得。
{"title":"A novel high-dimensional model for identifying regional DNA methylation QTLs.","authors":"Kaiqiong Zhao, Archer Y Yang, Karim Oualkacha, Yixiao Zeng, Kathleen Klein, Marie Hudson, Inés Colmegna, Sasha Bernatsky, Celia M T Greenwood","doi":"10.1093/biostatistics/kxaf032","DOIUrl":"10.1093/biostatistics/kxaf032","url":null,"abstract":"<p><p>Varying coefficient models offer the flexibility to learn the dynamic changes of regression coefficients. Despite their good interpretability and diverse applications, in high-dimensional settings, existing estimation methods for such models have important limitations. For example, we routinely encounter the need for variable selection when faced with a large collection of covariates with nonlinear/varying effects on outcomes, and no ideal solutions exist. One illustration of this situation could be identifying a subset of genetic variants with local influence on methylation levels in a regulatory region. To address this problem, we propose a composite sparse penalty that encourages both sparsity and smoothness for the varying coefficients. We present an efficient proximal gradient descent algorithm that scales to high-dimensional predictor spaces, providing sparse solutions for the varying coefficients. A comprehensive simulation study has been conducted to evaluate the performance of our approach in terms of estimation, prediction and selection accuracy. We show that the inclusion of smoothness control yields much better results over sparsity-only approaches. An adaptive version of the penalty offers additional performance gains. We further demonstrate the utility of our method in identifying regional mQTLs from asymptomatic samples in the CARTaGENE cohort. 
The methodology is implemented in the R package sparseSOMNiBUS, available on GitHub.</p>","PeriodicalId":55357,"journal":{"name":"Biostatistics","volume":"26 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12554007/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145373301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Assessing treatment efficacy for interval-censored endpoints using multistate semi-Markov models fit to multiple data streams.
IF 2 | Zone 3, Mathematics | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2024-12-31 | DOI: 10.1093/biostatistics/kxaf038
Raphaël Morsomme, C Jason Liang, Allyson Mateja, Dean A Follmann, Meagan P O'Brien, Chenguang Wang, Jonathan Fintzi

We introduce a computationally efficient and general approach for utilizing multiple, possibly interval-censored, data streams to study complex biomedical endpoints using multistate semi-Markov models. Our motivating application is the REGEN-2069 trial, which investigated the protective efficacy (PE) of the monoclonal antibody combination REGEN-COV against SARS-CoV-2 when administered prophylactically to individuals in households at high risk of secondary transmission. Using data on symptom onset, episodic RT-qPCR sampling, and serological testing, we estimate the PE of REGEN-COV for asymptomatic infection, its effect on seroconversion following infection, and the duration of viral shedding. We find that REGEN-COV reduced the risk of asymptomatic infection and the duration of viral shedding, and led to lower rates of seroconversion among asymptomatically infected participants. Our algorithm for fitting semi-Markov models to interval-censored data employs a Monte Carlo expectation-maximization algorithm combined with importance sampling to efficiently address the intractability of the marginal likelihood when data are intermittently observed. Our algorithm provides substantial computational improvements over existing methods and allows us to fit semi-parametric models despite complex coarsening of the data.
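
As a rough illustration of the idea of combining Monte Carlo expectation-maximization with importance sampling for interval-censored data, the sketch below fits a much simpler model than the one in the abstract: a single exponential event time that is only known to lie in an interval. It is not the authors' algorithm or software. The function names (`simulate_intervals`, `e_step_importance`, `mcem_exponential`), the uniform proposal, and the unit-width observation grid are all assumptions made for this toy example.

```python
import math
import random


def simulate_intervals(true_rate, n, seed=0):
    """Toy data: exponential event times observed only on a unit-width grid."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(n):
        t = rng.expovariate(true_rate)
        left = float(math.floor(t))
        intervals.append((left, left + 1.0))  # event known to lie in (left, left + 1)
    return intervals


def e_step_importance(left, right, rate, n_draws, rng):
    """Estimate E[T | left < T < right] under Exponential(rate) using
    self-normalized importance sampling with a Uniform(left, right) proposal."""
    draws = [rng.uniform(left, right) for _ in range(n_draws)]
    # The target density on the interval is proportional to exp(-rate * t).
    # The uniform proposal density is constant, so it cancels on normalization.
    weights = [math.exp(-rate * (t - left)) for t in draws]
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, draws)) / total


def mcem_exponential(intervals, n_iter=20, n_draws=200, seed=0):
    """Monte Carlo EM for the exponential rate with interval-censored times."""
    rng = random.Random(seed)
    n = len(intervals)
    # Initial guess: treat each interval midpoint as the event time.
    rate = n / sum((left + right) / 2.0 for left, right in intervals)
    for _ in range(n_iter):
        # E-step: Monte Carlo estimate of each expected event time.
        expected = [e_step_importance(left, right, rate, n_draws, rng)
                    for left, right in intervals]
        # M-step: closed-form maximizer of the expected complete-data
        # log-likelihood n * log(rate) - rate * sum(expected).
        rate = n / sum(expected)
    return rate
```

Calling `mcem_exponential(simulate_intervals(0.5, 200))` returns an estimate of the rate that carries both sampling noise and Monte Carlo noise from the E-step. A multistate semi-Markov model replaces the single expected event time with expectations over latent transition paths, which is where a well-chosen proposal matters far more than in this toy case.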

Biostatistics, volume 26, issue 1. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629085/pdf/
Citations: 0
Correction to: Scalable kernel balancing weights in a nationwide observational study of hospital profit status and heart attack outcomes.
IF 1.8 | Zone 3, Mathematics | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2024-12-31 | DOI: 10.1093/biostatistics/kxae050
Citations: 0