Biostatistics: Latest Articles

Stratification-based instrumental variable analysis framework for nonlinear effect analysis.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxaf043
Haodong Tian, Ashish Patel, Stephen Burgess

Nonlinear causal effects are prevalent in many research scenarios involving continuous exposures, and instrumental variables (IVs) can be employed to investigate such effects, particularly in the presence of unmeasured confounders. However, common IV methods for nonlinear effect analysis, such as IV regression or the control-function method, have inherent limitations, leading to either low statistical power or potentially misleading conclusions. In this work, we propose an alternative IV framework for nonlinear effect analysis, which has recently emerged in genetic epidemiology and addresses many of the drawbacks of existing IV methods. The proposed IV framework consists of up to three key "S" elements: (i) the Stratification approach, which constructs multiple strata that are sub-samples of the population in which the IV core assumptions remain valid, (ii) the Scalar-on-function model and Scalar-on-scalar model, which connect local stratum-specific information to global effect estimation, and (iii) the Sum-of-single-effects method for effect estimation. This framework enables study of the effect function while avoiding unnecessary model assumptions. In particular, it facilitates the identification of change points or threshold values in causal effects. Through a wide variety of simulations, we demonstrate that our framework outperforms other representative nonlinear IV methods in predicting the effect shape when the instrument is weak and can accurately estimate the effect function as well as identify the change point and predict its value under various structural model and effect shape scenarios. We further apply our framework to assess the nonlinear effect of alcohol consumption on systolic blood pressure using a genetic instrument (ie Mendelian randomization) with UK Biobank data. Our analysis detects a threshold beyond which alcohol intake exhibits a clear causal effect on the outcome. Our results are consistent with published medical guidelines.
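
The stratum-specific step can be illustrated with a minimal sketch. It assumes the stratum labels are already given (how to build strata in which the IV assumptions remain valid is the paper's contribution and is not shown), and it uses the ordinary Wald ratio within each stratum rather than the authors' estimators. The function names and toy data are hypothetical.

```python
def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def stratum_wald_ratios(z, x, y, stratum):
    """Wald ratio cov(Z, Y) / cov(Z, X) computed separately in each stratum."""
    ratios = {}
    for s in sorted(set(stratum)):
        idx = [i for i, label in enumerate(stratum) if label == s]
        zs = [z[i] for i in idx]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        ratios[s] = cov(zs, ys) / cov(zs, xs)
    return ratios


# Hypothetical toy data: instrument z, exposure x, outcome y, stratum label.
z = [0, 1, 0, 1, 0, 1, 0, 1]
x = [1, 2, 1, 2, 3, 4, 3, 4]
y = [0, 0, 0, 0, 1, 3, 1, 3]
stratum = [0, 0, 0, 0, 1, 1, 1, 1]

ratios = stratum_wald_ratios(z, x, y, stratum)
```

In this toy example the local estimate is zero in the low-exposure stratum and positive in the high-exposure stratum, which is the kind of threshold pattern the framework is designed to detect.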

Citations: 0

Correction.
IF 1.8, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxae029
Citations: 0

Simultaneous clustering and estimation of networks in multiple graphical models.
IF 1.8, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxae015
Gen Li, Miaoyan Wang

Gaussian graphical models are widely used to study the dependence structure among variables. When samples are obtained from multiple conditions or populations, joint analysis of multiple graphical models is desired due to its capacity to borrow strength across populations. Nonetheless, existing methods often overlook the varying levels of similarity between populations, leading to unsatisfactory results. Moreover, in many applications, learning the population-level clustering structure itself is of particular interest. In this article, we develop a novel method, called Simultaneous Clustering and Estimation of Networks via Tensor decomposition (SCENT), that simultaneously clusters and estimates graphical models from multiple populations. Precision matrices from different populations are uniquely organized as a three-way tensor array, and a low-rank sparse model is proposed for joint population clustering and network estimation. We develop a penalized likelihood method and an augmented Lagrangian algorithm for model fitting. We also establish the clustering accuracy and norm consistency of the estimated precision matrices. We demonstrate the efficacy of the proposed method with comprehensive simulation studies. The application to the Genotype-Tissue Expression multi-tissue gene expression data provides important insights into tissue clustering and gene coexpression patterns in multiple brain tissues.

Citations: 0

HMM for discovering decision-making dynamics using reinforcement learning experiments.
IF 1.8, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxae033
Xingche Guo, Donglin Zeng, Yuanjia Wang

Major depressive disorder (MDD), a leading cause of years of life lived with disability, presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks that involve making choices or responding to stimulants that are associated with different outcomes, such as gains or losses in the laboratory. Reinforcement learning (RL) models are fitted to extract parameters that measure various aspects of reward processing (e.g. reward sensitivity) to characterize how patients make decisions in behavioral tasks. Recent findings suggest the inadequacy of characterizing reward learning solely based on a single RL model; instead, there may be a switching of decision-making processes between multiple strategies. An important scientific question is how the dynamics of strategies in decision-making affect the reward learning ability of individuals with MDD. Motivated by the probabilistic reward task within the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study, we propose a novel RL-HMM (hidden Markov model) framework for analyzing reward-based decision-making. Our model accommodates decision-making strategy switching between two distinct approaches under an HMM: subjects making decisions based on the RL model or opting for random choices. We account for continuous RL state space and allow time-varying transition probabilities in the HMM. We introduce a computationally efficient Expectation-maximization (EM) algorithm for parameter estimation and use a nonparametric bootstrap for inference. Extensive simulation studies validate the finite-sample performance of our method. 
We apply our approach to the EMBARC study to show that MDD patients are less engaged in RL compared to the healthy controls, and engagement is associated with brain activities in the negative affect circuitry during an emotional conflict task.
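
The two-strategy idea can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: two options, a delta-rule value update that runs in both hidden states, a logistic (two-option softmax) choice rule when engaged, and a constant probability `p_stay` of remaining in the current hidden state (the paper allows time-varying transition probabilities). All names and parameter values are hypothetical.

```python
import math
import random


def simulate_rl_hmm(n_trials, alpha, beta, p_stay, reward_prob, seed=0):
    """Simulate choices from a two-state HMM: RL-driven vs. random choice."""
    rng = random.Random(seed)
    q = [0.0, 0.0]   # learned values of the two options
    engaged = True   # hidden state: True = RL-driven, False = random choice
    states, choices, rewards = [], [], []
    for _ in range(n_trials):
        if engaged:
            # Two-option softmax reduces to a logistic in the value difference.
            p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        else:
            p_choose_1 = 0.5
        action = 1 if rng.random() < p_choose_1 else 0
        reward = 1 if rng.random() < reward_prob[action] else 0
        # Delta-rule update (assumed here to run regardless of hidden state).
        q[action] += alpha * (reward - q[action])
        states.append(engaged)
        choices.append(action)
        rewards.append(reward)
        # Constant-probability switch between the two strategies.
        if rng.random() > p_stay:
            engaged = not engaged
    return states, choices, rewards


states, choices, rewards = simulate_rl_hmm(
    n_trials=200, alpha=0.2, beta=3.0, p_stay=0.9, reward_prob=[0.3, 0.7], seed=1
)
```

Fitting such a model to observed choices would require an estimation routine such as the EM algorithm the abstract mentions; the sketch only generates data.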

Citations: 0

Exposure proximal immune correlates analysis.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxae031
Ying Huang, Dean Follmann

Immune response decays over time, and vaccine-induced protection often wanes. Understanding how vaccine efficacy changes over time is critical to guiding the development and application of vaccines in preventing infectious diseases. The objective of this article is to develop statistical methods that assess the effect of decaying immune responses on the risk of disease and on vaccine efficacy, within the context of Cox regression with sparse sampling of immune responses, in a baseline-naive population. We aim to further disentangle the various aspects of the time-varying vaccine effect, whether direct on disease or mediated through immune responses. Based on time-to-event data from a vaccine efficacy trial and sparse sampling of longitudinal immune responses, we propose a weighted estimated induced likelihood approach that models the longitudinal immune response trajectory and the time to event separately. This approach assesses the effects of the decaying immune response, the peak immune response, and/or the waning vaccine effect on the risk of disease. The proposed method is applicable not only to standard randomized trial designs but also to augmented vaccine trial designs that re-vaccinate uninfected placebo recipients at the end of the standard trial period. We conducted simulation studies to evaluate the performance of our method and applied the method to analyze immune correlates from a phase III SARS-CoV-2 vaccine trial.

Citations: 0

Covariate-adjusted estimators of diagnostic accuracy in randomized trials.
IF 1.8, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxaf005
Jon A Steingrimsson

Randomized controlled trials evaluating the diagnostic accuracy of a marker frequently collect information on baseline covariates in addition to information on the marker and the reference standard. However, standard estimators of sensitivity and specificity do not use data on baseline covariates and restrict the analysis to data from participants with a positive reference standard in the intervention arm being evaluated. Covariate-adjusted estimators for marginal treatment effects have been developed and been advocated for by regulatory agencies because they can improve power compared to unadjusted estimators. Despite this, similar covariate-adjusted estimators for marginal sensitivity and specificity have not yet been developed. In this manuscript, we address this gap by developing covariate-adjusted estimators for marginal sensitivity and specificity of a diagnostic test that leverage baseline covariate information. The estimators also use data from all participants, not just participants with a positive reference standard in the intervention arm being evaluated. We derive the asymptotic properties of the estimators and evaluate the finite sample properties of the estimators using simulations and by analyzing data on lung cancer screening.
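
For reference, the standard unadjusted estimators that the abstract contrasts against can be sketched as follows. This is only the textbook baseline, not the covariate-adjusted estimators developed in the paper; the function name and toy data are hypothetical.

```python
def sensitivity_specificity(marker, reference):
    """Unadjusted sensitivity and specificity from paired binary results."""
    pairs = list(zip(marker, reference))
    tp = sum(1 for m, r in pairs if m == 1 and r == 1)  # true positives
    fn = sum(1 for m, r in pairs if m == 0 and r == 1)  # false negatives
    tn = sum(1 for m, r in pairs if m == 0 and r == 0)  # true negatives
    fp = sum(1 for m, r in pairs if m == 1 and r == 0)  # false positives
    sensitivity = tp / (tp + fn)  # uses only reference-positive participants
    specificity = tn / (tn + fp)  # uses only reference-negative participants
    return sensitivity, specificity


# Hypothetical marker results and reference-standard results (1 = positive).
marker = [1, 1, 0, 1, 0, 0, 1, 0]
reference = [1, 1, 1, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(marker, reference)
```

Each estimate uses only the participants with the relevant reference-standard result and ignores baseline covariates, which is the inefficiency the covariate-adjusted estimators aim to address.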

Citations: 0

Robust transfer learning for individualized treatment rules in the presence of missing data.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxaf023
Zhiyu Sui, Ying Ding, Lu Tang

Individualized treatment rule (ITR) is a stepping stone to precision medicine. To ensure validity, ITRs are ideally derived from randomized trial data, but the use cases of ITRs extend beyond these trial populations. Transferring knowledge from experimental data to real-world data is of interest, while experimental data with selective inclusion criteria reflect a population distribution that may differ from the real-world target. In well-designed experiments, granular information crucial to decision making can be thoroughly collected. However, part of this may not be accessible in real-world scenarios. We propose a learning scheme for ITR that simultaneously addresses the issues of covariate shift and missing covariates with a quantile-based optimal treatment objective. Specifically, we compare the outcome uncertainty across treatment arms that is due to missing covariates and use it to guide treatment selection to reduce the likelihood of worse outcomes. The performance of this method is evaluated in simulations and a sepsis data application.
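
The intuition behind a quantile-based treatment objective can be shown with a toy comparison. This is a hedged sketch, not the authors' learning scheme: it assumes larger outcomes are better, uses a simple order-statistic quantile, and ignores covariates entirely. The function names and data are hypothetical.

```python
def lower_quantile(values, tau):
    """Empirical tau-quantile: the order statistic at index floor(tau * (n - 1))."""
    ordered = sorted(values)
    return ordered[int(tau * (len(ordered) - 1))]


def choose_arm(outcomes_by_arm, tau=0.25):
    """Pick the arm whose lower-tail outcome (tau-quantile) is largest."""
    return max(
        outcomes_by_arm,
        key=lambda arm: lower_quantile(outcomes_by_arm[arm], tau),
    )


# Hypothetical outcomes: arm B has the higher mean but far more uncertainty.
outcomes = {"A": [5, 5, 5, 5, 5], "B": [1, 2, 9, 10, 11]}

best = choose_arm(outcomes, tau=0.25)
```

A mean-based rule would pick arm B here, while the quantile-based rule picks arm A because B's lower tail is much worse. That is the sense in which comparing outcome uncertainty across arms can reduce the likelihood of bad outcomes.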

Citations: 0

A surrogate endpoint-based provisional approval causal roadmap, illustrated by vaccine development.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY. Pub Date: 2024-12-31. DOI: 10.1093/biostatistics/kxaf018
Peter B Gilbert, James Peng, Larry Han, Theis Lange, Yun Lu, Lei Nie, Mei-Chiung Shih, Salina P Waddy, Ken Wiley, Margot Yann, Zafar Zafari, Debashis Ghosh, Dean Follmann, Michal Juraska, Iván Díaz

For many rare diseases with no approved preventive interventions, promising interventions exist. However, it has proven difficult to conduct a pivotal phase 3 trial that could provide direct evidence demonstrating a beneficial effect of the intervention on the target disease outcome. When a promising putative surrogate endpoint(s) for the target outcome is available, surrogate-based provisional approval of an intervention may be pursued. Following the general Causal Roadmap rubric, we describe a surrogate endpoint-based provisional approval causal roadmap. Based on an observational study data set and a phase 3 randomized trial data set, this roadmap defines an approach to analyze the combined data set to draw a conservative inference about the treatment effect (TE) on the target outcome in the phase 3 study population. The observational study enrolls untreated individuals and collects baseline covariates, surrogate endpoints, and the target outcome, and is used to estimate the surrogate index, that is, the regression of the target outcome on the surrogate endpoints and baseline covariates. The phase 3 trial randomizes participants to treated vs. untreated and collects the same data but is much smaller and hence very underpowered to directly assess TE, such that inference on TE is based on the surrogate index. This inference is made conservative by specifying 2 bias functions: one that expresses an imperfection of the surrogate index as a surrogate endpoint in the phase 3 study, and the other that expresses imperfect transport of the surrogate index in the untreated from the observational to the phase 3 study. Plug-in and nonparametric efficient one-step estimators of TE, with inferential procedures, are developed. The finite-sample performance of the estimators is evaluated in simulation studies. The causal roadmap is motivated by and illustrated with contemporary Group B Streptococcus vaccine development.
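
The surrogate-index idea can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's estimators: it assumes a single surrogate, a linear regression fitted by least squares, no baseline covariates, and it omits both bias functions, so the result is a naive plug-in contrast rather than a conservative one. All names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Observational study (untreated): surrogate level and target outcome risk.
obs_surrogate = [0.0, 1.0, 2.0, 3.0]
obs_outcome = [0.8, 0.6, 0.4, 0.2]
intercept, slope = fit_line(obs_surrogate, obs_outcome)


def surrogate_index(s):
    """Predicted target outcome given the surrogate, learned from observational data."""
    return intercept + slope * s


# Phase 3 trial: only surrogates are used; outcomes are too rare to compare directly.
trial_treated = [2.0, 3.0, 2.5]
trial_untreated = [0.5, 1.0, 0.0]

mean_treated = sum(surrogate_index(s) for s in trial_treated) / len(trial_treated)
mean_untreated = sum(surrogate_index(s) for s in trial_untreated) / len(trial_untreated)
te_plug_in = mean_treated - mean_untreated
```

In the roadmap described above, this naive contrast would then be adjusted by the two bias functions before any inference is drawn.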

Citations: 0
Biomarker-assisted reporting in nutritional epidemiology: addressing measurement error in exposure-disease associations.
IF 2 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf014
Ying Huang, Ross L Prentice

In nutritional epidemiology, self-reported dietary data are commonly used to investigate diet-disease relationships. However, the resulting association estimates are often subject to biases due to random and systematic measurement errors. Regression calibration has emerged as a crucial method for addressing these biases by refining self-reported nutrient intake with objective biomarkers, which differ from the true values only by a random "noise" component. This paper presents methodological tools for analyzing nutritional epidemiology cohort studies involving time-to-event data when a biomarker subsample is available alongside dietary assessments. We introduce novel regression calibration methods to tackle two common challenges in this field. First, a widely used approach assumes that the log hazard ratio (HR) follows a linear function of dietary exposure. However, it is often necessary to assess whether this assumption holds, or whether a more flexible model is needed to capture potential deviations from linearity. Second, another prevalent analytical strategy involves estimating HRs based on categorized dietary exposure variables. New methods are critically needed to minimize bias in defining category boundaries and estimating hazard ratios within exposure categories, both of which can be distorted by measurement error. We apply these methods to reassess the relationship between sodium and potassium intake and cardiovascular disease risk using data from the Women's Health Initiative.
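As a rough illustration of the basic regression calibration step described above, the following sketch simulates a self-report with systematic and random error, a biomarker measured in a subsample, and a calibrated exposure built from the two. It is a simplification under stated assumptions, not the paper's method: the outcome model is linear rather than a Cox model for time-to-event data, and the function name `regression_calibration_demo` and all coefficients are hypothetical.

```python
import numpy as np

def regression_calibration_demo(n=20000, n_sub=2000, beta=0.5, seed=0):
    """Compare naive and calibrated slope estimates on simulated data."""
    rng = np.random.default_rng(seed)

    z = rng.normal(0.0, 1.0, n)                  # true exposure (unobserved)
    q = 0.3 + 0.6 * z + rng.normal(0.0, 1.0, n)  # self-report: systematic + random error
    y = beta * z + rng.normal(0.0, 1.0, n)       # outcome (linear stand-in for log hazard)

    # Biomarker subsample: truth plus independent random noise only.
    sub = rng.choice(n, n_sub, replace=False)
    w = z[sub] + rng.normal(0.0, 0.5, n_sub)

    # Calibration equation: regress the biomarker on the self-report.
    b1, b0 = np.polyfit(q[sub], w, 1)
    z_hat = b0 + b1 * q                          # calibrated exposure for the full cohort

    naive = np.polyfit(q, y, 1)[0]               # slope using raw self-report
    calibrated = np.polyfit(z_hat, y, 1)[0]      # slope using calibrated exposure
    return naive, calibrated

naive, calibrated = regression_calibration_demo()
```

Under this data-generating process the naive slope is attenuated toward zero, while the calibrated slope should sit close to the simulated `beta`. The flexible (nonlinear) and categorized-exposure extensions discussed in the abstract are not covered by this sketch.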

Citations: 0
Counterfactual fairness for small subgroups.
IF 2 Zone 3 (Mathematics) Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-31 DOI: 10.1093/biostatistics/kxaf046
Solvejg Wastvedt, Jared D Huling, Julian Wolfson

While methods for measuring and correcting differential performance in risk prediction models have proliferated in recent years, most existing techniques can only be used to assess fairness across relatively large subgroups. The purpose of algorithmic fairness efforts is often to redress discrimination against groups that are both marginalized and small, so this sample size limitation can prevent existing techniques from accomplishing their main aim. In clinical applications, this challenge combines with statistical issues that arise when models are used to guide treatment. We take a 3-step approach to addressing both of these challenges, building on the "counterfactual fairness" framework that accounts for confounding by treatment. First, we propose new estimands that leverage information across groups. Second, we estimate these quantities using a larger volume of data than existing techniques. Finally, we propose a novel data borrowing approach to incorporate "external data" that lacks outcomes and predictions but contains covariate and group membership information. We demonstrate application of our estimators to a risk prediction model used by a major Midwestern health system during the coronavirus disease 2019 (COVID-19) pandemic.

Citations: 0