
Latest articles in Biometrics

Absolute risk from double nested case-control designs: cause-specific proportional hazards models with and without augmented estimating equations.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae062
Minjung Lee, Mitchell H Gail

We estimate relative hazards and absolute risks (or cumulative incidence or crude risk) under cause-specific proportional hazards models for competing risks from double nested case-control (DNCC) data. In the DNCC design, controls are time-matched not only to cases from the cause of primary interest, but also to cases from competing risks (the phase-two sample). Complete covariate data are available in the phase-two sample, but other cohort members only have information on survival outcomes and some covariates. Design-weighted estimators use inverse sampling probabilities computed from Samuelsen-type calculations for DNCC. To take advantage of additional information available on all cohort members, we augment the estimating equations with a term that is unbiased for zero but improves the efficiency of estimates from the cause-specific proportional hazards model. We establish the asymptotic properties of the proposed estimators, including the estimator of absolute risk, and derive consistent variance estimators. We show that augmented design-weighted estimators are more efficient than design-weighted estimators. Through simulations, we show that the proposed asymptotic methods yield nominal operating characteristics in practical sample sizes. We illustrate the methods using prostate cancer mortality data from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial Study of the National Cancer Institute.
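The design weights above are reciprocals of Samuelsen-type inclusion probabilities. As a minimal pure-Python illustration (hypothetical function names; a single control-sampling stream, whereas the DNCC design additionally pools sampling across competing-risk cases):

```python
def control_inclusion_prob(t_i, case_times, risk_set_sizes, m):
    """Samuelsen-type probability that a subject at risk until t_i is
    ever sampled as a control; the design weight is its reciprocal.

    case_times     -- event times of the sampled cases
    risk_set_sizes -- number at risk at each case time
    m              -- controls drawn per case
    """
    p_never = 1.0
    for t_j, n_j in zip(case_times, risk_set_sizes):
        if t_j <= t_i and n_j > 1:  # subject was in this risk set
            p_never *= 1.0 - min(m / (n_j - 1), 1.0)
    return 1.0 - p_never


def design_weight(p_inclusion):
    # inverse-probability weight used in the weighted estimating equations
    return 1.0 / p_inclusion
```

For example, a control at risk through two case times with risk sets of size 5 and 4 and one control per case has inclusion probability 1 - (3/4)(2/3) = 1/2, hence design weight 2.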

Citations: 0
Factor-augmented transformation models for interval-censored failure time data.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae078
Hongxi Li, Shuwei Li, Liuquan Sun, Xinyuan Song

Interval-censored failure time data frequently arise in various scientific studies where each subject undergoes periodic examinations for the occurrence of the failure event of interest, and the failure time is only known to lie in a specific time interval. In addition, collected data may include multiple observed variables with a certain degree of correlation, leading to severe multicollinearity issues. This work proposes a factor-augmented transformation model to analyze interval-censored failure time data while reducing model dimensionality and avoiding the multicollinearity elicited by multiple correlated covariates. We provide a joint modeling framework comprising a factor analysis model, which groups multiple observed variables into a few latent factors, and a class of semiparametric transformation models with the augmented factors, which examines the effects of these factors and other covariates on the failure event. Furthermore, we propose a nonparametric maximum likelihood estimation approach and develop a computationally stable and reliable expectation-maximization algorithm for its implementation. We establish the asymptotic properties of the proposed estimators and conduct simulation studies to assess the empirical performance of the proposed method. An application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) study is provided. An R package, ICTransCFA, is also available for practitioners. Data used in preparation of this article were obtained from the ADNI database.
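The dimension-reduction step groups correlated covariates into a few latent factors. As a crude pure-Python stand-in for the paper's factor analysis model, the sketch below (hypothetical function name) extracts a single leading factor score by power iteration on the sample covariance:

```python
def first_factor_scores(X, iters=100):
    """One-factor sketch: scores along the leading principal direction
    of the centered data, via power iteration (pure Python, no NumPy)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    v = [1.0] * p
    for _ in range(iters):
        # w = (Xc^T Xc) v, then renormalize
        t = [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
        w = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
```

With two perfectly collinear columns, the two variables collapse into one factor, which is the behavior the model exploits to sidestep multicollinearity.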

Citations: 0
Propensity weighting plus adjustment in proportional hazards model is not doubly robust.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae069
Erin E Gabriel, Michael C Sachs, Ingeborg Waernbaum, Els Goetghebeur, Paul F Blanche, Stijn Vansteelandt, Arvid Sjölander, Thomas Scheike

Recently, it has become common for applied work to combine commonly used survival analysis modeling methods, such as the multivariable Cox model and propensity score weighting, with the intention of forming a doubly robust estimator of an exposure-effect hazard ratio that is unbiased in large samples when either the Cox model or the propensity score model is correctly specified. This combination does not, in general, produce a doubly robust estimator, even after regression standardization, when there is truly a causal effect. We demonstrate via simulation this lack of double robustness for the semiparametric Cox model, the Weibull proportional hazards model, and a simple proportional hazards flexible parametric model, with the latter two models fit via maximum likelihood. We provide a novel proof that the combination of propensity score weighting and a proportional hazards survival model, fit via either full or partial likelihood, is consistent under the null of no causal effect of the exposure on the outcome, under particular censoring mechanisms, if either the propensity score or the outcome model is correctly specified and contains all confounders. Given our results suggesting that double robustness only exists under the null, we outline two simple alternative estimators that are doubly robust for the survival difference at a given time point (in the above sense), provided the censoring mechanism can be correctly modeled, and one doubly robust method of estimation for the full survival curve. We provide R code to use these estimators for estimation and inference in the supporting information.
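Estimators that target the survival difference at a fixed time point can be doubly robust in the usual augmented-IPW sense. The sketch below is a generic AIPW estimator for a binary "survived past t" outcome, not the authors' exact proposal, and it ignores censoring, which their method must additionally model:

```python
def aipw_difference(A, Y, e, m1, m0):
    """Augmented IPW estimate of E[Y(1)] - E[Y(0)].

    A      -- treatment indicators (0/1)
    Y      -- outcomes, e.g. indicator of surviving past time t
    e      -- propensity scores P(A=1 | X)
    m1, m0 -- outcome-model predictions under A=1 and A=0
    """
    n = len(Y)
    psi1 = sum(a / p * (y - mu) + mu
               for a, y, p, mu in zip(A, Y, e, m1)) / n
    psi0 = sum((1 - a) / (1 - p) * (y - mu) + mu
               for a, y, p, mu in zip(A, Y, e, m0)) / n
    return psi1 - psi0
```

The augmentation terms have mean zero when either the propensity model or the outcome model is correct, which is the double robustness that, per the abstract, the naive "weighting plus Cox adjustment" combination lacks for the hazard ratio.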

Citations: 0
Improving prediction of linear regression models by integrating external information from heterogeneous populations: James-Stein estimators.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae072
Peisong Han, Haoyue Li, Sung Kyun Park, Bhramar Mukherjee, Jeremy M G Taylor

We consider the setting where (1) an internal study builds a linear regression model for prediction based on individual-level data, (2) some external studies have fitted similar linear regression models that use only subsets of the covariates and provide coefficient estimates for the reduced models without individual-level data, and (3) there is heterogeneity across these study populations. The goal is to integrate the external model summary information into fitting the internal model to improve prediction accuracy. We adapt the James-Stein shrinkage method to propose estimators that are no worse and are oftentimes better in the prediction mean squared error after information integration, regardless of the degree of study population heterogeneity. We conduct comprehensive simulation studies to investigate the numerical performance of the proposed estimators. We also apply the method to enhance a prediction model for patella bone lead level in terms of blood lead level and other covariates by integrating summary information from published literature.
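The core shrinkage idea can be sketched with the textbook positive-part James-Stein rule, here shrinking the internal coefficient vector toward the external one (an illustration under an assumed known, common coefficient variance; the paper's estimators handle population heterogeneity more carefully):

```python
def james_stein_combine(beta_int, beta_ext, sigma2):
    """Positive-part James-Stein shrinkage of the internal estimate
    toward the external target.

    sigma2 -- assumed-known variance of each internal coefficient
    """
    p = len(beta_int)
    if p <= 2:
        return list(beta_int)  # no shrinkage gain below 3 dimensions
    dist2 = sum((b - t) ** 2 for b, t in zip(beta_int, beta_ext))
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / dist2)
    return [t + shrink * (b - t) for b, t in zip(beta_int, beta_ext)]
```

When the two studies disagree strongly (large distance), the shrinkage factor approaches 1 and the internal estimate is nearly untouched, which is how such estimators avoid being hurt by heterogeneity.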

Citations: 0
A Gaussian-process approximation to a spatial SIR process using moment closures and emulators.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae068
Parker Trostle, Joseph Guinness, Brian J Reich

The dynamics that govern disease spread are hard to model because infections are functions of both the underlying pathogen and of human or animal behavior. This challenge grows when modeling how diseases spread between different spatial locations. Many proposed spatial epidemiological models require trade-offs to fit, either abstracting away theoretical spread dynamics, fitting a deterministic model, or requiring large computational resources for many simulations. We propose an approach that approximates the complex spatial spread dynamics with a Gaussian process. We first propose a flexible spatial extension to the well-known SIR stochastic process, and then we derive a moment-closure approximation to this stochastic process. This moment-closure approximation yields ordinary differential equations for the evolution of the means and covariances of the susceptibles and infectious through time. Because these ODEs are a bottleneck to fitting our model by MCMC, we approximate them using a low-rank emulator. This approximation serves as the basis for our hierarchical model for noisy, underreported counts of new infections by spatial location and time. We demonstrate the use of our model to conduct inference on simulated infections from the underlying, true spatial SIR jump process. We then apply our method to model counts of new Zika infections in Brazil from late 2015 through early 2016.
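In the nonspatial case, the mean component of such moment-closure ODE systems reduces to the classical deterministic SIR equations. A forward-Euler sketch (hypothetical function name; the paper's system also tracks covariances and spatial coupling):

```python
def sir_means(S0, I0, R0, beta, gamma, dt, steps):
    """Forward-Euler integration of the classical (nonspatial) SIR ODEs:
    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I.
    """
    N = S0 + I0 + R0
    S, I, R = float(S0), float(I0), float(R0)
    path = [(S, I, R)]
    for _ in range(steps):
        new_inf = beta * S * I / N * dt   # mean new infections this step
        new_rec = gamma * I * dt          # mean new recoveries this step
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        path.append((S, I, R))
    return path
```

Mass is conserved by construction (each term leaves one compartment and enters another), a useful sanity check on any discretization of the mean equations.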

Citations: 0
A generalized outcome-adaptive sequential multiple assignment randomized trial design.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae073
Xue Yang, Yu Cheng, Peter F Thall, Abdus S Wahed

A dynamic treatment regime (DTR) is a mathematical representation of a multistage decision process. When applied to sequential treatment selection in medical settings, DTRs are useful for identifying optimal therapies for chronic diseases such as AIDS, mental illnesses, substance abuse, and many cancers. Sequential multiple assignment randomized trials (SMARTs) provide a useful framework for constructing DTRs and for making unbiased between-DTR comparisons. A limitation of SMARTs is that they ignore data from past patients that may be useful for reducing the probability of exposing new patients to inferior treatments. In practice, this may result in decreased treatment adherence or dropouts. To address this problem, we propose a generalized outcome-adaptive (GO) SMART design that adaptively unbalances stage-specific randomization probabilities in favor of treatments observed to be more effective in previous patients. To correct for bias induced by outcome-adaptive randomization, we propose G-estimators and inverse-probability-weighted estimators of DTR effects embedded in a GO-SMART and show analytically that they are consistent. We report simulation results showing that, compared to a standard SMART, a response-adaptive SMART, and a SMART with adaptive randomization, a GO-SMART design treats significantly more patients with the optimal DTR and achieves a larger number of total responses while maintaining similar or better statistical power.

Citations: 0
Post-selection inference in regression models for group testing data.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae101
Qinyan Shen, Karl Gregory, Xianzheng Huang

We develop a methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the responses. Aiming at selecting important covariates while accounting for missing information in the response data, we apply the expectation-maximization algorithm to compute maximum likelihood estimators subject to LASSO penalization. Subsequent to variable selection, we make inferences on the selected covariate effects by extending post-selection inference methodology based on the polyhedral lemma. Empirical evidence from our extensive simulation study suggests that our post-selection inference results are more reliable than those from naive inference methods that use the same data to perform variable selection and inference without adjusting for variable selection.
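Under the polyhedral lemma, the test statistic, conditional on the selection event, follows a Gaussian truncated to an interval computable from the data. A stdlib-only sketch of the resulting p-value (hypothetical function names; the paper extends this machinery to the error-prone group-testing likelihood):

```python
import math


def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_normal_pvalue(x, mu, sigma, a, b):
    """One-sided p-value for observing x from N(mu, sigma^2)
    truncated to [a, b] -- the conditional law that the polyhedral
    lemma gives the test statistic after selection."""
    za, zb, zx = (a - mu) / sigma, (b - mu) / sigma, (x - mu) / sigma
    denom = phi(zb) - phi(za)
    return (phi(zb) - phi(zx)) / denom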

Citations: 0
Controlling false discovery rate for mediator selection in high-dimensional data.
IF 1.4 CAS Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae064
Ran Dai, Ruiyang Li, Seonjoo Lee, Ying Liu

The need to select mediators from a high-dimensional data source, such as neuroimaging data and genetic data, arises in many areas of scientific research. In this work, we formulate a multiple-hypothesis testing framework for mediator selection from a high-dimensional candidate set, and propose a method that extends recent developments in false discovery rate (FDR)-controlled variable selection with knockoffs to select mediators with FDR control. We show that the proposed method and algorithm achieve finite-sample FDR control. We present extensive simulation results to demonstrate the power and finite-sample performance compared with the existing method. Lastly, we demonstrate the method by analyzing the Adolescent Brain Cognitive Development (ABCD) study, in which the proposed method selects several resting-state functional magnetic resonance imaging connectivity markers as mediators for the relationship between adverse childhood events and the crystallized composite score in the NIH toolbox.
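The knockoff machinery the method builds on selects variables whose statistics exceed a data-driven threshold. A sketch of the standard knockoff+ threshold rule (the paper adapts this to mediator statistics):

```python
def knockoff_threshold(W, q):
    """Knockoff+ threshold: smallest t > 0 with
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q."""
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        neg = sum(1 for w in W if w <= -t)
        pos = sum(1 for w in W if w >= t)
        if (1 + neg) / max(1, pos) <= q:
            return t
    return float("inf")  # nothing selectable at this FDR level


def select(W, q):
    t = knockoff_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]
```

For W = [3, 2, 1, -1] and q = 0.5, the threshold is 2 (at t = 1 the ratio is 2/3 > 0.5; at t = 2 it is 1/2), so the first two variables are selected.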

引用次数: 0
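The knockoff machinery behind this kind of FDR-controlled mediator selection can be illustrated by its data-dependent thresholding step (a minimal sketch, not the authors' implementation: the statistics W_j would come from fitting the mediation model against knockoff copies, which is omitted here; large positive W_j is evidence that candidate j is an active mediator):

```python
# Knockoff+ threshold: the smallest t > 0 such that
#   (1 + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q,
# where q is the target false discovery rate.

def knockoff_threshold(W, q):
    """Return the knockoff+ threshold for statistics W at target FDR q."""
    candidates = sorted(abs(w) for w in W if w != 0)
    for t in candidates:
        neg = sum(1 for w in W if w <= -t)
        pos = sum(1 for w in W if w >= t)
        if (1 + neg) / max(1, pos) <= q:
            return t
    return float("inf")  # no threshold works: select nothing

def select_mediators(W, q):
    """Indices of candidates whose statistics clear the threshold."""
    t = knockoff_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]

print(select_mediators([5.0, 4.0, 3.0, -1.0, 2.0, -0.5], q=0.5))  # [0, 1, 2, 4]
```

The estimated FDR at threshold t is the ratio above: sign-flipped statistics of null candidates stand in for false discoveries.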
Causal meta-analysis by integrating multiple observational studies with multivariate outcomes.
IF 1.4 CAS Tier 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae070
Subharup Guha, Yi Li

Integrating multiple observational studies to make unconfounded causal or descriptive comparisons of group potential outcomes in a large natural population is challenging. Moreover, retrospective cohorts, being convenience samples, are usually unrepresentative of the natural population of interest and have groups with unbalanced covariates. We propose a general covariate-balancing framework based on pseudo-populations that extends established weighting methods to the meta-analysis of multiple retrospective cohorts with multiple groups. Additionally, by maximizing the effective sample sizes of the cohorts, we propose a FLEXible, Optimized, and Realistic (FLEXOR) weighting method appropriate for integrative analyses. We develop new weighted estimators for unconfounded inferences on wide-ranging population-level features and estimands relevant to group comparisons of quantitative, categorical, or multivariate outcomes. Asymptotic properties of these estimators are examined. Through simulation studies and meta-analyses of TCGA datasets, we demonstrate the versatility and reliability of the proposed weighting strategy, especially for the FLEXOR pseudo-population.
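The effective sample size that the FLEXOR criterion maximizes is, in the weighting literature, commonly the Kish formula; whether the paper uses exactly this definition is an assumption here. A minimal sketch (the balancing weights themselves would come from the pseudo-population construction, which is not shown):

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2).
    Equals n for uniform weights and shrinks as weights grow uneven,
    quantifying the information lost to reweighting."""
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))  # 4.0: uniform weights lose nothing
print(effective_sample_size([4.0, 1.0, 1.0, 1.0]))  # 49/19 ~= 2.58: concentration costs information
```

Maximizing this quantity over admissible pseudo-populations favors weight sets that balance covariates without concentrating mass on a few subjects.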

Optimal refinement of strata to balance covariates.
IF 1.4 CAS Tier 4 (Mathematics) Q3 BIOLOGY Pub Date: 2024-07-01 DOI: 10.1093/biomtc/ujae061
Katherine Brumberg, Dylan S Small, Paul R Rosenbaum

What is the best way to split one stratum into two to maximally reduce the within-stratum imbalance in many covariates? We formulate this as an integer program and approximate the solution by randomized rounding of a linear program. A linear program may assign a fraction of a person to each refined stratum. Randomized rounding views fractional people as probabilities, assigning intact people to strata using biased coins. Randomized rounding is a well-studied theoretical technique for approximating the optimal solution of certain insoluble integer programs. When the number of people in a stratum is large relative to the number of covariates, we prove the following new results: (i) randomized rounding to split a stratum does very little randomizing, so it closely resembles the linear programming relaxation without splitting intact people; (ii) the linear relaxation and the randomly rounded solution place lower and upper bounds on the unattainable integer programming solution; and because of (i), these bounds are often close, thereby ratifying the usable randomly rounded solution. We illustrate using an observational study that balanced many covariates by forming matched pairs composed of 2016 patients selected from 5735 using a propensity score. Instead, we form 5 propensity score strata and refine them into 10 strata, obtaining excellent covariate balance while retaining all patients. An R package optrefine at CRAN implements the method. Supplementary materials are available online.
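The biased-coin step described in the abstract can be sketched directly: the linear program gives each person a fraction f_i of membership in the first refined stratum, and randomized rounding assigns each person intact with that probability (a toy sketch under that reading; the LP itself and the covariate-balance objective are omitted):

```python
import random

def randomized_round(fractions, seed=0):
    """Round LP fractions to intact assignments: person i goes to refined
    stratum 1 with probability fractions[i], else to stratum 2."""
    rng = random.Random(seed)
    return [1 if rng.random() < f else 2 for f in fractions]

# Degenerate fractions round deterministically; interior fractions behave
# like biased coins, so assignment rates track the LP solution on average.
print(randomized_round([1.0, 0.0, 1.0]))  # [1, 2, 1]
```

Because each fraction is honored in expectation, the rounded split inherits the LP relaxation's covariate balance up to the small perturbation introduced by the coin flips.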
