Biometrics: Latest Articles

Vine copula mixed models for meta-analysis of diagnostic accuracy studies without a gold standard.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-04-02 DOI: 10.1093/biomtc/ujaf037

Aristidis K Nikoloulopoulos

Numerous statistical models have been proposed for conducting meta-analysis of diagnostic accuracy studies when a gold standard is available. However, in real-world scenarios, the gold standard test may not be perfect due to several factors such as measurement error, non-availability, invasiveness, or high cost. A generalized linear mixed model (GLMM) is currently recommended to account for an imperfect reference test. We propose vine copula mixed models for meta-analysis of diagnostic test accuracy studies with an imperfect reference standard. Our general models include the GLMM as a special case, can have arbitrary univariate distributions for the random effects, and can provide tail dependencies and asymmetries. Our general methodology is demonstrated with an extensive simulation study and illustrated by insightfully re-analyzing the data of a meta-analysis of the Papanicolaou test that diagnoses cervical neoplasia. Our study suggests that there can be an improvement on GLMM and makes the argument for moving to vine copula random effects models.

Citations: 0

Statistical inference on the relative risk following covariate-adaptive randomization.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-04-02 DOI: 10.1093/biomtc/ujaf036

Fengyu Zhao, Yang Liu, Feifang Hu

Covariate-adaptive randomization (CAR) is widely adopted in clinical trials to ensure balanced treatment allocations across key baseline covariates. Although much research has focused on analyzing average treatment effects, the inference of relative risk under CAR experiments has been less thoroughly explored. In this study, we examine a covariate-adjusted estimate of relative risk and investigate the properties of its associated hypothesis tests under CAR. We first derive the theoretical properties of the covariate-adjusted relative risk for a broad class of CAR procedures. Our findings indicate that conventional tests for relative risk tend to be conservative, leading to reduced type I error rates. To mitigate this issue, we introduce model-based and model-robust methods that enhance the estimation of standard errors. We demonstrate the validity and usage of model-robust and model-based adjusted tests. Extensive numerical studies have been conducted to demonstrate our theoretical findings and the favorable properties of the proposed adjustment methods.
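
For orientation, the conventional test that the abstract describes as conservative under CAR can be sketched as below. This is a minimal illustration of the standard unadjusted Wald test for the log relative risk under simple randomization; it is not the authors' covariate-adjusted estimator or their corrected standard errors, and the function name and arguments are hypothetical.

```python
import math


def log_relative_risk_test(events_trt, n_trt, events_ctl, n_ctl):
    """Unadjusted Wald test of H0: relative risk = 1 on the log scale.

    Assumes two independent binomial arms (simple randomization), which is
    the assumption the abstract says breaks down under CAR.
    """
    p1 = events_trt / n_trt
    p0 = events_ctl / n_ctl
    log_rr = math.log(p1 / p0)
    # Delta-method standard error of log(RR): sqrt((1-p1)/a + (1-p0)/c),
    # where a and c are the event counts in each arm.
    se = math.sqrt((1 - p1) / events_trt + (1 - p0) / events_ctl)
    z = log_rr / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(log_rr), se, z, p_value


rr, se, z, p_value = log_relative_risk_test(30, 100, 20, 100)
print(f"RR={rr:.3f}, SE(log RR)={se:.4f}, z={z:.3f}, p={p_value:.3f}")
```

If the standard error above overstates the true variability under CAR, the resulting test rejects less often than its nominal level, which is the conservativeness the abstract refers to.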

Citations: 0

Estimating weighted quantile treatment effects with missing outcome data by double sampling.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-04-02 DOI: 10.1093/biomtc/ujaf038

Shuo Sun, Sebastien Haneuse, Alexander W Levis, Catherine Lee, David E Arterburn, Heidi Fischer, Susan Shortreed, Rajarshi Mukherjee

Causal weighted quantile treatment effects (WQTEs) complement standard mean-focused causal contrasts when interest lies at the tails of the counterfactual distribution. However, existing methods for estimating and inferring causal WQTEs assume complete data on all relevant factors, which is often not the case in practice, particularly when the data are not collected for research purposes, such as electronic health records (EHRs) and disease registries. Furthermore, these data may be particularly susceptible to the outcome data being missing-not-at-random (MNAR). This paper proposes to use double sampling, through which the otherwise missing data are ascertained on a sub-sample of study units, as a strategy to mitigate bias due to MNAR data in estimating causal WQTEs. With the additional data, we present identifying conditions that do not require missingness assumptions in the original data. We then propose a novel inverse-probability weighted estimator and derive its asymptotic properties, both pointwise at specific quantiles and uniformly across quantiles over some compact subset of (0,1), allowing the propensity score and double-sampling probabilities to be estimated. For practical inference, we develop a bootstrap method that can be used for both pointwise and uniform inference. A simulation study is conducted to examine the finite sample performance of the proposed estimators. We illustrate the proposed method using EHR data examining the relative effects of 2 bariatric surgery procedures on BMI loss 3 years post-surgery.
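
To make the idea of an inverse-probability weighted quantile contrast concrete, here is a minimal sketch under strong simplifying assumptions: outcomes are fully observed and the propensity scores are known. It does not model the missingness or the double-sampling step, so it is not the estimator proposed in the paper; the function names are hypothetical.

```python
def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]


def ipw_quantile_effect(y, treated, propensity, tau):
    """Difference of IPW quantiles between treated and control units.

    Treated units are weighted by 1/e(x) and controls by 1/(1 - e(x)),
    where e(x) is the (assumed known) propensity score.
    """
    y1 = [yi for yi, a in zip(y, treated) if a == 1]
    w1 = [1.0 / e for e, a in zip(propensity, treated) if a == 1]
    y0 = [yi for yi, a in zip(y, treated) if a == 0]
    w0 = [1.0 / (1.0 - e) for e, a in zip(propensity, treated) if a == 0]
    return weighted_quantile(y1, w1, tau) - weighted_quantile(y0, w0, tau)


effect = ipw_quantile_effect(
    y=[5, 7, 1, 3], treated=[1, 1, 0, 0], propensity=[0.5] * 4, tau=0.5
)
print(effect)
```

Per the abstract, the paper's estimator also involves double-sampling probabilities, which this sketch leaves out.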

Volume 81, Issue 2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11973573/pdf/

Citations: 0

Power-enhanced two-sample mean tests for high-dimensional microbiome compositional data.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-04-02 DOI: 10.1093/biomtc/ujaf034

Danning Li, Lingzhou Xue, Haoyi Yang, Xiufan Yu

Testing differences in mean vectors is a fundamental task in the analysis of high-dimensional microbiome compositional data. Existing methods may suffer from low power if the underlying signal pattern is in a situation that does not favor the deployed test. In this work, we develop 2-sample power-enhanced mean tests for high-dimensional compositional data based on the combination of $P$-values, which integrates strengths from 2 popular types of tests: the maximum-type test and the quadratic-type test. We provide rigorous theoretical guarantees on the proposed tests, showing accurate Type-I error rate control and enhanced testing power. Our method boosts the testing power toward a broader alternative space, which yields robust performance across a wide range of signal pattern settings. Our methodology and theory also contribute to the literature on power enhancement and Gaussian approximation for high-dimensional hypothesis testing. We demonstrate the performance of our method on both simulated data and real-world microbiome data, showing that our proposed approach improves the testing power substantially compared to existing methods.
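
The abstract does not say which rule is used to combine the two P-values. As one common possibility, the sketch below applies Fisher's method to a maximum-type and a quadratic-type P-value. It assumes the two P-values are independent under the null hypothesis, and the function and argument names are hypothetical.

```python
import math


def fisher_combine_two(p_max, p_quad):
    """Fisher combination of two independent P-values.

    The statistic T = -2 * (ln p_max + ln p_quad) follows a chi-square
    distribution with 4 degrees of freedom under the null, whose survival
    function is exp(-t/2) * (1 + t/2). Substituting T gives the closed
    form q * (1 - ln q) with q = p_max * p_quad.
    """
    for p in (p_max, p_quad):
        if not 0 < p <= 1:
            raise ValueError("P-values must lie in (0, 1]")
    product = p_max * p_quad
    return product * (1.0 - math.log(product))


# A strong max-type signal still yields a small combined P-value
# even when the quadratic-type test sees nothing.
print(fisher_combine_two(0.001, 0.60))
print(fisher_combine_two(0.05, 0.05))
```

A combination of this kind stays sensitive when only one of the two component tests detects the signal, which is the sense in which such tests remain powerful across both sparse and dense alternatives.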

Volume 81, Issue 2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11962435/pdf/

Citations: 0

Change surface regression for nonlinear subgroup identification with application to warfarin pharmacogenomics data.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujae169

Pan Liu, Yaguang Li, Jialiang Li

Pharmacogenomics stands as a pivotal driver toward personalized medicine, aiming to optimize drug efficacy while minimizing adverse effects by uncovering the impact of genetic variations on inter-individual outcome variability. Despite its promise, the intricate landscape of drug metabolism introduces complexity, where the correlation between drug response and genes can be shaped by numerous nongenetic factors, often exhibiting heterogeneity across diverse subpopulations. This challenge is particularly pronounced in datasets such as the International Warfarin Pharmacogenetic Consortium (IWPC), which encompasses diverse patient information from multiple nations. To capture the between-patient heterogeneity in dosing requirement, we formulate a novel change surface model as a model-based approach for multiple subgroup identification in complex datasets. A key feature of our approach is its ability to accommodate nonlinear subgroup divisions, providing a clearer understanding of dynamic drug-gene associations. Furthermore, our model effectively handles high-dimensional data through a doubly penalized approach, ensuring both interpretability and adaptability. We propose an iterative 2-stage method that combines a change point detection technique in the first stage with a smoothed local adaptive majorize-minimization algorithm for surface regression in the second stage. Performance of the proposed methods is evaluated through extensive numerical studies. Application of our method to the IWPC dataset leads to significant new findings, where 3 subgroups subject to different pharmacogenomic relationships are identified, contributing valuable insights into the complex dynamics of drug-gene associations in patients.

Volume 81, Issue 1.

Citations: 0

Causal inference with cross-temporal design.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujae163

Yi Cao, Pedro L Gozalo, Roee Gutman

When many participants in a randomized trial do not comply with their assigned intervention, the randomized encouragement design is a possible solution. In this design, the causal effects of the intervention can be estimated among participants who would have experienced the intervention if encouraged. For many policy interventions, encouragements cannot be randomized and investigators need to rely on observational data. To address this, we propose a cross-temporal design, which uses time to mimic a randomized encouragement experiment. However, time may be confounded with temporal trends that influence the outcomes. To disentangle these trends from the intervention effects, we replace the commonly used exclusion restrictions with temporal assumptions. We develop Bayesian procedures to estimate the causal effects and compare them to instrumental variables and matching approaches in simulations. The Bayesian approach outperforms the other 2 approaches in terms of estimation accuracy, and it is relatively robust to various violations of the common trends assumption. Taking advantage of the expansion of the Medicare Advantage (MA) program between 2011 and 2017, we implement the proposed method to estimate the effects of MA enrollment on the risk of skilled nursing facility residents being re-hospitalized within 30 days after discharge from the hospital.
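
The instrumental variables comparator mentioned in the abstract can be illustrated with the classical Wald estimator. The sketch below is a generic textbook version, not the Bayesian procedure the authors propose. Treating a binary time-period indicator as the encouragement `z` is an assumption made here for illustration, and the estimator relies on the exclusion restriction that the paper replaces with temporal assumptions.

```python
def wald_iv_estimate(z, d, y):
    """Wald instrumental-variable estimate of the effect of d on y.

    z: binary encouragement (here, hypothetically, 1 = later period)
    d: binary indicator of actually receiving the intervention
    y: outcome
    Returns (E[y|z=1] - E[y|z=0]) / (E[d|z=1] - E[d|z=0]).
    """
    def mean_where(values, flag):
        selected = [v for v, zi in zip(values, z) if zi == flag]
        if not selected:
            raise ValueError("both encouragement groups must be non-empty")
        return sum(selected) / len(selected)

    uptake_difference = mean_where(d, 1) - mean_where(d, 0)
    if uptake_difference == 0:
        raise ValueError("encouragement does not change uptake")
    return (mean_where(y, 1) - mean_where(y, 0)) / uptake_difference


z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [3, 3, 3, 1, 3, 1, 1, 1]
print(wald_iv_estimate(z, d, y))
```

When the outcome also drifts over time for reasons unrelated to the intervention, the numerator absorbs that trend, which is the confounding by temporal trends that the abstract highlights.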

Volume 81, Issue 1. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11725568/pdf/

Citations: 0

Penalized G-estimation for effect modifier selection in a structural nested mean model for repeated outcomes.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujae165

Ajmery Jaman, Guanbo Wang, Ashkan Ertefaie, Michèle Bally, Renée Lévesque, Robert W Platt, Mireille E Schnitzer

Effect modification occurs when the impact of the treatment on an outcome varies based on the levels of other covariates known as effect modifiers. Modeling these effect differences is important for etiological goals and for purposes of optimizing treatment. Structural nested mean models (SNMMs) are useful causal models for estimating the potentially heterogeneous effect of a time-varying exposure on the mean of an outcome in the presence of time-varying confounding. A data-adaptive selection approach is necessary if the effect modifiers are unknown a priori and need to be identified. Although variable selection techniques are available for estimating the conditional average treatment effects using marginal structural models or for developing optimal dynamic treatment regimens, all of these methods consider a single end-of-follow-up outcome. In the context of an SNMM for repeated outcomes, we propose a doubly robust penalized G-estimator for the causal effect of a time-varying exposure with a simultaneous selection of effect modifiers and prove the oracle property of our estimator. We conduct a simulation study for the evaluation of its performance in finite samples and verification of its double-robustness property. Our work is motivated by the study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l'Université de Montréal. We apply the proposed method to investigate the effect heterogeneity of dialysis facility on the repeated session-specific hemodiafiltration outcomes.

Volume 81, Issue 1.

Citations: 0

Weighted Q-learning for optimal dynamic treatment regimes with nonignorable missing covariates.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujae161

Jian Sun, Bo Fu, Li Su

Dynamic treatment regimes (DTRs) formalize medical decision-making as a sequence of rules for different stages, mapping patient-level information to recommended treatments. In practice, estimating an optimal DTR using observational data from electronic medical record (EMR) databases can be complicated by nonignorable missing covariates resulting from informative monitoring of patients. Since complete case analysis can provide consistent estimation of outcome model parameters under the assumption of outcome-independent missingness, Q-learning is a natural approach to accommodating nonignorable missing covariates. However, the backward induction algorithm used in Q-learning can introduce challenges, as nonignorable missing covariates at later stages can result in nonignorable missing pseudo-outcomes at earlier stages, leading to suboptimal DTRs, even if the longitudinal outcome variables are fully observed. To address this unique missing data problem in DTR settings, we propose 2 weighted Q-learning approaches where inverse probability weights for missingness of the pseudo-outcomes are obtained through estimating equations with valid nonresponse instrumental variables or sensitivity analysis. The asymptotic properties of the weighted Q-learning estimators are derived, and the finite-sample performance of the proposed methods is evaluated and compared with alternative methods through extensive simulation studies. Using EMR data from the Medical Information Mart for Intensive Care database, we apply the proposed methods to investigate the optimal fluid strategy for sepsis patients in intensive care units.

Volume 81, Issue 1.

Citations: 0

High-dimensional partially linear functional Cox models.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujae164

Xin Chen, Hua Liu, Jiaqi Men, Jinhong You

As a commonly employed method for analyzing time-to-event data involving functional predictors, the functional Cox model assumes a linear relationship between the functional principal component (FPC) scores of the functional predictors and the hazard rates. However, in practical scenarios, such as our study on the survival time of kidney transplant recipients, this assumption often fails to hold. To address this limitation, we introduce a class of high-dimensional partially linear functional Cox models, which accommodates the non-linear effects of functional predictors on the response and allows for diverging numbers of scalar predictors and FPCs as the sample size increases. We employ the group smoothly clipped absolute deviation method to select relevant scalar predictors and FPCs, and use B-splines to obtain a smoothed estimate of the non-linear effect. The finite sample performance of the estimates is evaluated through simulation studies. The model is also applied to a kidney transplant dataset, allowing us to make inferences about the non-linear effects of functional predictors on patients' hazard rates, as well as to identify significant scalar predictors for long-term survival time.
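
For readers unfamiliar with the smoothly clipped absolute deviation (SCAD) penalty named in the abstract, the sketch below evaluates the standard SCAD penalty and a simple group version that applies it to the Euclidean norm of each coefficient group. The default a = 3.7 is the value commonly used in the SCAD literature. The grouping, the absence of group-size scaling, and the function names are assumptions for illustration, not details taken from the paper.

```python
import math


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty at coefficient size |t| with tuning parameters lam, a.

    Linear (lasso-like) near zero, quadratic in the middle, and constant
    beyond a * lam, so large coefficients are not shrunk further.
    """
    if lam < 0 or a <= 2:
        raise ValueError("requires lam >= 0 and a > 2")
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def group_scad_penalty(groups, lam, a=3.7):
    """Sum of SCAD penalties applied to each group's Euclidean norm."""
    return sum(
        scad_penalty(math.sqrt(sum(b * b for b in group)), lam, a)
        for group in groups
    )


# Two groups of spline coefficients: one clearly active, one all zero.
print(group_scad_penalty([[3.0, 4.0], [0.0, 0.0]], lam=1.0))
```

Penalizing the group norm sets all coefficients in a group to zero together, which is how a whole predictor (for example, all B-spline coefficients of one FPC score) can be selected or dropped as a unit.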

Volume 81, Issue 1.

Citations: 0

Jointly modeling means and variances for nonlinear mixed effects models with measurement errors and outliers.

IF 1.4, Zone 4 (Mathematics), Q3 Biology

Biometrics

Pub Date : 2025-01-07 DOI: 10.1093/biomtc/ujaf018

Qian Ye, Lang Wu, Viviane Dias Lima

In longitudinal data analyses, the within-individual repeated measurements often exhibit large variations and these variations appear to change over time. Understanding the nature of the within-individual systematic and random variations allows us to conduct more efficient statistical inferences. Motivated by human immunodeficiency virus (HIV) viral dynamic studies, we considered a nonlinear mixed effects model for modeling the longitudinal means, together with a model for the within-individual variances which also allows us to address outliers in the repeated measurements. Statistical inference was then based on a joint model for the mean and variance, implemented by a computationally efficient approximate method. Extensive simulations evaluated the proposed method. We found that the proposed method produces more efficient estimates than the corresponding method without modeling the variances. Moreover, the proposed method provides robust inference against outliers. The proposed method was applied to a recent HIV-related dataset, with interesting new findings.

Citations: 0