Genetic Prediction Modeling in Large Cohort Studies via Boosting Targeted Loss Functions
Pub Date: 2024-12-10 | Epub Date: 2024-10-23 | DOI: 10.1002/sim.10249
Hannah Klinkhammer, Christian Staerk, Carlo Maj, Peter M Krawitz, Andreas Mayr
Polygenic risk scores (PRS) aim to predict a trait from genetic information, relying on common genetic variants with low to medium effect sizes. As genotype data are high-dimensional in nature, it is crucial to develop methods that can be applied to large-scale data (large n and large p). Many PRS tools aggregate univariate summary statistics from genome-wide association studies into a single score. Recent advancements allow simultaneous modeling of variant effects from individual-level genotype data. In this context, we introduced snpboost, an algorithm that applies statistical boosting on individual-level genotype data to estimate PRS via multivariable regression models. By processing variants iteratively in batches, snpboost can deal with large-scale cohort data. Having solved the technical obstacles due to data dimensionality, the methodological scope can now be broadened, focusing on key objectives for the clinical application of PRS. Similar to most methods in this context, snpboost has, so far, been restricted to quantitative and binary traits. Now, we incorporate more advanced alternatives, targeted to the particular aim and outcome. Adapting the loss function extends the snpboost framework to further data situations such as time-to-event and count data. Furthermore, alternative loss functions for continuous outcomes allow us to focus not only on the mean of the conditional distribution but also on other aspects that may be more helpful in the risk stratification of individual patients and can quantify prediction uncertainty, for example, via median or quantile regression. This work enhances PRS fitting across multiple model classes previously unfeasible for this data type.
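As a rough illustration of the loss-function idea, the sketch below implements plain component-wise gradient boosting with a pluggable negative gradient, here the pinball loss for median/quantile regression. It is not the snpboost implementation (batch-wise variant processing, stopping criteria, and genotype file handling are all omitted), and the function names are invented for this example.

```python
import numpy as np

def pinball_neg_grad(y, f, tau=0.5):
    # Negative gradient of the quantile (pinball) loss at level tau
    return np.where(y > f, tau, tau - 1.0)

def cw_boost(X, y, neg_grad, offset, n_iter=200, nu=0.1):
    # Component-wise boosting: refit the negative gradient by univariate
    # least squares for every variant, update only the best one per step.
    n, p = X.shape
    Xc = X - X.mean(axis=0)                      # centred genotype matrix
    col_ss = (Xc ** 2).sum(axis=0) + 1e-12       # guard against monomorphic SNPs
    beta = np.zeros(p)
    f = np.full(n, float(offset))
    for _ in range(n_iter):
        u = neg_grad(y, f)                       # working residuals for this loss
        coefs = Xc.T @ u / col_ss                # univariate LS coefficient per variant
        j = int(np.argmax(coefs ** 2 * col_ss))  # variant explaining most of u
        beta[j] += nu * coefs[j]                 # weak (shrunken) update
        f += nu * coefs[j] * Xc[:, j]
    return beta

# Swapping neg_grad (e.g., a Poisson or Cox-type gradient) changes the target;
# e.g., a median-regression PRS: cw_boost(X, y, pinball_neg_grad, np.median(y))
```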
{"title":"Genetic Prediction Modeling in Large Cohort Studies via Boosting Targeted Loss Functions.","authors":"Hannah Klinkhammer, Christian Staerk, Carlo Maj, Peter M Krawitz, Andreas Mayr","doi":"10.1002/sim.10249","DOIUrl":"10.1002/sim.10249","url":null,"abstract":"<p><p>Polygenic risk scores (PRS) aim to predict a trait from genetic information, relying on common genetic variants with low to medium effect sizes. As genotype data are high-dimensional in nature, it is crucial to develop methods that can be applied to large-scale data (large <math> <semantics><mrow><mi>n</mi></mrow> <annotation>$$ n $$</annotation></semantics> </math> and large <math> <semantics><mrow><mi>p</mi></mrow> <annotation>$$ p $$</annotation></semantics> </math> ). Many PRS tools aggregate univariate summary statistics from genome-wide association studies into a single score. Recent advancements allow simultaneous modeling of variant effects from individual-level genotype data. In this context, we introduced snpboost, an algorithm that applies statistical boosting on individual-level genotype data to estimate PRS via multivariable regression models. By processing variants iteratively in batches, snpboost can deal with large-scale cohort data. Having solved the technical obstacles due to data dimensionality, the methodological scope can now be broadened-focusing on key objectives for the clinical application of PRS. Similar to most methods in this context, snpboost has, so far, been restricted to quantitative and binary traits. Now, we incorporate more advanced alternatives-targeted to the particular aim and outcome. Adapting the loss function extends the snpboost framework to further data situations such as time-to-event and count data. Furthermore, alternative loss functions for continuous outcomes allow us to focus not only on the mean of the conditional distribution but also on other aspects that may be more helpful in the risk stratification of individual patients and can quantify prediction uncertainty, for example, median or quantile regression. This work enhances PRS fitting across multiple model classes previously unfeasible for this data type.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5412-5430"},"PeriodicalIF":1.8,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586906/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142508279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Causal Inference for Continuous Multiple Time Point Interventions
Pub Date: 2024-12-10 | Epub Date: 2024-10-17 | DOI: 10.1002/sim.10246
Michael Schomaker, Helen McIlleron, Paolo Denti, Iván Díaz
There are limited options to estimate the treatment effects of variables which are continuous and measured at multiple time points, particularly if the true dose-response curve should be estimated as closely as possible. However, these situations may be of relevance: in pharmacology, one may be interested in how outcomes of people living with, and treated for, HIV, such as viral failure, would vary under time-varying interventions such as different drug concentration trajectories. A challenge for doing causal inference with continuous interventions is that the positivity assumption is typically violated. To address positivity violations, we develop projection functions, which reweigh and redefine the estimand of interest based on functions of the conditional support for the respective interventions. With these functions, we obtain the desired dose-response curve in areas of sufficient support, and otherwise a meaningful estimand that does not require the positivity assumption. We develop g-computation type plug-in estimators for this case. These are contrasted with g-computation estimators applied to continuous interventions without specifically addressing positivity violations, which we propose should be accompanied by diagnostics. The ideas are illustrated with longitudinal data from HIV positive children treated with an efavirenz-based regimen as part of the CHAPAS-3 trial, which enrolled children < 13 years of age in Zambia/Uganda. Simulations show in which situations a standard g-computation approach is appropriate, in which it leads to bias, and how the proposed weighted estimation approach then recovers the alternative estimand of interest.
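The following toy sketch conveys the flavour of the weighting idea in a single-time-point setting only: a g-computation curve is averaged with a crude kernel weight that down-weights grid values far from the exposures actually observed. The paper's projection functions, longitudinal structure, and estimators are far more general; all names and the weight form here are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 2000
L = rng.normal(size=n)                      # baseline confounder
A = 0.8 * L + rng.normal(size=n)            # continuous exposure
Y = 1.0 + 0.5 * A + 0.7 * L + rng.normal(size=n)

out_model = LinearRegression().fit(np.c_[A, L], Y)   # outcome model E[Y | A, L]

grid = np.linspace(-2.5, 2.5, 21)
naive, weighted = [], []
for a in grid:
    pred = out_model.predict(np.c_[np.full(n, a), L])
    naive.append(pred.mean())               # standard g-computation plug-in
    # crude support weight: subjects whose observed exposure is close to a
    # count more, mimicking a restriction to regions of sufficient support
    w = np.exp(-0.5 * ((a - A) / A.std()) ** 2)
    weighted.append(np.average(pred, weights=w))
```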
{"title":"Causal Inference for Continuous Multiple Time Point Interventions.","authors":"Michael Schomaker, Helen McIlleron, Paolo Denti, Iván Díaz","doi":"10.1002/sim.10246","DOIUrl":"10.1002/sim.10246","url":null,"abstract":"<p><p>There are limited options to estimate the treatment effects of variables which are continuous and measured at multiple time points, particularly if the true dose-response curve should be estimated as closely as possible. However, these situations may be of relevance: in pharmacology, one may be interested in how outcomes of people living with-and treated for-HIV, such as viral failure, would vary for time-varying interventions such as different drug concentration trajectories. A challenge for doing causal inference with continuous interventions is that the positivity assumption is typically violated. To address positivity violations, we develop projection functions, which reweigh and redefine the estimand of interest based on functions of the conditional support for the respective interventions. With these functions, we obtain the desired dose-response curve in areas of enough support, and otherwise a meaningful estimand that does not require the positivity assumption. We develop <math> <semantics><mrow><mi>g</mi></mrow> <annotation>$$ g $$</annotation></semantics> </math> -computation type plug-in estimators for this case. Those are contrasted with g-computation estimators which are applied to continuous interventions without specifically addressing positivity violations, which we propose to be presented with diagnostics. The ideas are illustrated with longitudinal data from HIV positive children treated with an efavirenz-based regimen as part of the CHAPAS-3 trial, which enrolled children <math> <semantics><mrow><mo><</mo> <mn>13</mn></mrow> <annotation>$$ <13 $$</annotation></semantics> </math> years in Zambia/Uganda. Simulations show in which situations a standard g-computation approach is appropriate, and in which it leads to bias and how the proposed weighted estimation approach then recovers the alternative estimand of interest.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5380-5400"},"PeriodicalIF":1.8,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586917/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142475167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Variability in Causal Effects and Noncompliance in a Multisite Trial: A Bivariate Hierarchical Generalized Random Coefficients Model for a Binary Outcome
Pub Date: 2024-12-10 | Epub Date: 2024-10-15 | DOI: 10.1002/sim.10229
Xinxin Sun, Yongyun Shin, Jennifer Elston Lafata, Stephen W Raudenbush
Within each of 170 physicians, patients were randomized to access e-assist, an online program that aimed to increase colorectal cancer screening (CRCS), or to control. Compliance was partial: 78.34% of the experimental patients accessed e-assist, while no controls were provided the access. Of interest are the average causal effect of assignment to treatment and the complier average causal effect, as well as the variation of these causal effects across physicians. Each physician generates probabilities of screening for experimental compliers (experimental patients who accessed e-assist), control compliers (controls who would have accessed e-assist had they been assigned to e-assist), and never takers (patients who would have avoided e-assist no matter what). Estimating physician-specific probabilities jointly over physicians poses novel challenges. We address these challenges by maximum likelihood, factoring a "complete-data likelihood" uniquely into the conditional distribution of screening and partially observed compliance given random effects and the distribution of random effects. We marginalize this likelihood using adaptive Gauss-Hermite quadrature. The approach is doubly iterative in that the conditional distribution defies analytic evaluation. Because the small sample size per physician constrains estimability of multiple random effects, we reduce their dimensionality using a shared random effects model having a factor analytic structure. Using simulation, we assess the estimators and recommend sample sizes that produce reasonably accurate and precise estimates, and we analyze data from a trial of a CRCS intervention.
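The building block being marginalized is easy to show in isolation. Below is a minimal, non-adaptive Gauss-Hermite integration of a random-intercept logistic likelihood for one cluster (physician); the paper's adaptive, bivariate, factor-analytic version recentres and rescales the nodes per cluster, which is omitted here. Function and argument names are illustrative.

```python
import numpy as np
from scipy.special import expit

def cluster_loglik(y, X, beta, sigma, n_nodes=15):
    # Approximate the log of: integral of prod_i Bernoulli(y_i | expit(x_i'beta + b))
    # against b ~ N(0, sigma^2), via Gauss-Hermite nodes (weight exp(-x^2)).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    total = 0.0
    for xk, wk in zip(nodes, weights):
        b = np.sqrt(2.0) * sigma * xk          # change of variables to N(0, sigma^2)
        p = expit(X @ beta + b)
        total += wk * np.prod(p ** y * (1.0 - p) ** (1 - y))
    return np.log(total / np.sqrt(np.pi))

# Summing cluster_loglik over physicians gives the marginal log likelihood
# to be maximized over (beta, sigma).
```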
{"title":"Variability in Causal Effects and Noncompliance in a Multisite Trial: A Bivariate Hierarchical Generalized Random Coefficients Model for a Binary Outcome.","authors":"Xinxin Sun, Yongyun Shin, Jennifer Elston Lafata, Stephen W Raudenbush","doi":"10.1002/sim.10229","DOIUrl":"10.1002/sim.10229","url":null,"abstract":"<p><p>Within each of 170 physicians, patients were randomized to access e-assist, an online program that aimed to increase colorectal cancer screening (CRCS), or control. Compliance was partial: <math> <semantics><mrow><mn>78.34</mn> <mo>%</mo></mrow> <annotation>$$ 78.34% $$</annotation></semantics> </math> of the experimental patients accessed e-assist while no controls were provided the access. Of interest are the average causal effect of assignment to treatment and the complier average causal effect as well as the variation of these causal effects across physicians. Each physician generates probabilities of screening for experimental compliers (experimental patients who accessed e-assist), control compliers (controls who would have accessed e-assist had they been assigned to e-assist), and never takers (patients who would have avoided e-assist no matter what). Estimating physician-specific probabilities jointly over physicians poses novel challenges. We address these challenges by maximum likelihood, factoring a \"complete-data likelihood\" uniquely into the conditional distribution of screening and partially observed compliance given random effects and the distribution of random effects. We marginalize this likelihood using adaptive Gauss-Hermite quadrature. The approach is doubly iterative in that the conditional distribution defies analytic evaluation. Because the small sample size per physician constrains estimability of multiple random effects, we reduce their dimensionality using a shared random effects model having a factor analytic structure. We assess estimators and recommend sample sizes to produce reasonably accurate and precise estimates by simulation, and analyze data from a trial of a CRCS intervention.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5353-5365"},"PeriodicalIF":1.8,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586915/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142475172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Does Remdesivir Lower COVID-19 Mortality? A Subgroup Analysis of Hospitalized Adults Receiving Supplemental Oxygen
Pub Date: 2024-12-10 | Epub Date: 2024-10-10 | DOI: 10.1002/sim.10241
Gail E Potter, Michael A Proschan
The first Adaptive COVID-19 Treatment Trial (ACTT-1) showed that remdesivir improved COVID-19 recovery time compared with placebo in hospitalized adults. The secondary outcome of mortality was almost significant overall (p = 0.07) and highly significant for people receiving supplemental oxygen at enrollment (p = 0.002), suggesting a mortality benefit concentrated in this group. We explore analysis methods that are helpful when a single subgroup benefits from treatment and apply them to ACTT-1, using baseline oxygen use to define subgroups. We consider two questions: (1) is the remdesivir effect for people receiving supplemental oxygen real, and (2) does this effect differ from the overall effect? For Question 1, we apply a Bonferroni adjustment to subgroup-specific hypothesis tests and the Westfall and Young permutation test, which is valid when small cell counts preclude normally distributed test statistics (a frequently unexamined condition in subgroup analyses). For Question 2, we introduce Qmax, the largest standardized difference between subgroup-specific effects and the overall effect. Qmax simultaneously tests whether any subgroup effect differs from the overall effect and identifies the subgroup benefitting most. We demonstrate that Qmax strongly controls the familywise error rate (FWER) when test statistics are normally distributed with no mean-variance relationship. We compare Qmax to a related permutation test, SEAMOS, which was previously proposed but not extensively applied or tested. We show that SEAMOS can have inflated Type 1 error under the global null when control arm event rates differ between subgroups. Our results support a mortality benefit from remdesivir in people receiving supplemental oxygen.
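One plausible formalisation of Qmax, assuming independent and approximately normal subgroup estimates: if the overall effect is the inverse-variance-weighted mean with variance V, then Var(theta_g - theta_overall) = v_g - V, and a Monte Carlo null reference follows because the statistic is invariant to the common mean under the global null. This is a sketch of the idea, not necessarily the paper's exact conventions.

```python
import numpy as np

def qmax(theta, var):
    # Largest standardized difference between subgroup effects and the
    # inverse-variance-weighted overall effect.
    V = 1.0 / (1.0 / var).sum()               # variance of the overall estimate
    overall = (theta / var).sum() * V
    return np.max(np.abs(theta - overall) / np.sqrt(var - V))

def qmax_pvalue(theta, var, n_sim=100_000, seed=1):
    rng = np.random.default_rng(seed)
    obs = qmax(theta, var)
    # Under the global null all subgroups share one mean, and qmax is
    # invariant to that mean, so simulating around zero suffices.
    sims = rng.normal(0.0, np.sqrt(var), size=(n_sim, var.size))
    null = np.apply_along_axis(qmax, 1, sims, var)
    return float((null >= obs).mean())
```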
{"title":"Does Remdesivir Lower COVID-19 Mortality? A Subgroup Analysis of Hospitalized Adults Receiving Supplemental Oxygen.","authors":"Gail E Potter, Michael A Proschan","doi":"10.1002/sim.10241","DOIUrl":"10.1002/sim.10241","url":null,"abstract":"<p><p>The first Adaptive COVID-19 Treatment Trial (ACTT-1) showed that remdesivir improved COVID-19 recovery time compared with placebo in hospitalized adults. The secondary outcome of mortality was almost significant overall (p = 0.07) and highly significant for people receiving supplemental oxygen at enrollment (p = 0.002), suggesting a mortality benefit concentrated in this group. We explore analysis methods that are helpful when a single subgroup benefits from treatment and apply them to ACTT-1, using baseline oxygen use to define subgroups. We consider two questions: (1) is the remdesivir effect for people receiving supplemental oxygen real, and (2) does this effect differ from the overall effect? For Question 1, we apply a Bonferroni adjustment to subgroup-specific hypothesis tests and the Westfall and Young permutation test, which is valid when small cell counts preclude normally distributed test statistics (a frequently unexamined condition in subgroup analyses). For Question 2, we introduce Q<sub>max</sub>, the largest standardized difference between subgroup-specific effects and the overall effect. Q<sub>max</sub> simultaneously tests whether any subgroup effect differs from the overall effect and identifies the subgroup benefitting most. We demonstrate that Q<sub>max</sub> strongly controls the familywise error rate (FWER) when test statistics are normally distributed with no mean-variance relationship. We compare Q<sub>max</sub> to a related permutation test, SEAMOS, which was previously proposed but not extensively applied or tested. We show that SEAMOS can have inflated Type 1 error under the global null when control arm event rates differ between subgroups. Our results support a mortality benefit from remdesivir in people receiving supplemental oxygen.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5285-5299"},"PeriodicalIF":1.8,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586907/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142393473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining Biomarkers to Improve Diagnostic Accuracy in Detecting Diseases With Group-Tested Data
Pub Date: 2024-11-30 | Epub Date: 2024-10-07 | DOI: 10.1002/sim.10230
Jin Yang, Wei Zhang, Paul S Albert, Aiyi Liu, Zhen Chen
We consider the problem of combining multiple biomarkers to improve the diagnostic accuracy of detecting a disease when only group-tested data on the disease status are available. There are several challenges in addressing this problem, including unavailable individual disease statuses, differential misclassification depending on group size and number of diseased individuals in the group, and extensive computation due to a large number of possible combinations of multiple biomarkers. To tackle these issues, we propose a pairwise model fitting approach to estimating the distribution of the optimal linear combination of biomarkers and its diagnostic accuracy under the assumption of a multivariate normal distribution. The approach is evaluated in simulation studies and applied to data on chlamydia detection and COVID-19 diagnosis.
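For context, the estimation target has a classical closed form when individual-level statuses are available (Su and Liu, 1993): under multivariate normality with common covariance, the AUC-optimal linear combination is Sigma^{-1}(mu1 - mu0). The sketch below computes that benchmark from individual data; the paper's contribution is recovering the same quantities when only group-tested statuses are observed, which this sketch does not attempt.

```python
import numpy as np
from scipy.stats import norm

def best_linear_combination(X_dis, X_non):
    # X_dis: biomarkers of diseased subjects; X_non: non-diseased (n x k each)
    mu1, mu0 = X_dis.mean(axis=0), X_non.mean(axis=0)
    n1, n0 = len(X_dis), len(X_non)
    pooled = ((n1 - 1) * np.cov(X_dis, rowvar=False)
              + (n0 - 1) * np.cov(X_non, rowvar=False)) / (n1 + n0 - 2)
    a = np.linalg.solve(pooled, mu1 - mu0)     # optimal combination weights
    auc = norm.cdf(np.sqrt((mu1 - mu0) @ a))   # AUC of the combined score
    return a, auc
```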
{"title":"Combining Biomarkers to Improve Diagnostic Accuracy in Detecting Diseases With Group-Tested Data.","authors":"Jin Yang, Wei Zhang, Paul S Albert, Aiyi Liu, Zhen Chen","doi":"10.1002/sim.10230","DOIUrl":"10.1002/sim.10230","url":null,"abstract":"<p><p>We consider the problem of combining multiple biomarkers to improve the diagnostic accuracy of detecting a disease when only group-tested data on the disease status are available. There are several challenges in addressing this problem, including unavailable individual disease statuses, differential misclassification depending on group size and number of diseased individuals in the group, and extensive computation due to a large number of possible combinations of multiple biomarkers. To tackle these issues, we propose a pairwise model fitting approach to estimating the distribution of the optimal linear combination of biomarkers and its diagnostic accuracy under the assumption of a multivariate normal distribution. The approach is evaluated in simulation studies and applied to data on chlamydia detection and COVID-19 diagnosis.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5182-5192"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11583953/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142393472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging External Aggregated Information for the Marginal Accelerated Failure Time Model
Pub Date: 2024-11-30 | Epub Date: 2024-10-08 | DOI: 10.1002/sim.10224
Ping Xie, Jie Ding, Xiaoguang Wang
It is becoming increasingly common for researchers to consider leveraging information from external sources to enhance the analysis of small-scale studies. While much attention has focused on univariate survival data, correlated survival data are prevalent in epidemiological investigations. In this article, we propose a unified framework to improve the estimation of the marginal accelerated failure time model with correlated survival data by integrating additional information given in the form of covariate effects evaluated in a reduced accelerated failure time model. Such auxiliary information can be summarized by using valid estimating equations and hence can be combined with the internal linear rank-estimating equations via the generalized method of moments. We investigate the asymptotic properties of the proposed estimator and show that it is more efficient than the conventional estimator using internal data only. When population heterogeneity exists, we revise the proposed estimation procedure and present a shrinkage estimator to protect against bias and loss of efficiency. Moreover, the proposed estimation procedure can be further refined to accommodate the non-negligible uncertainty in the auxiliary information, leading to more trustworthy inferences. Simulation results demonstrate the finite sample performance of the proposed methods, and an empirical application to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial substantiates its practical relevance.
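The GMM combination step can be conveyed with a deliberately simple linear-model toy, replacing the paper's rank-based AFT estimating equations: internal score moments for the full model are stacked with an external moment asserting that the reduced-model coefficient (y on x1 alone) implied by the full model matches a published value. All names are illustrative, and a two-step GMM would replace the identity weighting used here.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_objective(beta, y, X, x1, gamma_ext):
    g_int = X * (y - X @ beta)[:, None]        # internal score moments
    # External moment: under the full model E[y | X] = X beta, the reduced
    # model fitted to x1 alone should reproduce the published gamma_ext.
    g_ext = x1 * (X @ beta - x1 * gamma_ext)
    g = np.column_stack([g_int, g_ext]).mean(axis=0)
    return g @ g                               # identity-weighted GMM criterion

# usage (illustrative): beta0 = np.zeros(X.shape[1])
# beta_hat = minimize(gmm_objective, beta0, args=(y, X, x1, gamma_ext)).x
```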
{"title":"Leveraging External Aggregated Information for the Marginal Accelerated Failure Time Model.","authors":"Ping Xie, Jie Ding, Xiaoguang Wang","doi":"10.1002/sim.10224","DOIUrl":"10.1002/sim.10224","url":null,"abstract":"<p><p>It is becoming increasingly common for researchers to consider leveraging information from external sources to enhance the analysis of small-scale studies. While much attention has focused on univariate survival data, correlated survival data are prevalent in epidemiological investigations. In this article, we propose a unified framework to improve the estimation of the marginal accelerated failure time model with correlated survival data by integrating additional information given in the form of covariate effects evaluated in a reduced accelerated failure time model. Such auxiliary information can be summarized by using valid estimating equations and hence can then be combined with the internal linear rank-estimating equations via the generalized method of moments. We investigate the asymptotic properties of the proposed estimator and show that it is more efficient than the conventional estimator using internal data only. When population heterogeneity exists, we revise the proposed estimation procedure and present a shrinkage estimator to protect against bias and loss of efficiency. Moreover, the proposed estimation procedure can be further refined to accommodate the non-negligible uncertainty in the auxiliary information, leading to more trustable inference conclusions. Simulation results demonstrate the finite sample performance of the proposed methods, and empirical application on the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial substantiates its practical relevance.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5203-5216"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142393474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Regression Approaches to Assess Effect of Treatments That Arrest Progression of Symptoms
Pub Date: 2024-11-30 | Epub Date: 2024-10-04 | DOI: 10.1002/sim.10219
Ana M Ortega-Villa, Martha C Nason, Michael P Fay, Sara Alehashemi, Raphaela Goldbach-Mansky, Dean A Follmann
Motivated by a small sample example in neonatal onset multisystem inflammatory disease (NOMID), we propose a method that can be used when the interest is in testing for a change in disease progression after the start of treatment, compared with the historical disease progression prior to treatment. Our method estimates the longitudinal trajectory of the outcome variable and adds an interaction term between an intervention indicator variable and the time since initiation of the intervention. This method is appropriate for a situation in which the intervention slows or arrests the effect of the disease on the outcome, as is the case in our motivating example. By simulation in small samples and restricted sets of treatment initiation times, we show that the generalized estimating equations (GEE) formulation with small sample adjustments can bound the Type I error rate better than GEE and linear mixed models without small sample adjustments. Permutation tests (permuting the time of treatment initiation) are another valid approach that can also be useful. We illustrate the methodology through an application to a prospective cohort of NOMID patients enrolled at the NIH clinical center.
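A minimal sketch of the described design on synthetic data, using statsmodels GEE: the coefficient on time captures the historical progression slope, while time_since_tx (zero before initiation) captures how much that slope changes once treatment starts. Variable names and the bias-reduced covariance choice are illustrative of, not identical to, the paper's small-sample adjustments.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for pid in range(30):                         # small sample of patients
    t_start = rng.uniform(2.0, 5.0)           # individual treatment start time
    b = rng.normal(0.0, 0.5)                  # patient-level random shift
    for t in np.arange(0.0, 8.0):
        since = max(0.0, t - t_start)         # 0 before treatment initiation
        y = 1.0 + 0.8 * t - 0.8 * since + b + rng.normal(0.0, 0.3)
        rows.append((pid, t, since, y))
df = pd.DataFrame(rows, columns=["id", "time", "time_since_tx", "y"])

model = smf.gee("y ~ time + time_since_tx", groups="id", data=df,
                cov_struct=sm.cov_struct.Exchangeable())
res = model.fit(cov_type="bias_reduced")      # Mancl-DeRouen small-sample sandwich
# A time_since_tx coefficient near -0.8 indicates arrested progression.
```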
{"title":"Regression Approaches to Assess Effect of Treatments That Arrest Progression of Symptoms.","authors":"Ana M Ortega-Villa, Martha C Nason, Michael P Fay, Sara Alehashemi, Raphaela Goldbach-Mansky, Dean A Follmann","doi":"10.1002/sim.10219","DOIUrl":"10.1002/sim.10219","url":null,"abstract":"<p><p>Motivated by a small sample example in neonatal onset multisystem inflammatory disease (NOMID), we propose a method that can be used when the interest is testing for an association between a changes in disease progression with start of treatment compared to historical disease progression prior to treatment. Our method estimates the longitudinal trajectory of the outcome variable and adds an interaction term between an intervention indicator variable and the time since initiation of the intervention. This method is appropriate for a situation in which the intervention slows or arrests the effect of the disease on the outcome, as is the case in our motivating example. By simulation in small samples and restricted sets of treatment initiation times, we show that the generalized estimating equations (GEE) formulation with small sample adjustments can bound the Type I error rate better than GEE and linear mixed models without small sample adjustments. Permutation tests (permuting the time of treatment initiation) is another valid approach that can also be useful. We illustrate the methodology through an application to a prospective cohort of NOMID patients enrolled at the NIH clinical center.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5155-5165"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142372927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weighted Expectile Regression Neural Networks for Right Censored Data
Pub Date: 2024-11-30 | Epub Date: 2024-09-29 | DOI: 10.1002/sim.10221
Feipeng Zhang, Xi Chen, Peng Liu, Caiyun Fan
As a favorable alternative to censored quantile regression, censored expectile regression has been popular in survival analysis due to its flexibility in modeling the heterogeneous effect of covariates. The existing weighted expectile regression (WER) method assumes that the censoring variable and covariates are independent, and that the covariate effects have a global linear structure. However, these two assumptions are too restrictive to capture the complex and nonlinear pattern of the underlying covariate effects. In this article, we develop a novel weighted expectile regression neural networks (WERNN) method by incorporating the deep neural network structure into the censored expectile regression framework. To handle random censoring, we employ the inverse probability of censoring weighting (IPCW) technique in the expectile loss function. The proposed WERNN method is flexible enough to fit nonlinear patterns and therefore achieves more accurate prediction performance than the existing WER method for right censored data. Our findings are supported by extensive Monte Carlo simulation studies and a real data application.
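The loss at the heart of the method is compact. Below is one common IPCW-weighted expectile (asymmetric least squares) loss, with G_hat the Kaplan-Meier estimate of the censoring survival function at each subject's observed time; uncensored observations are up-weighted by 1/G_hat and censored ones receive weight zero. In the paper this loss trains a deep network; the numpy version here only shows the criterion itself, and the exact IPCW variant is an assumption.

```python
import numpy as np

def ipcw_expectile_loss(y_obs, pred, delta, G_hat, tau=0.5):
    # y_obs: observed (possibly censored) log times; delta: event indicator;
    # G_hat: censoring survival probability at each subject's observed time.
    w = delta / np.clip(G_hat, 1e-6, None)     # inverse-probability weights
    r = y_obs - pred
    asym = np.where(r >= 0, tau, 1.0 - tau)    # asymmetric squared-error weight
    return float(np.mean(w * asym * r ** 2))   # tau = 0.5 recovers weighted LS
```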
{"title":"Weighted Expectile Regression Neural Networks for Right Censored Data.","authors":"Feipeng Zhang, Xi Chen, Peng Liu, Caiyun Fan","doi":"10.1002/sim.10221","DOIUrl":"10.1002/sim.10221","url":null,"abstract":"<p><p>As a favorable alternative to the censored quantile regression, censored expectile regression has been popular in survival analysis due to its flexibility in modeling the heterogeneous effect of covariates. The existing weighted expectile regression (WER) method assumes that the censoring variable and covariates are independent, and that the covariates effects has a global linear structure. However, these two assumptions are too restrictive to capture the complex and nonlinear pattern of the underlying covariates effects. In this article, we developed a novel weighted expectile regression neural networks (WERNN) method by incorporating the deep neural network structure into the censored expectile regression framework. To handle the random censoring, we employ the inverse probability of censoring weighting (IPCW) technique in the expectile loss function. The proposed WERNN method is flexible enough to fit nonlinear patterns and therefore achieves more accurate prediction performance than the existing WER method for right censored data. Our findings are supported by extensive Monte Carlo simulation studies and a real data application.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5100-5114"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142354091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating Time-Varying Exposure Effects Through Continuous-Time Modelling in Mendelian Randomization
Pub Date: 2024-11-30 | Epub Date: 2024-10-06 | DOI: 10.1002/sim.10222
Haodong Tian, Ashish Patel, Stephen Burgess
Mendelian randomization is an instrumental variable method that utilizes genetic information to investigate the causal effect of a modifiable exposure on an outcome. In most cases, the exposure changes over time. Understanding the time-varying causal effect of the exposure can yield detailed insights into mechanistic effects and the potential impact of public health interventions. Recently, a growing number of Mendelian randomization studies have attempted to explore time-varying causal effects. However, the proposed approaches oversimplify temporal information and rely on overly restrictive structural assumptions, limiting their reliability in addressing time-varying causal problems. This article considers a novel approach to estimate time-varying effects through continuous-time modelling by combining functional principal component analysis and weak-instrument-robust techniques. Our method effectively utilizes available data without making strong structural assumptions and can be applied in general settings where the exposure measurements occur at different timepoints for different individuals. We demonstrate through simulations that our proposed method performs well in estimating time-varying effects and provides reliable inference when the time-varying effect form is correctly specified. The method could theoretically be used to estimate arbitrarily complex time-varying effects. However, there is a trade-off between model complexity and instrument strength. Estimating complex time-varying effects requires instruments that are unrealistically strong. We illustrate the application of this method in a case study examining the time-varying effects of systolic blood pressure on urea levels.
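The FPCA ingredient can be sketched for the simplified case of exposures measured on a common, equally spaced grid (the paper accommodates irregular, subject-specific timings): eigen-decompose the sample covariance of the trajectories and project to obtain per-subject scores, which would then enter the downstream weak-instrument-robust MR model (not shown). Names are illustrative.

```python
import numpy as np

def fpca_scores(Z, n_comp=2):
    # Z: (subjects x timepoints) exposure trajectories on a common grid
    mean = Z.mean(axis=0)
    Zc = Z - mean
    cov = Zc.T @ Zc / (len(Z) - 1)             # sample covariance over the grid
    evals, evecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_comp]   # keep the leading components
    phi = evecs[:, order]                      # eigenfunctions on the grid
    scores = Zc @ phi                          # subject-level FPC scores
    return mean, phi, scores
```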
{"title":"Estimating Time-Varying Exposure Effects Through Continuous-Time Modelling in Mendelian Randomization.","authors":"Haodong Tian, Ashish Patel, Stephen Burgess","doi":"10.1002/sim.10222","DOIUrl":"10.1002/sim.10222","url":null,"abstract":"<p><p>Mendelian randomization is an instrumental variable method that utilizes genetic information to investigate the causal effect of a modifiable exposure on an outcome. In most cases, the exposure changes over time. Understanding the time-varying causal effect of the exposure can yield detailed insights into mechanistic effects and the potential impact of public health interventions. Recently, a growing number of Mendelian randomization studies have attempted to explore time-varying causal effects. However, the proposed approaches oversimplify temporal information and rely on overly restrictive structural assumptions, limiting their reliability in addressing time-varying causal problems. This article considers a novel approach to estimate time-varying effects through continuous-time modelling by combining functional principal component analysis and weak-instrument-robust techniques. Our method effectively utilizes available data without making strong structural assumptions and can be applied in general settings where the exposure measurements occur at different timepoints for different individuals. We demonstrate through simulations that our proposed method performs well in estimating time-varying effects and provides reliable inference when the time-varying effect form is correctly specified. The method could theoretically be used to estimate arbitrarily complex time-varying effects. However, there is a trade-off between model complexity and instrument strength. Estimating complex time-varying effects requires instruments that are unrealistically strong. We illustrate the application of this method in a case study examining the time-varying effects of systolic blood pressure on urea levels.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5166-5181"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7616825/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142381657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Complex Meta-Regression Model to Identify Effective Features of Interventions From Multi-Arm, Multi-Follow-Up Trials
Pub Date: 2024-11-30 | Epub Date: 2024-10-09 | DOI: 10.1002/sim.10237
Annabel L Davies, Julian P T Higgins
Network meta-analysis (NMA) combines evidence from multiple trials to compare the effectiveness of a set of interventions. In many areas of research, interventions are often complex, made up of multiple components or features. This makes it difficult to define a common set of interventions on which to perform the analysis. One approach to this problem is component network meta-analysis (CNMA), which uses a meta-regression framework to define each intervention as a subset of components whose individual effects combine additively. In this article, we are motivated by a systematic review of complex interventions to prevent obesity in children. Due to considerable heterogeneity across the trials, these interventions cannot be expressed as a subset of components but instead are coded against a framework of characteristic features. To analyse these data, we develop a bespoke CNMA-inspired model that allows us to identify the most important features of interventions. We define a meta-regression model with covariates on three levels: intervention, study, and follow-up time, as well as flexible interaction terms. By specifying different regression structures for trials with and without a control arm, we relax the assumption from previous CNMA models that a control arm is the absence of intervention components. Furthermore, we derive a correlation structure that accounts for trials with multiple intervention arms and multiple follow-up times. Although our model was developed for the specifics of the obesity data set, it has wider applicability to any set of complex interventions that can be coded according to a set of shared features.
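The additive backbone of CNMA-type models reduces, in its simplest form, to an inverse-variance weighted regression of study contrasts on feature indicators, as sketched below. The paper's model is substantially richer (three covariate levels, interactions, control-arm-specific structures, and a multi-arm/multi-follow-up correlation structure), none of which appears in this toy.

```python
import numpy as np

def cnma_wls(d, v, F):
    # d: relative effects (intervention vs. control), one per contrast
    # v: their variances; F: (contrasts x features) 0/1 feature matrix
    W = np.diag(1.0 / v)
    beta = np.linalg.solve(F.T @ W @ F, F.T @ W @ d)
    return beta            # additive effect attributed to each feature

# The predicted contrast for a new intervention with feature vector f is f @ beta.
```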
{"title":"A Complex Meta-Regression Model to Identify Effective Features of Interventions From Multi-Arm, Multi-Follow-Up Trials.","authors":"Annabel L Davies, Julian P T Higgins","doi":"10.1002/sim.10237","DOIUrl":"10.1002/sim.10237","url":null,"abstract":"<p><p>Network meta-analysis (NMA) combines evidence from multiple trials to compare the effectiveness of a set of interventions. In many areas of research, interventions are often complex, made up of multiple components or features. This makes it difficult to define a common set of interventions on which to perform the analysis. One approach to this problem is component network meta-analysis (CNMA) which uses a meta-regression framework to define each intervention as a subset of components whose individual effects combine additively. In this article, we are motivated by a systematic review of complex interventions to prevent obesity in children. Due to considerable heterogeneity across the trials, these interventions cannot be expressed as a subset of components but instead are coded against a framework of characteristic features. To analyse these data, we develop a bespoke CNMA-inspired model that allows us to identify the most important features of interventions. We define a meta-regression model with covariates on three levels: intervention, study, and follow-up time, as well as flexible interaction terms. By specifying different regression structures for trials with and without a control arm, we relax the assumption from previous CNMA models that a control arm is the absence of intervention components. Furthermore, we derive a correlation structure that accounts for trials with multiple intervention arms and multiple follow-up times. Although, our model was developed for the specifics of the obesity data set, it has wider applicability to any set of complex interventions that can be coded according to a set of shared features.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":" ","pages":"5217-5233"},"PeriodicalIF":1.8,"publicationDate":"2024-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11583959/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142393469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}