Arce Domingo-Relloso, Yuchen Zhang, Ziqing Wang, Astrid M Suchy-Dicey, Dedra S Buchwald, Ana Navas-Acien, Joel Schwartz, Kiros Berhane, Brent A Coull, Linda Valeri
Not accounting for competing events in survival analysis can lead to biased estimates, as individuals who die from other causes do not have the opportunity to develop the event of interest. Formal definitions and considerations for causal effects in the presence of competing risks have been published, but not for the mediation analysis setting when the exposure is not separable and both the outcome and the mediator are nonterminal events. We propose, for the first time, an approach based on the path-specific effects framework to account for competing risks in longitudinal mediation analysis with time-to-event outcomes. We do so by considering the pathway through the competing event as another mediator, which is nested within our longitudinal mediator of interest. We provide a theoretical formulation and related definitions of the effects of interest based on the mediational g-formula, as well as a detailed description of the algorithm. We also present a simulation study and an application of our algorithm to data from the Strong Heart Study, a prospective cohort of American Indian adults. In this application, we evaluated the mediating role of the blood pressure trajectory (measured in three visits) on the association of arsenic and cadmium with time to cardiovascular disease, accounting for the competing risk of death. Identifying the effects through different paths enables us to evaluate the impact of metals on the outcome of interest, as well as their impact through competing risks, more transparently.
{"title":"A Path-Specific Effect Approach to Mediation Analysis With Time-Varying Mediators and Time-to-Event Outcomes Accounting for Competing Risks.","authors":"Arce Domingo-Relloso, Yuchen Zhang, Ziqing Wang, Astrid M Suchy-Dicey, Dedra S Buchwald, Ana Navas-Acien, Joel Schwartz, Kiros Berhane, Brent A Coull, Linda Valeri","doi":"10.1002/sim.70425","DOIUrl":"10.1002/sim.70425","url":null,"abstract":"<p><p>Not accounting for competing events in survival analysis can lead to biased estimates, as individuals who die from other causes do not have the opportunity to develop the event of interest. Formal definitions and considerations for causal effects in the presence of competing risks have been published, but not for the mediation analysis setting when the exposure is not separable and both the outcome and the mediator are nonterminal events. We propose, for the first time, an approach based on the path-specific effects framework to account for competing risks in longitudinal mediation analysis with time-to-event outcomes. We do so by considering the pathway through the competing event as another mediator, which is nested within our longitudinal mediator of interest. We provide a theoretical formulation and related definitions of the effects of interest based on the mediational g-formula, as well as a detailed description of the algorithm. We also present a simulation study and an application of our algorithm to data from the Strong Heart Study, a prospective cohort of American Indian adults. In this application, we evaluated the mediating role of the blood pressure trajectory (measured in three visits) on the association of arsenic and cadmium with time to cardiovascular disease, accounting for competing risks by death. Identifying the effects through different paths enables us to evaluate the impact of metals on the outcome of interest, as well as through competing risks, more transparently.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70425"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873459/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adel Ahmadi Nadi, Stefan H Steiner, Nathaniel T Stevens
Surgical learning curves are graphical tools used to evaluate a trainee's progress in the early stages of their career and determine whether they have achieved proficiency after completing a specified number of surgeries. Cumulative sum (CUSUM)-based techniques are commonly used to assess learning curves due to their simplicity, but they face criticism for relying on fixed performance thresholds and lacking interpretability. This paper introduces a risk-adjusted surgical learning curve assessment (SLCA) method that focuses on estimation rather than hypothesis testing (which is characteristic of CUSUM-type methods). The proposed method is specifically designed to accommodate right-skewed outcomes, such as surgery durations, which are well characterized by the Weibull distribution. To evaluate the learning process, the proposed SLCA approach sequentially estimates comparative probability metrics that assess the likelihood of a clinically important difference between the trainee's performance and a standard performance. Given the expectation that a trainee's performance will improve over time, we incorporate weighted estimating equations into the estimation framework, assigning greater weight to more recent outcomes than to earlier ones. Compared to CUSUM-based methods, the proposed methodology offers enhanced interpretability and deeper insights. It also avoids reliance on externally defined performance levels, which are often difficult to determine in practice, and it emphasizes assessing clinical equivalence or noninferiority rather than simply identifying a lack of difference. The effectiveness of the proposed method is demonstrated through a case study on a colorectal surgery dataset as well as a numerical study.
{"title":"Risk-Adjusted Surgical Learning Curve Assessment Using Comparative Probability Metrics.","authors":"Adel Ahmadi Nadi, Stefan H Steiner, Nathaniel T Stevens","doi":"10.1002/sim.70419","DOIUrl":"https://doi.org/10.1002/sim.70419","url":null,"abstract":"<p><p>Surgical learning curves are graphical tools used to evaluate a trainee's progress in the early stages of their career and determine whether they have achieved proficiency after completing a specified number of surgeries. Cumulative sum (CUSUM)-based techniques are commonly used to assess learning curves due to their simplicity, but they face criticism for relying on fixed performance thresholds and lacking interpretability. This paper introduces a risk-adjusted surgical learning curve assessment (SLCA) method that focuses on estimation rather than hypothesis testing (which is characteristic of CUSUM-type methods). The proposed method is specifically designed to accommodate right-skewed outcomes, such as surgery durations, which are well-characterized by the Weibull distribution. To evaluate the learning process, the proposed SLCA approach sequentially estimates comparative probability metrics that assess the likelihood of a clinically important difference between the trainee's performance and a standard performance. Given the expectation that a trainee's performance will improve over time, we employ a weighted estimating equations approach to the estimation framework to assign greater weight to more recent outcomes compared to earlier ones. Compared to CUSUM-based methods, the proposed methodology offers enhanced interpretability and deeper insights. It also avoids reliance on externally defined performance levels, which are often difficult to determine in practice, and it emphasizes assessing clinical equivalence or noninferiority rather than simply identifying a lack of difference. The effectiveness of the proposed method is demonstrated through a case study on a colorectal surgery dataset as well as a numerical study.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70419"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146195714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We develop an integrative joint model for multivariate sparse functional and survival data to analyze Alzheimer's disease (AD) across multiple studies. To address missing-by-design outcomes in multi-cohort studies, our approach extends the multivariate functional mixed model (MFMM), which integrates longitudinal outcomes to extract shared disease progression trajectories and links these outcomes to time-to-event data through a parsimonious survival model. This framework balances flexibility and interpretability by modeling shared progression trajectories while accommodating cohort-specific mean functions and survival parameters. For efficient estimation, we incorporate penalized splines into an EM algorithm. Application to three AD cohorts demonstrates the model's ability to capture disease trajectories and account for inter-cohort variability. Simulation studies confirm its robustness and accuracy, highlighting its value in advancing the understanding of AD progression and supporting clinical decision-making in multi-cohort settings.
{"title":"A Functional Joint Model for Survival and Multivariate Sparse Functional Data in Multi-Cohort Alzheimer's Disease Study.","authors":"Wenyi Wang, Luo Xiao, Ruonan Li, Sheng Luo","doi":"10.1002/sim.70442","DOIUrl":"10.1002/sim.70442","url":null,"abstract":"<p><p>We develop an integrative joint model for multivariate sparse functional and survival data to analyze Alzheimer's disease (AD) across multiple studies. To address missing-by-design outcomes in multi-cohort studies, our approach extends the multivariate functional mixed model (MFMM), which integrates longitudinal outcomes to extract shared disease progression trajectories and links these outcomes to time-to-event data through a parsimonious survival model. This framework balances flexibility and interpretability by modeling shared progression trajectories while accommodating cohort-specific mean functions and survival parameters. For efficient estimation, we incorporate penalized splines into an EM algorithm. Application to three AD cohorts demonstrates the model's ability to capture disease trajectories and account for inter-cohort variability. Simulation studies confirm its robustness and accuracy, highlighting its value in advancing the understanding of AD progression and supporting clinical decision-making in multi-cohort settings.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70442"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12902813/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146182651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In a weighted logrank test, such as the Harrington-Fleming test and the Tarone-Ware test, predetermined weights are used to emphasize early, middle, or late differences in survival distributions to maximize the test's power. The optimal weight function under an alternative, which depends on the true hazard functions of the groups being compared, has been derived. However, that optimal weight function cannot be used directly to construct an optimal test, since the resulting test does not properly control the type I error rate. We further show that the power of a weighted logrank test with proper type I error control has an upper bound that cannot be achieved. Based on this theory, we propose a weighted logrank test that self-adaptively determines an "optimal" weight function. Through a tuning parameter, the new test is more powerful than existing standard and weighted logrank tests while maintaining proper type I error rates. We demonstrate through extensive simulation studies that the proposed test is both powerful and highly robust in a wide range of scenarios. The method is illustrated with data from several clinical trials in lung cancer.
{"title":"A Powerful and Self-Adaptive Weighted Logrank Test.","authors":"Zhiguo Li, Xiaofei Wang","doi":"10.1002/sim.70390","DOIUrl":"https://doi.org/10.1002/sim.70390","url":null,"abstract":"<p><p>In a weighted logrank test, such as the Harrington-Fleming test and the Tarone-Ware test, predetermined weights are used to emphasize early, middle, or late differences in survival distributions to maximize the test's power. The optimal weight function under an alternative, which depends on the true hazard functions of the groups being compared, has been derived. However, that optimal weight function cannot be directly used to construct an optimal test since the resulting test does not properly control the type I error rate. We further show that the power of a weighted logrank test with proper type I error control has an upper bound that cannot be achieved. Based on the theory, we propose a weighted logrank test that self-adaptively determines an \"optimal\" weight function. The new test is more powerful than existing standard and weighted logrank tests while maintaining proper type I error rates by tuning a parameter. We demonstrate through extensive simulation studies that the proposed test is both powerful and highly robust in a wide range of scenarios. The method is illustrated with data from several clinical trials in lung cancer.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70390"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sparse regression problems, where the goal is to identify a small set of relevant predictors, often require modeling not only main effects but also meaningful interactions with other variables. While the pliable lasso has emerged as a powerful frequentist tool for modeling such interactions under strong heredity constraints, it lacks a natural framework for uncertainty quantification and the incorporation of prior knowledge. In this paper, we propose a Bayesian pliable lasso that extends this approach by placing sparsity-inducing priors, such as the horseshoe, on both main and interaction effects. The hierarchical prior structure enforces heredity constraints while adaptively shrinking irrelevant coefficients and allowing important effects to persist. We extend this framework to generalized linear models and develop a tailored approach to handle missing responses. To facilitate posterior inference, we develop an efficient Gibbs sampling algorithm based on a reparameterization of the horseshoe prior. Our Bayesian framework yields sparse, interpretable interaction structures and principled measures of uncertainty. Through simulations and real-data studies, we demonstrate its advantages over existing methods in recovering complex interaction patterns under both complete and incomplete data. Our method is implemented in the package hspliable, available on GitHub: https://github.com/tienmt/hspliable.
{"title":"Bayesian Pliable Lasso With Horseshoe Prior for Interaction Effects in GLMs With Missing Responses.","authors":"The Tien Mai","doi":"10.1002/sim.70406","DOIUrl":"https://doi.org/10.1002/sim.70406","url":null,"abstract":"<p><p>Sparse regression problems, where the goal is to identify a small set of relevant predictors, often require modeling not only main effects but also meaningful interactions through other variables. While the pliable lasso has emerged as a powerful frequentist tool for modeling such interactions under strong heredity constraints, it lacks a natural framework for uncertainty quantification and incorporation of prior knowledge. In this paper, we propose a Bayesian pliable lasso that extends this approach by placing sparsity-inducing priors, such as the horseshoe, on both main and interaction effects. The hierarchical prior structure enforces heredity constraints while adaptively shrinking irrelevant coefficients and allowing important effects to persist. We extend this framework to generalized linear models and develop a tailored approach to handle missing responses. To facilitate posterior inference, we develop an efficient Gibbs sampling algorithm based on a reparameterization of the horseshoe prior. Our Bayesian framework yields sparse, interpretable interaction structures, and principled measures of uncertainty. Through simulations and real-data studies, we demonstrate its advantages over existing methods in recovering complex interaction patterns under both complete and incomplete data. Our method is implemented in the package hspliable available on Github: https://github.com/tienmt/hspliable.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70406"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Misclassification Simulation-Extrapolation (MC-SIMEX) is an established method to correct for misclassification in binary covariates in a model. It involves a simulation component, which generates pseudo-datasets with increasing degrees of misclassification added to the binary covariate, and an extrapolation component, which models the covariate's regression coefficients obtained at each level of misclassification using a quadratic function. This quadratic function is then used to extrapolate the covariate's regression coefficients to a point of "no error" in the classification of the binary covariate in question. However, the true extrapolation function is usually not known beforehand, so only an approximation is used in practice. In this article, we propose a new method that uses the exact (not approximated) extrapolation function through a derived relationship between the naïve regression coefficient estimates and the true coefficients in generalized linear models. Simulation studies are conducted to study and compare the numerical properties of the resulting estimator to those of the original MC-SIMEX estimator. A real data analysis using colon cancer data from the MSKCC cancer registry is also provided.
{"title":"An Improved Misclassification Simulation Extrapolation (MC-SIMEX) Algorithm.","authors":"Varadan Sevilimedu, Lili Yu","doi":"10.1002/sim.70418","DOIUrl":"https://doi.org/10.1002/sim.70418","url":null,"abstract":"<p><p>Misclassification Simulation-Extrapolation (MC-SIMEX) is an established method to correct for misclassification in binary covariates in a model. It involves the use of a simulation component which simulates pseudo-datasets with added degree of misclassification in the binary covariate and an extrapolation component which models the covariate's regression coefficients obtained at each level of misclassification using a quadratic function. This quadratic function is then used to extrapolate the covariate's regression coefficients to a point of \"no error\" in the classification of the binary covariate under question. However, extrapolation functions are not usually known accurately beforehand and are therefore only approximated versions. In this article, we propose an innovative method that uses the exact (not approximated) extrapolation function through the use of a derived relationship between the naïve regression coefficient estimates and the true coefficients in generalized linear models. Simulation studies are conducted to study and compare the numerical properties of the resulting estimator to the original MC-SIMEX estimator. Real data analysis using colon cancer data from the MSKCC cancer registry is also provided.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70418"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A popular approach to growth reference centile estimation is the LMS (Lambda-Mu-Sigma) method, which assumes a parametric distribution for the response variable Y and fits the location, scale, and shape parameters of the distribution of Y as smooth functions of an explanatory variable X. This article provides two methods, transformation and adaptive smoothing, for improving the centile estimation when there is high curvature (i.e., rapid change in slope) with respect to X in one or more of the Y distribution parameters. In general, high curvature is reduced (i.e., attenuated or dampened) by smoothing. In the first method, X is transformed to a variable T to reduce this high curvature, and the Y distribution parameters are fitted as smooth functions of T. Three different transformations of X are described. In the second method, the Y distribution parameters are adaptively smoothed against X by allowing the smoothing parameter itself to vary continuously with X. Simulations are used to compare the performance of the two methods. Three examples show how the process can lead to substantially smoother and better fitting centiles.
{"title":"Improved Centile Estimation by Transformation And/Or Adaptive Smoothing of the Explanatory Variable.","authors":"R A Rigby, D M Stasinopoulos, T J Cole","doi":"10.1002/sim.70414","DOIUrl":"10.1002/sim.70414","url":null,"abstract":"<p><p>A popular approach to growth reference centile estimation is the LMS (Lambda-Mu-Sigma) method, which assumes a parametric distribution for response variable <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> and fits the location, scale and shape parameters of the distribution of <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> as smooth functions of explanatory variable <math> <semantics><mrow><mi>X</mi></mrow> <annotation>$$ X $$</annotation></semantics> </math> . This article provides two methods, transformation and adaptive smoothing, for improving the centile estimation when there is high curvature (i.e., rapid change in slope) with respect to <math> <semantics><mrow><mi>X</mi></mrow> <annotation>$$ X $$</annotation></semantics> </math> in one or more of the <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> distribution parameters. In general, high curvature is reduced (i.e., attenuated or dampened) by smoothing. In the first method, <math> <semantics><mrow><mi>X</mi></mrow> <annotation>$$ X $$</annotation></semantics> </math> is transformed to variable <math> <semantics><mrow><mi>T</mi></mrow> <annotation>$$ T $$</annotation></semantics> </math> to reduce this high curvature, and the <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> distribution parameters are fitted as smooth functions of <math> <semantics><mrow><mi>T</mi></mrow> <annotation>$$ T $$</annotation></semantics> </math> . Three different transformations of <math> <semantics><mrow><mi>X</mi></mrow> <annotation>$$ X $$</annotation></semantics> </math> are described. In the second method, the <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> distribution parameters are adaptively smoothed against <math> <semantics><mrow><mi>X</mi></mrow> <annotation>$$ X $$</annotation></semantics> </math> by allowing the smoothing parameter itself to vary continuously with <math> <semantics><mrow><mi>Y</mi></mrow> <annotation>$$ Y $$</annotation></semantics> </math> . Simulations are used to compare the performance of the two methods. Three examples show how the process can lead to substantially smoother and better fitting centiles.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70414"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874224/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146126374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, there has been growing concern about heavy-tailed and skewed noise in biological data. We introduce RobustPALMRT, a flexible permutation framework for testing the association of a covariate of interest adjusted for control covariates. RobustPALMRT controls the type I error rate in finite samples, even in the presence of heavy-tailed or skewed noise. The new framework expands the scope of state-of-the-art tests in three directions. First, our method applies to robust and quantile regressions, even with the necessary hyper-parameter tuning. Second, by separating model fitting from model evaluation, we discover that performance improves when a robust loss function is used in the model-evaluation step, regardless of how the model is fit. Third, we allow fitting multiple models to detect specialized features of interest in a distribution. To demonstrate this, we introduce DispersionPALMRT, which tests for differences in dispersion between treatment and control groups. We establish theoretical guarantees, identify settings where our method has greater power than existing methods, and analyze existing immunological data on Long-COVID patients. Using RobustPALMRT, we unveil novel differences between Long-COVID patients and others, even in the presence of highly skewed noise.
{"title":"Robust Distribution-Free Tests for the Linear Model.","authors":"Torey Hilbert, Steven N MacEachern, Yuan Zhang","doi":"10.1002/sim.70404","DOIUrl":"10.1002/sim.70404","url":null,"abstract":"<p><p>Recently, there has been growing concern about heavy-tailed and skewed noise in biological data. We introduce RobustPALMRT, a flexible permutation framework for testing the association of a covariate of interest adjusted for control covariates. RobustPALMRT controls type I error rate for finite-samples, even in the presence of heavy-tailed or skewed noise. The new framework expands the scope of state-of-the-art tests in three directions. First, our method applies to robust and quantile regressions, even with the necessary hyper-parameter tuning. Second, by separating model-fitting and model-evaluation, we discover that performance improves when using a robust loss function in the model-evaluation step, regardless of how the model is fit. Third, we allow fitting multiple models to detect specialized features of interest in a distribution. To demonstrate this, we introduce DispersionPALMRT, which tests for differences in dispersion between treatment and control groups. We establish theoretical guarantees, identify settings where our method has greater power than existing methods, and analyze existing immunological data on Long-COVID patients. Using RobustPALMRT, we unveil novel differences between Long-COVID patients and others even in the presence of highly skewed noise.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70404"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875190/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146126417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study evaluates the performance of the Unrestricted Weighted Least Squares (UWLS) estimator in meta-analyses of medical research. Using a large-scale simulation approach, it addresses the limitations of model selection criteria in small-sample contexts. Prior research using the Cochrane Database of Systematic Reviews (CDSR) reported that UWLS outperformed Random Effects (RE) and, in some cases, Fixed Effect (FE) estimators when assessed using AIC and BIC. However, we show that idiosyncratic characteristics of the CDSR datasets, notably their small sample sizes and weak-signal settings (where key parameters are often small in magnitude), undermine the reliability of AIC and BIC for model selection. Accordingly, we simulate 108 000 datasets mirroring the original CDSR data. This allows us to know the true model parameters and evaluate the estimators more accurately. While all estimators performed similarly with respect to bias and efficiency, RE consistently produced more accurate standard errors than UWLS, making confidence intervals and hypothesis testing more reliable. The comparison with FE was less clear. We therefore recommend continued use of the RE estimator as a reliable general-purpose approach for medical research, with the choice between UWLS and FE made in light of the likely extent of effect heterogeneity in the data.
{"title":"Is UWLS Really Better for Medical Research?","authors":"Sanghyun Hong, W Robert Reed","doi":"10.1002/sim.70411","DOIUrl":"10.1002/sim.70411","url":null,"abstract":"<p><p>This study evaluates the performance of the Unrestricted Weighted Least Squares (UWLS) estimator in meta-analyses of medical research. Using a large-scale simulation approach, it addresses the limitations of model selection criteria in small-sample contexts. Prior research using the Cochrane Database of Systematic Reviews (CDSR) reported that UWLS outperformed Random Effects (RE) and, in some cases, Fixed Effect (FE) estimators when assessed using AIC and BIC. However, we show that idiosyncratic characteristics of the CDSR datasets, notably their small sample sizes and weak-signal settings (where key parameters are often small in magnitude), undermine the reliability of AIC and BIC for model selection. Accordingly, we simulate 108 000 datasets mirroring the original CDSR data. This allows us to know the true model parameters and evaluate the estimators more accurately. While all estimators performed similarly with respect to bias and efficiency, RE consistently produced more accurate standard errors than UWLS, making confidence intervals and hypothesis testing more reliable. The comparison with FE was less clear. We therefore recommend continued use of the RE estimator as a reliable general-purpose approach for medical research, with the choice between UWLS and FE made in light of the likely extent of effect heterogeneity in the data.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70411"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874514/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146126441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Randomized clinical trials are the gold standard for evaluating the benefits and harms of interventions, yet they often fail to provide the evidence needed to inform medical decision-making. Primary reasons are a failure to recognize the questions most important for informing clinical practice, the fact that traditional approaches do not directly address these questions, and the consequent failure to use them as the motivation for the design, monitoring, analysis, and reporting of clinical trials. The standard approach of analyzing one outcome at a time fails to incorporate the associations between, and the cumulative nature of, multiple outcomes in individual patients; suffers from competing-risk complexities when individual outcomes are interpreted; fails to recognize important gradations of patient-centric responses; and, because efficacy and safety analyses are often conducted on different populations, leaves benefit:risk estimands and generalizability unclear. Cardiovascular event prevention trials typically utilize: (1) major adverse cardiovascular events (MACE), for example, stroke, myocardial infarction, and death, as the primary endpoint, which fails to recognize multiple events or the differential importance of events, and (2) relative risk models, which rely on modeling assumptions that challenge robustness and are contraindicated for benefit:risk and multiple-outcome evaluation. The Desirability Of Outcome Ranking (DOOR) is a paradigm for the design, data monitoring, analysis, interpretation, and reporting of clinical trials based on comprehensive patient-centric benefit:risk evaluation, developed to address these issues and advance clinical trial science. The rationale and methodology for design and analysis under the DOOR paradigm are described, and the methods are illustrated using an example. Freely available online tools for the design and analysis of studies implementing DOOR are provided.
{"title":"Patient-Centric Pragmatic Clinical Trials: Opening the DOOR.","authors":"Scott R Evans, Qihang Wu, Toshimitsu Hamasaki","doi":"10.1002/sim.70328","DOIUrl":"https://doi.org/10.1002/sim.70328","url":null,"abstract":"<p><p>Randomized clinical trials are the gold standard for evaluating the benefits and harms of interventions, though they often fail to provide the necessary evidence to inform medical decision-making. Primary reasons are failure to recognize the most important questions for informing clinical practice, and that traditional approaches do not directly address these most important questions, and subsequently not using these most important questions as the motivation for the design, monitoring, analysis, and reporting of clinical trials. The standard approach of analyzing one outcome at a time fails to incorporate associations between or the cumulative nature of multiple outcomes in individual patients, suffers from competing risk complexities during interpretation of individual outcomes, fails to recognize important gradations of patient-centric responses, and since efficacy and safety analyses are often conducted on different populations, benefit:risk estimands and generalizability are unclear. Cardiovascular event prevention trials typically utilize: (1) major adverse cardiovascular events (MACE), for example, stroke, myocardial infarction, and death as the primary endpoint, which fails to recognize multiple events or the differential importance of events, and (2) relative risk models which rely on robustness-challenging modeling assumptions and are contraindicated in benefit:risk and multiple outcome evaluation. The Desirability Of Outcome Ranking (DOOR) is a paradigm for the design, data monitoring, analysis, interpretation, and reporting of clinical trials based on comprehensive patient-centric benefit:risk evaluation, developed to address these issues and advance clinical trial science. The rationale and the methodology for the design and analyses for the DOOR paradigm are described. The methods are illustrated using an example. Freely available online tools for the design and analysis of studies implementing the DOOR are provided.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70328"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146166817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}