A Novel Method for Inserting Dose Levels Mid-Trial in Early-Phase Oncology Combination Studies
Matthew George, Ian Wadsworth, Pavel Mozgunov
Statistics in Medicine 45(3-5): e70417 (2026). doi:10.1002/sim.70417

The use of combination treatments in early-phase oncology trials is growing. The objective of these trials is to search for the maximum tolerated dose combination from a predefined set. However, cases in which the initial set of combinations contains none close to the target toxicity pose a significant challenge, and current solutions are typically ad hoc and may bring practical difficulties. We propose a novel method for inserting dose levels mid-trial, built around a search for the contour that partitions the dose space into combinations with true toxicity above and below the target. Establishing this contour with a given degree of certainty suggests that no existing combination is close to the target toxicity, which triggers an insertion. We examine our approach in a comprehensive simulation study applied to the PIPE design and the two-dimensional Bayesian logistic regression model (BLRM), though any model-based or model-assisted design is a suitable candidate. Our results demonstrate that, on average, the insertion method can increase the probability of selecting combinations close to the target toxicity without increasing the probability of subtherapeutic or overly toxic recommendations.
Probabilistic Clustering Using Multivariate Growth Mixture Model in Clinical Settings - A Scleroderma Example
Ji Soo Kim, Yizhen Xu, Rachel S Wallwork, Laura K Hummers, Ami A Shah, Scott L Zeger
Statistics in Medicine 45(3-5): e70450 (2026). doi:10.1002/sim.70450
Background: Scleroderma (systemic sclerosis; SSc) is a chronic autoimmune disease known for wide heterogeneity in patients' disease progression in multiple organ systems. Our goal is to guide clinical care by real-time classification of patients into clinically interpretable subpopulations based on their baseline characteristics and the temporal patterns of their disease progression.
Methods: A Bayesian multivariate growth mixture model was fit to identify subgroups of patients from the Johns Hopkins Scleroderma Center Research Registry who share similar lung function trajectories. We jointly modeled forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) as pulmonary outcomes for 289 patients with SSc and anti-topoisomerase 1 antibodies and developed a framework to sequentially update class membership probabilities for any given patient based on her accumulating data.
Results: We identified a "stable" group of 150 patients for whom both biomarkers changed little from the date of disease onset over the next 10 years, and a "progressor" group of 139 patients who, on average, experienced a clinically significant decline in both measures starting soon after disease onset. For any given patient at any given time, our algorithm calculates the probability of belonging to the progressor group using both baseline characteristics and the patient's longitudinal FVC and DLCO observations.
Conclusions: Our method calculates the probability of being a fast progressor at baseline when no FVC and DLCO are observed, then sequentially updates it as more information becomes available. This sequential integration of patient data and classification of her disease trajectory has the potential to improve clinical decisions and ultimately patient outcomes.
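As a schematic of the sequential updating step described above (not the paper's full Bayesian multivariate growth mixture model), the sketch below updates the probability of membership in a "progressor" class as longitudinal observations arrive, assuming two known mean trajectories and Gaussian measurement error; the trajectory values, noise level, and prior are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def progressor_probability(times, fvc_obs, prior=0.5, sigma=6.0):
    """Posterior P(progressor | FVC observations so far) in a two-class model.

    Hypothetical mean trajectories (% predicted FVC):
      stable:     flat at 85
      progressor: 85 minus 4 points per year
    """
    t = np.asarray(times, dtype=float)
    mu_stable = np.full_like(t, 85.0)
    mu_prog = 85.0 - 4.0 * t
    ll_stable = norm.logpdf(fvc_obs, loc=mu_stable, scale=sigma).sum()
    ll_prog = norm.logpdf(fvc_obs, loc=mu_prog, scale=sigma).sum()
    log_odds = np.log(prior / (1 - prior)) + ll_prog - ll_stable
    return 1.0 / (1.0 + np.exp(-log_odds))

# Sequentially update as visits accumulate for one (hypothetical) patient.
visit_times = [0.0, 1.0, 2.0, 3.0]
fvc_values = [86.0, 80.0, 77.0, 71.0]
for k in range(1, len(visit_times) + 1):
    p = progressor_probability(visit_times[:k], fvc_values[:k])
    print(f"after visit {k}: P(progressor) = {p:.2f}")
```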
{"title":"Probabilistic Clustering Using Multivariate Growth Mixture Model in Clinical Settings-A Scleroderma Example.","authors":"Ji Soo Kim, Yizhen Xu, Rachel S Wallwork, Laura K Hummers, Ami A Shah, Scott L Zeger","doi":"10.1002/sim.70450","DOIUrl":"10.1002/sim.70450","url":null,"abstract":"<p><strong>Background: </strong>Scleroderma (systemic sclerosis; SSc) is a chronic autoimmune disease known for wide heterogeneity in patients' disease progression in multiple organ systems. Our goal is to guide clinical care by real-time classification of patients into clinically interpretable subpopulations based on their baseline characteristics and the temporal patterns of their disease progression.</p><p><strong>Methods: </strong>A Bayesian multivariate growth mixture model was fit to identify subgroups of patients from the Johns Hopkins Scleroderma Center Research Registry who share similar lung function trajectories. We jointly modeled forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) as pulmonary outcomes for 289 patients with SSc and anti-topoisomerase 1 antibodies and developed a framework to sequentially update class membership probabilities for any given patient based on her accumulating data.</p><p><strong>Results: </strong>We identified a \"stable\" group of 150 patients for whom both biomarkers changed little from the date of disease onset over the next 10 years, and a \"progressor\" group of 139 patients that, on average, experienced a clinically significant decline in both measures starting soon after disease onset. For any given patient at any given time, our algorithm calculates the probability of belonging to the progressor group using both baseline characteristics and the patient's longitudinal FVC and DLCO observations.</p><p><strong>Conclusions: </strong>Our method calculates the probability of being a fast progressor at baseline when no FVC and DLCO are observed, then sequentially updates it as more information becomes available. This sequential integration of patient data and classification of her disease trajectory has the potential to improve clinical decisions and ultimately patient outcomes.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70450"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12904757/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146195722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Sample Size Calculations for External Validation Studies of Risk Prediction Models
Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard D Riley
Statistics in Medicine 45(3-5): e70389 (2026). doi:10.1002/sim.70389
Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, due to the finite samples of previous studies, our knowledge of true model performance in the target population is uncertain, and so choosing fixed values represents an incomplete picture. Moreover, for net benefit (NB) as a measure of clinical utility, the relevance of conventional precision-based inference is doubtful. In this work, we propose a general Bayesian framework for multi-criteria sample size considerations for prediction models for binary outcomes. For statistical metrics of performance (e.g., discrimination and calibration), we propose sample size rules that target a desired expected precision or a desired assurance probability that the precision criteria will be satisfied. For NB, we propose rules based on Optimality Assurance (the probability that the planned study correctly identifies the optimal strategy) and Value of Information (VoI) analysis, which quantifies the expected gain in NB from learning about model performance in a validation study of a given size. We showcase these developments in a case study on the validation of a risk prediction model for deterioration among hospitalized COVID-19 patients. Compared to conventional sample size calculation methods, the Bayesian approach requires explicit quantification of uncertainty around model performance and thereby enables flexible sample size rules based on expected precision, assurance probabilities, and VoI. In our case study, calculations based on VoI for NB suggest that considerably smaller sample sizes are required than when focusing on the precision of calibration metrics. This approach is implemented in the accompanying software.
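To make the assurance idea concrete, the sketch below (my illustration, not the authors' software) simulates the assurance probability that a validation study of size n yields a sufficiently narrow 95% CI for the outcome event proportion, with uncertainty about the true proportion expressed through a Beta prior. The prior, target width, and candidate sizes are arbitrary placeholders; the paper's criteria cover discrimination, calibration, and NB rather than this simple proportion.

```python
import numpy as np

def assurance_ci_width(n, prior_a=20, prior_b=80, target_width=0.10,
                       n_sim=20000, seed=0):
    """Probability (over the prior) that the 95% Wald CI for the event
    proportion from a validation study of size n is narrower than target_width."""
    rng = np.random.default_rng(seed)
    p_true = rng.beta(prior_a, prior_b, size=n_sim)      # uncertainty about true performance
    events = rng.binomial(n, p_true)
    p_hat = events / n
    width = 2 * 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)  # 95% Wald CI width
    return (width <= target_width).mean()

# Assurance rises with n; pick the smallest n meeting the desired assurance.
for n in (200, 300, 400, 500):
    print(n, round(assurance_ci_width(n), 3))
```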
{"title":"Bayesian Sample Size Calculations for External Validation Studies of Risk Prediction Models.","authors":"Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard D Riley","doi":"10.1002/sim.70389","DOIUrl":"10.1002/sim.70389","url":null,"abstract":"<p><p>Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, due to the finite samples of previous studies, our knowledge of true model performance in the target population is uncertain, and so choosing fixed values represents an incomplete picture. As well, for net benefit (NB) as a measure of clinical utility, the relevance of conventional precision-based inference is doubtful. In this work, we propose a general Bayesian framework for multi-criteria sample size considerations for prediction models for binary outcomes. For statistical metrics of performance (e.g., discrimination and calibration), we propose sample size rules that target desired expected precision or desired assurance probability that the precision criteria will be satisfied. For NB, we propose rules based on Optimality Assurance (the probability that the planned study correctly identifies the optimal strategy) and Value of Information (VoI) analysis, which quantifies the expected gain in NB by learning about model performance from a validation study of a given size. We showcase these developments in a case study on the validation of a risk prediction model for deterioration among hospitalized COVID-19 patients. Compared to conventional sample size calculation methods, a Bayesian approach requires explicit quantification of uncertainty around model performance, and thereby enables flexible sample size rules based on expected precision, assurance probabilities, and VoI. In our case study, calculations based on VoI for NB suggest considerably lower sample sizes are required than when focusing on the precision of calibration metrics. This approach is implemented in the accompanying software.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70389"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12894519/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146166819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Missing Value Imputation With Adversarial Random Forests - MissARF
Pegah Golchian, Jan Kapar, David S Watson, Marvin N Wright
Statistics in Medicine 45(3-5): e70379 (2026). doi:10.1002/sim.70379
Handling missing values is a common challenge in biostatistical analyses, typically addressed by imputation methods. We propose a novel, fast, and easy-to-use imputation method called missing value imputation with adversarial random forests (MissARF), based on generative machine learning, that provides both single and multiple imputation. MissARF employs an adversarial random forest (ARF) for density estimation and data synthesis. To impute a missing value of an observation, we condition on the non-missing values and sample from the estimated conditional distribution generated by the ARF. Our experiments demonstrate that MissARF performs comparably to state-of-the-art single and multiple imputation methods in terms of imputation quality, while offering fast runtimes and no additional cost for multiple imputation.
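MissARF's density estimator is an adversarial random forest; the sketch below is only a stand-in that illustrates the underlying principle of "condition on the observed values and sample from an estimated conditional distribution", using a multivariate Gaussian fitted to complete cases in place of an ARF. The function names and the Gaussian assumption are mine, not the package's.

```python
import numpy as np

def conditional_gaussian_impute(X, n_imputations=5, seed=0):
    """Impute NaNs by drawing from the conditional Gaussian given observed entries.

    Schematic distribution-based imputation: fit mean/covariance on complete
    cases, then for each row sample the missing coordinates from their
    conditional distribution given the observed coordinates.
    """
    rng = np.random.default_rng(seed)
    complete = X[~np.isnan(X).any(axis=1)]
    mu, cov = complete.mean(axis=0), np.cov(complete, rowvar=False)
    imputations = [X.copy() for _ in range(n_imputations)]
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        if not obs.any():  # nothing observed: fall back to the marginal
            cond_mu, cond_cov = mu[miss], cov[np.ix_(miss, miss)]
        else:
            # Conditional Gaussian: mu_m|o = mu_m + C_mo C_oo^{-1} (x_o - mu_o)
            c_oo_inv = np.linalg.inv(cov[np.ix_(obs, obs)])
            cond_mu = mu[miss] + cov[np.ix_(miss, obs)] @ c_oo_inv @ (row[obs] - mu[obs])
            cond_cov = cov[np.ix_(miss, miss)] - cov[np.ix_(miss, obs)] @ c_oo_inv @ cov[np.ix_(obs, miss)]
        for m in range(n_imputations):  # multiple imputation = repeated draws
            imputations[m][i, miss] = rng.multivariate_normal(cond_mu, cond_cov)
    return imputations

# Example: 200 rows, 3 correlated variables, ~15% of entries set to missing.
rng = np.random.default_rng(1)
data = rng.multivariate_normal([0, 0, 0], [[1, .6, .3], [.6, 1, .5], [.3, .5, 1]], size=200)
data[rng.random(data.shape) < 0.15] = np.nan
imps = conditional_gaussian_impute(data)
print(np.isnan(imps[0]).sum())  # 0 missing values remain in each completed dataset
```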
{"title":"Missing Value Imputation With Adversarial Random Forests-MissARF.","authors":"Pegah Golchian, Jan Kapar, David S Watson, Marvin N Wright","doi":"10.1002/sim.70379","DOIUrl":"10.1002/sim.70379","url":null,"abstract":"<p><p>Handling missing values is a common challenge in biostatistical analyses, typically addressed by imputation methods. We propose a novel, fast, and easy-to-use imputation method called missing value imputation with adversarial random forests (MissARF), based on generative machine learning, that provides both single and multiple imputation. MissARF employs adversarial random forest (ARF) for density estimation and data synthesis. To impute a missing value of an observation, we condition on the non-missing values and sample from the estimated conditional distribution generated by ARF. Our experiments demonstrate that MissARF performs comparably to state-of-the-art single and multiple imputation methods in terms of imputation quality and fast runtime with no additional costs for multiple imputation.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70379"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12871009/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Empirical Assessment of the Cost of Dichotomization of the Outcome of Clinical Trials
Erik W van Zwet, Frank E Harrell, Stephen J Senn
Statistics in Medicine 45(3-5): e70402 (2026). doi:10.1002/sim.70402

We have studied 21 435 unique randomized controlled trials (RCTs) from the Cochrane Database of Systematic Reviews (CDSR). Of these trials, 7224 (34%) have a continuous (numerical) outcome and 14 211 (66%) have a binary outcome. We find that trials with a binary outcome have larger sample sizes on average, but also larger standard errors and fewer statistically significant results. We conclude that researchers tend to increase the sample size to compensate for the low information content of binary outcomes, but not sufficiently. In many cases, the binary outcome is the result of dichotomizing a continuous outcome, which is sometimes referred to as "responder analysis". In those cases, the loss of information is avoidable. Burdening more participants than necessary is wasteful, costly, and unethical. We provide a method to convert a sample size calculation for the comparison of two proportions into one for the comparison of the means of the underlying continuous outcomes. This demonstrates how much the sample size could be reduced if the outcome were not dichotomized. We also provide a method to calculate the loss of information after dichotomization. We apply this method to all the trials from the CDSR with a binary outcome and estimate that, on average, only about 60% of the information is retained after dichotomization. We provide R code and a shiny app at https://vanzwet.shinyapps.io/info_loss/ to do these calculations. We hope that quantifying the loss of information will discourage researchers from dichotomizing continuous outcomes. Instead, we recommend that they "model continuously but interpret dichotomously". For example, they might present the "percentage achieving clinically meaningful improvement" derived from a continuous analysis rather than by dichotomizing the raw data.
Causal Inference With Survey Data: A Robust Framework for Propensity Score Weighting in Probability and Non-Probability Samples
Wei Liang, Changbao Wu
Statistics in Medicine 45(3-5): e70420 (2026). doi:10.1002/sim.70420

Confounding bias and selection bias are two major challenges in causal inference with observational data. While numerous methods have been developed to mitigate confounding bias, they often assume that the data are representative of the study population and ignore the potential selection bias introduced during data collection. In this paper, we propose a unified weighting framework, survey-weighted propensity score weighting, to simultaneously address both confounding and selection biases when the observational dataset is a probability survey sample from a finite population, which is itself viewed as a random sample from the target superpopulation. The proposed method yields a doubly robust inferential procedure for a class of population weighted average treatment effects. We further extend our results to non-probability observational data where the sampling mechanism is unknown but auxiliary information on the confounding variables is available from an external probability sample. We focus on practically important scenarios in which the confounders are only partially observed in the external data. Our analysis reveals that the key variables in the external data are those related to both treatment effect heterogeneity and the selection mechanism. We also discuss how to combine auxiliary information from multiple reference probability samples. Monte Carlo simulations and an application to a real-world non-probability observational dataset demonstrate the superiority of the proposed methods over standard propensity score weighting approaches.
A Tutorial on Implementing Statistical Methods for Estimating Excess Death With a Case Study and Simulations on Estimating Excess Death in the Post-COVID-19 United States
Lillian Rountree, Lauren Zimmermann, Lucy Teed, Daniel M Weinberger, Bhramar Mukherjee
Statistics in Medicine 45(3-5): e70396 (2026). doi:10.1002/sim.70396
Excess death estimation, defined as the difference between the observed and expected death counts, is a popular technique for assessing the overall death toll of a public health crisis. The expected death count is defined as the expected number of deaths in the counterfactual scenario in which prevailing conditions continued and the public health crisis did not occur. While excess death is frequently obtained by estimating the expected number of deaths and subtracting it from the observed number, some methods calculate this difference directly, based on historical mortality data and direct predictors of excess deaths. This tutorial provides guidance to researchers on the application of four popular methods for estimating excess death: the World Health Organization's Bayesian model; The Economist's gradient boosting algorithm; Acosta and Irizarry's quasi-Poisson model; and the Institute for Health Metrics and Evaluation's ensemble model. We begin with explanations of the mathematical formulation of each method and then demonstrate how to code each method in R, applying the code to a case study estimating excess death in the United States for the post-pandemic period of 2022-2024. An additional simulation study estimating excess death for three different scenarios and three different extrapolation periods further demonstrates general trends in performance across methods; together, these two studies show how the estimates from these methods and their accuracy vary widely depending on the choice of input covariates, reference period, extrapolation period, and tuning parameters. Caution should be exercised when extrapolating to estimate excess death, particularly when the reference period of pre-event conditions is temporally distant (> 5 years) from the period of interest. Rather than committing to one method under one setting, we advocate using multiple excess death methods in tandem, comparing and synthesizing their results, and conducting thorough sensitivity analyses as best practice for estimating excess death for a period of interest. We also call for more detailed simulation studies and benchmark datasets to better understand the accuracy and comparative performance of methods for estimating excess death.
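To ground the basic calculation that all four methods elaborate on, the sketch below fits a Poisson regression with a linear trend and monthly seasonality to a pre-event reference period and takes excess deaths as observed minus expected over the extrapolation period. It is written in Python rather than R and uses simulated monthly counts; it is a bare-bones illustration in the spirit of the quasi-Poisson approach, not the tutorial's code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated monthly death counts: trend + seasonality, plus a shock from March 2020 on.
rng = np.random.default_rng(0)
months = pd.date_range("2015-01-01", "2022-12-01", freq="MS")
t = np.arange(len(months))
baseline = 50000 + 20 * t + 4000 * np.cos(2 * np.pi * t / 12)
shock = np.where(months >= "2020-03-01", 6000, 0)
df = pd.DataFrame({"date": months,
                   "deaths": rng.poisson(baseline + shock),
                   "t": t,
                   "month": months.month})

# Fit on the pre-event reference period only, then extrapolate forward.
ref = df[df.date < "2020-03-01"]
model = smf.glm("deaths ~ t + C(month)", data=ref,
                family=sm.families.Poisson()).fit(scale="X2")  # quasi-Poisson dispersion
df["expected"] = model.predict(df)

post = df[df.date >= "2020-03-01"]
print(int((post.deaths - post.expected).sum()))  # estimated excess deaths in the post period
```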
{"title":"A Tutorial on Implementing Statistical Methods for Estimating Excess Death With a Case Study and Simulations on Estimating Excess Death in the Post-COVID-19 United States.","authors":"Lillian Rountree, Lauren Zimmermann, Lucy Teed, Daniel M Weinberger, Bhramar Mukherjee","doi":"10.1002/sim.70396","DOIUrl":"https://doi.org/10.1002/sim.70396","url":null,"abstract":"<p><p>Excess death estimation, defined as the difference between the observed and expected death counts, is a popular technique for assessing the overall death toll of a public health crisis. The expected death count is defined as the expected number of deaths in the counterfactual scenario where prevailing conditions continued and the public health crisis did not occur. While excess death is frequently obtained by estimating the expected number of deaths and subtracting it from the observed number, some methods calculate this difference directly, based on historic mortality data and direct predictors of excess deaths. This tutorial provides guidance to researchers on the application of four popular methods for estimating excess death: the World Health Organization's Bayesian model; The Economist's gradient boosting algorithm; Acosta and Irizarry's quasi-Poisson model; and the Institute for Health Metrics and Evaluation's ensemble model. We begin with explanations of the mathematical formulation of each method and then demonstrate how to code each method in R, applying the code for a case study estimating excess death in the United States for the post-pandemic period of 2022-2024. An additional simulation study estimating excess death for three different scenarios and three different extrapolation periods further demonstrates general trends in performance across methods; together, these two studies show how the estimates by these methods and their accuracy vary widely depending on the choice of input covariates, reference period, extrapolation period, and tuning parameters. Caution should be exercised when extrapolating for estimating excess death, particularly in cases where the reference period of pre-event conditions is temporally distant (> 5 years) from the period of interest. In place of committing to one method under one setting, we advocate for using multiple excess death methods in tandem, comparing and synthesizing their results and conducting thorough sensitivity analyses as best practice for estimating excess death for a period of interest. We also call for more detailed simulation studies and benchmark datasets to better understand the accuracy and comparative performance of methods estimating excess death.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70396"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146166841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to \"Model-Robust Standardization in Cluster-Randomized Trials\".","authors":"","doi":"10.1002/sim.70447","DOIUrl":"https://doi.org/10.1002/sim.70447","url":null,"abstract":"","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70447"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146228758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Generalized Harmonic Mean for p-Values: Combining Dependent and Independent Tests
Zhengbang Li, Xinjie Zhou
Statistics in Medicine 45(3-5): e70439 (2026). doi:10.1002/sim.70439

In medical research, particularly in fields such as genomics, multi-center clinical trials, and meta-analysis, effectively combining the p-values from multiple related hypothesis tests has long been a challenging statistical issue. To address this problem and enhance the statistical power of such combined analyses, this study proposes a generalized harmonic mean for p-values, GHMP(ξ), as a combination method and builds two kinds of combination tests on this framework. The first kind of test is designed for applications with small significance levels and has more lenient conditions for accommodating correlations, making it suitable for the complex dependency structures commonly found in practice. The second kind of test introduces a novel high-order tail approximation technique based on stable distribution theory, which can more accurately estimate extreme tail probabilities at large significance levels under independent or weakly correlated conditions. Extensive simulation experiments show that both kinds of tests perform robustly across various configurations, with statistical power not inferior to the traditional Cauchy combination test (CCT) and minimum p-value (MinP) methods, and demonstrate superior detection capability in several scenarios. Additionally, GHMP(ξ) is computationally efficient and has been empirically validated on real genetic data. These characteristics make it a reliable and practical analytical tool for high-dimensional medical research, such as genome-wide association studies (GWAS) and large-scale meta-analyses.
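The GHMP(ξ) statistic itself is not defined in the abstract, so no attempt is made to reproduce it here. For context, the sketch below implements the two standard comparators mentioned above: the Cauchy combination test and the plain (unweighted) harmonic mean of p-values; the harmonic mean is reported as a raw statistic only, since turning it into a calibrated p-value requires additional theory.

```python
import numpy as np

def cauchy_combination(pvalues, weights=None):
    """Cauchy combination test (CCT): robust to dependence among the tests."""
    p = np.asarray(pvalues, dtype=float)
    w = np.full(p.shape, 1.0 / p.size) if weights is None else np.asarray(weights)
    t = np.sum(w * np.tan((0.5 - p) * np.pi))
    return 0.5 - np.arctan(t) / np.pi      # approximate combined p-value

def harmonic_mean_p(pvalues):
    """Unweighted harmonic mean of p-values (raw statistic, not a calibrated p-value)."""
    p = np.asarray(pvalues, dtype=float)
    return p.size / np.sum(1.0 / p)

# Example: ten tests, one strong signal among nine nulls.
pvals = [0.0004, 0.52, 0.31, 0.77, 0.45, 0.12, 0.88, 0.64, 0.29, 0.71]
print(cauchy_combination(pvals))   # small combined p-value driven by the strong signal
print(harmonic_mean_p(pvals))      # harmonic mean is dominated by the smallest p-values
```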
{"title":"<ArticleTitle xmlns:ns0=\"http://www.w3.org/1998/Math/MathML\">The Generalized Harmonic Mean for <ns0:math> <ns0:semantics><ns0:mrow><ns0:mi>p</ns0:mi></ns0:mrow> <ns0:annotation>$$ p $$</ns0:annotation></ns0:semantics> </ns0:math> -Values: Combining Dependent and Independent Tests.","authors":"Zhengbang Li, Xinjie Zhou","doi":"10.1002/sim.70439","DOIUrl":"https://doi.org/10.1002/sim.70439","url":null,"abstract":"<p><p>In medical research, particularly in fields such as genomics, multi-center clinical trials, and meta-analysis, effectively combining the p-values from multiple related hypothesis tests has always been a challenging statistical issue. To address this problem and enhance the statistical power of comprehensive analysis, this study proposes a generalized harmonic mean for p-values (GHMP( <math> <semantics><mrow><mi>ξ</mi></mrow> <annotation>$$ xi $$</annotation></semantics> </math> )) combination method and builds two kinds of combination tests based on this framework. The first kind of test is designed for applications with small significance levels and has more lenient conditions for adapting to correlations, making it suitable for the complex dependency structures commonly found in actual research. The second kind of test introduces a novel high-order tail approximation technique based on stable distribution theory, which can more accurately estimate the extreme tail probabilities at large significance levels under independent or weakly correlated conditions. Extensive simulation experiments show that both kinds of tests perform robustly across various configurations, with statistical power not inferior to the traditional Cauchy combination test (CCT) and minimum p-value (MinP) methods, and demonstrate superior detection capabilities in several scenarios. Additionally, GHMP( <math> <semantics><mrow><mi>ξ</mi></mrow> <annotation>$$ xi $$</annotation></semantics> </math> ) has high computational efficiency and has been empirically validated in real genetic data. These characteristics make it a reliable and practical analytical tool for high-dimensional medical research, such as genome-wide association studies (GWAS) and large-scale meta-analysis.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70439"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146182708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrating Omics and Pathological Imaging Data for Cancer Prognosis via a Deep Neural Network-Based Cox Model
Jingmao Li, Shuangge Ma
Statistics in Medicine 45(3-5): e70435 (2026). doi:10.1002/sim.70435

Modeling prognosis has unique significance in cancer research. For this purpose, omics data have been routinely used. In a series of recent studies, pathological imaging data derived from biopsies have also been shown to be informative. Motivated by the complementary information contained in omics and pathological imaging data, we examine integrating them under a Cox modeling framework. The two types of data have distinct properties: for omics variables, which are more actionable and demand stronger interpretability, we model their effects parametrically; for pathological imaging features, which are not actionable and do not have lucid interpretations, we model their effects nonparametrically for greater flexibility and better prediction performance. Specifically, we adopt deep neural networks (DNNs) for the nonparametric component, considering their advantages over regression models in accommodating nonlinearity and providing better prediction. As both omics and pathological imaging data are high-dimensional and expected to contain noise, we apply penalization to select relevant variables and regularize estimation. Unlike some existing studies, we pay particular attention to the overlapping information contained in the two types of data. Numerical investigations are carefully carried out. In the analysis of TCGA data, sensible selection and superior prediction performance are observed, demonstrating the practical utility of the proposed analysis.