Modelling and Predicting Population-Level Growth With Individual-Level Information
Tuuli Kauppala, Tuomo Susi, Sangita Kulathinal

The development of height, weight, and body mass index (BMI) in children has been the subject of considerable interest due to secular changes in growth patterns, such as increases in height and rising obesity rates. Predicting growth in a target population is particularly challenging when the population comprises individuals with and without past growth data. In this study, we present three approaches for the joint prediction of height and weight in that situation. The predictive performance of each approach is evaluated using a range of measures that assess different properties of the prediction distributions. We also compare the approaches in terms of their clinical relevance, particularly prediction accuracy. The developed prediction approaches vary in their use of past growth data. We predict growth for a target population of children aged 4-11 years in 2021, residing in three municipalities in Finland. We employ longitudinal register data on height and weight, collected from children aged 2-11 years between 2014 and 2020 in these municipalities, to construct a Bayesian hierarchical linear model (HLM) for growth prediction. Additionally, we estimate posterior unconditional distributions of height, weight, and BMI for within-sample model validation. The inclusion of individual-level data in the predictions reduced the divergence from observed measurements, particularly for weight and BMI. This is important given that the distributions of these measurements become increasingly skewed with age. Incorporating individual-level information is also beneficial for child-specific predictions. Our study highlights the importance of multiple prediction checks to understand the flaws and strengths of each prediction approach.
{"title":"Modelling and Predicting Population-Level Growth With Individual-Level Information.","authors":"Tuuli Kauppala, Tuomo Susi, Sangita Kulathinal","doi":"10.1002/sim.70421","DOIUrl":"10.1002/sim.70421","url":null,"abstract":"<p><p>The development of height, weight, and body mass index (BMI) in children has been the subject of considerable interest due to secular changes in growth patterns, such as increases in height and rising obesity rates. Predicting growth in a target population is particularly challenging when the population comprises of individuals with and without past growth data. In this study, we present three approaches for the joint prediction of height and weight in that situation. The predictive performance of each approach is evaluated using a range of measures that assess different properties of the prediction distributions. We also compare the approaches to interpret their clinical relevance, particularly in terms of prediction accuracy. The developed prediction approaches vary in their use of past growth data. We predict growth for a target population of children aged 4-11 years in 2021, residing in three municipalities in Finland. We employ longitudinal register data on height and weight, collected from children aged 2-11 years between 2014 and 2020 in these municipalities to construct a Bayesian hierarchical linear model (HLM) for growth prediction. Additionally, we estimate posterior unconditional distributions of height, weight, and BMI for within-sample model validation. The inclusion of individual-level data in the predictions reduced the divergence from observed measurements, particularly for weight and BMI. This is important given the skewed distribution of the measurements with increasing age. Incorporating individual-level information is also beneficial for child-specific predictions. Our study highlights the importance of multiple prediction checks to understand the flaws and strengths of each prediction approach.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70421"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12926727/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147272057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Anticipation Effect in Stepped Wedge Cluster Randomized Trials
Hao Wang, Xinyuan Chen, Katherine R Courtright, Scott D Halpern, Michael O Harhay, Monica Taljaard, Fan Li
In stepped wedge cluster randomized trials (SW-CRTs), the intervention is rolled out to clusters over multiple periods. A standard approach for analyzing SW-CRTs uses the linear mixed model, in which the treatment effect is present only after treatment adoption, under the assumption of no anticipation. This assumption, however, may not always hold in practice, because stakeholders, providers, or individuals who are aware of the treatment adoption timing (especially when blinding is challenging or infeasible) can inadvertently change their behaviors in anticipation of the forthcoming intervention. We provide an analytical framework to address the anticipation effect in SW-CRTs and study its impact. We derive expectations of the estimators based on a collection of linear mixed models and demonstrate that when the anticipation effect is ignored, these estimators give biased estimates of the treatment effect. We also provide updated sample size formulas that explicitly account for anticipation effects, exposure-time heterogeneity, or both in SW-CRTs and illustrate their impact on study power. Through simulation studies and empirical analyses, we compare the treatment effect estimators with and without adjusting for anticipation, and provide some practical considerations.
{"title":"On Anticipation Effect in Stepped Wedge Cluster Randomized Trials.","authors":"Hao Wang, Xinyuan Chen, Katherine R Courtright, Scott D Halpern, Michael O Harhay, Monica Taljaard, Fan Li","doi":"10.1002/sim.70380","DOIUrl":"10.1002/sim.70380","url":null,"abstract":"<p><p>In stepped wedge cluster randomized trials (SW-CRTs), the intervention is rolled out to clusters over multiple periods. A standard approach for analyzing SW-CRTs utilizes the linear mixed model, where the treatment effect is only present after the treatment adoption, under the assumption of no anticipation. This assumption, however, may not always hold in practice because stakeholders, providers, or individuals who are aware of the treatment adoption timing (especially when blinding is challenging or infeasible) can inadvertently change their behaviors in anticipation of the forthcoming intervention. We provide an analytical framework to address the anticipation effect in SW-CRTs and study its impact. We derive expectations of the estimators based on a collection of linear mixed models and demonstrate that when the anticipation effect is ignored, these estimators give biased estimates of the treatment effect. We also provide updated sample size formulas that explicitly account for anticipation effects, exposure-time heterogeneity, or both in SW-CRTs and illustrate their impact on study power. Through simulation studies and empirical analyses, we compare the treatment effect estimators with and without adjusting for anticipation, and provide some practical considerations.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70380"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146150718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marginally Interpretable Spatial Logistic Regression With Bridge Processes
Changwoo J Lee, David B Dunson

When random effects are included to account for dependent observations, the odds ratio interpretation of logistic regression coefficients changes from population-averaged to subject-specific. This is unappealing in many applications, motivating a rich literature on methods that maintain the marginal logistic regression structure without random effects, such as generalized estimating equations. However, for spatial data, random effect approaches are appealing in providing a full probabilistic characterization of the data that can be used for prediction. We propose a new class of spatial logistic regression models that maintain both population-averaged and subject-specific interpretations through a novel class of bridge processes for spatial random effects. These processes are shown to have appealing computational and theoretical properties, including a scale mixture of normal representation. The new methodology is illustrated with simulations and an analysis of childhood malaria prevalence data in Gambia.
{"title":"Marginally Interpretable Spatial Logistic Regression With Bridge Processes.","authors":"Changwoo J Lee, David B Dunson","doi":"10.1002/sim.70399","DOIUrl":"10.1002/sim.70399","url":null,"abstract":"<p><p>In including random effects to account for dependent observations, the odds ratio interpretation of logistic regression coefficients is changed from population-averaged to subject-specific. This is unappealing in many applications, motivating a rich literature on methods that maintain the marginal logistic regression structure without random effects, such as generalized estimating equations. However, for spatial data, random effect approaches are appealing in providing a full probabilistic characterization of the data that can be used for prediction. We propose a new class of spatial logistic regression models that maintain both population-averaged and subject-specific interpretations through a novel class of bridge processes for spatial random effects. These processes are shown to have appealing computational and theoretical properties, including a scale mixture of normal representation. The new methodology is illustrated with simulations and an analysis of childhood malaria prevalence data in Gambia.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70399"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12991412/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Novel Method for Inserting Dose Levels Mid-Trial in Early-Phase Oncology Combination Studies
Matthew George, Ian Wadsworth, Pavel Mozgunov

The use of combination treatments in early-phase oncology trials is growing. The objective of these trials is to search for the maximum tolerated dose combination from a predefined set. However, cases in which the initial set of combinations does not contain one close to the target toxicity pose a significant challenge. Current solutions are typically ad hoc and can be difficult to implement in practice. We propose a novel method for inserting dose levels mid-trial, which features a search for the contour partitioning the dose space into combinations with toxicity truly above and below the target. Establishing this contour with a degree of certainty suggests that no combination is close to the target toxicity, triggering an insertion. We examine our approach in a comprehensive simulation study applied to the PIPE design and the two-dimensional Bayesian logistic regression model (BLRM), though any model-based or model-assisted design is an appropriate candidate. Our results demonstrate that, on average, the insertion method can increase the probability of selecting combinations close to the target toxicity without increasing the probability of subtherapeutic or toxic recommendations.
{"title":"A Novel Method for Inserting Dose Levels Mid-Trial in Early-Phase Oncology Combination Studies.","authors":"Matthew George, Ian Wadsworth, Pavel Mozgunov","doi":"10.1002/sim.70417","DOIUrl":"10.1002/sim.70417","url":null,"abstract":"<p><p>The use of combination treatments in early-phase oncology trials is growing. The objective of these trials is to search for the maximum tolerated dose combination from a predefined set. However, cases in which the initial set of combinations does not contain one close to the target toxicity pose a significant challenge. Currently, solutions are typically ad hoc and may bring practical challenges. We propose a novel method for inserting dose levels mid-trial, which features a search for the contour partitioning the dose space into combinations with toxicity truly above and below the target toxicity. Establishing this contour with a degree of certainty suggests that no combination is close to the target toxicity, triggering an insertion. We examine our approach in a comprehensive simulation study applied to the PIPE design and two-dimensional Bayesian logistic regression model (BLRM), though any model-based or model-assisted design is an appropriate candidate. Our results demonstrate that, on average, the insertion method can increase the probability of selecting combinations close to the target toxicity, without increasing the probability of subtherapeutic or toxic recommendations.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70417"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12917875/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146166837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Probabilistic Clustering Using Multivariate Growth Mixture Model in Clinical Settings - A Scleroderma Example
Ji Soo Kim, Yizhen Xu, Rachel S Wallwork, Laura K Hummers, Ami A Shah, Scott L Zeger
Background: Scleroderma (systemic sclerosis; SSc) is a chronic autoimmune disease known for wide heterogeneity in patients' disease progression in multiple organ systems. Our goal is to guide clinical care by real-time classification of patients into clinically interpretable subpopulations based on their baseline characteristics and the temporal patterns of their disease progression.
Methods: A Bayesian multivariate growth mixture model was fit to identify subgroups of patients from the Johns Hopkins Scleroderma Center Research Registry who share similar lung function trajectories. We jointly modeled forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) as pulmonary outcomes for 289 patients with SSc and anti-topoisomerase 1 antibodies and developed a framework to sequentially update class membership probabilities for any given patient based on her accumulating data.
Results: We identified a "stable" group of 150 patients for whom both biomarkers changed little from the date of disease onset over the next 10 years, and a "progressor" group of 139 patients that, on average, experienced a clinically significant decline in both measures starting soon after disease onset. For any given patient at any given time, our algorithm calculates the probability of belonging to the progressor group using both baseline characteristics and the patient's longitudinal FVC and DLCO observations.
Conclusions: Our method calculates the probability of being a fast progressor at baseline when no FVC and DLCO are observed, then sequentially updates it as more information becomes available. This sequential integration of patient data and classification of her disease trajectory has the potential to improve clinical decisions and ultimately patient outcomes.
{"title":"Probabilistic Clustering Using Multivariate Growth Mixture Model in Clinical Settings-A Scleroderma Example.","authors":"Ji Soo Kim, Yizhen Xu, Rachel S Wallwork, Laura K Hummers, Ami A Shah, Scott L Zeger","doi":"10.1002/sim.70450","DOIUrl":"10.1002/sim.70450","url":null,"abstract":"<p><strong>Background: </strong>Scleroderma (systemic sclerosis; SSc) is a chronic autoimmune disease known for wide heterogeneity in patients' disease progression in multiple organ systems. Our goal is to guide clinical care by real-time classification of patients into clinically interpretable subpopulations based on their baseline characteristics and the temporal patterns of their disease progression.</p><p><strong>Methods: </strong>A Bayesian multivariate growth mixture model was fit to identify subgroups of patients from the Johns Hopkins Scleroderma Center Research Registry who share similar lung function trajectories. We jointly modeled forced vital capacity (FVC) and diffusing capacity for carbon monoxide (DLCO) as pulmonary outcomes for 289 patients with SSc and anti-topoisomerase 1 antibodies and developed a framework to sequentially update class membership probabilities for any given patient based on her accumulating data.</p><p><strong>Results: </strong>We identified a \"stable\" group of 150 patients for whom both biomarkers changed little from the date of disease onset over the next 10 years, and a \"progressor\" group of 139 patients that, on average, experienced a clinically significant decline in both measures starting soon after disease onset. For any given patient at any given time, our algorithm calculates the probability of belonging to the progressor group using both baseline characteristics and the patient's longitudinal FVC and DLCO observations.</p><p><strong>Conclusions: </strong>Our method calculates the probability of being a fast progressor at baseline when no FVC and DLCO are observed, then sequentially updates it as more information becomes available. This sequential integration of patient data and classification of her disease trajectory has the potential to improve clinical decisions and ultimately patient outcomes.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70450"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12904757/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146195722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Preference-Informed Cluster Randomized Design for Pragmatic Clinical Trials
Yuwei Cheng, Adriana Tremoulet, Sonia Jain

Cluster randomized trials (CRTs), in which entire clusters of subjects are randomized to treatment arms, are widely used in pragmatic trials to evaluate interventions under real-world conditions. However, CRTs are particularly vulnerable to treatment non-adherence, especially when cluster-level preferences lead subjects in clusters to deviate from their assigned treatment. Such deviations can reduce power, introduce bias, and compromise generalizability if not properly addressed. This research is directly motivated by a planned multi-center trial in Kawasaki Disease patients at high risk for coronary artery abnormalities, in which institutional treatment preferences influence both willingness to participate and adherence to the assigned treatment. To address this issue, we propose a Bayesian hierarchical model under a Preference-Informed Cluster Randomized Design (PICRD). This model explicitly incorporates cluster-level treatment switching into the analysis rather than excluding non-willing or non-adherent clusters. We conduct a simulation study to evaluate the performance of the PICRD model across a range of treatment effect sizes and switching proportions. Results demonstrate that the PICRD model consistently outperforms per-protocol analyses by maintaining higher power for the main treatment effect, producing narrower 95% credible intervals, and yielding more stable bias and root mean square error in the presence of substantial non-adherence. By explicitly modeling preference within a Bayesian hierarchical framework, the PICRD approach provides a flexible and robust solution for CRTs conducted in pragmatic settings, where full adherence to the randomized assignment is often unrealistic.
{"title":"Preference-Informed Cluster Randomized Design for Pragmatic Clinical Trials.","authors":"Yuwei Cheng, Adriana Tremoulet, Sonia Jain","doi":"10.1002/sim.70426","DOIUrl":"10.1002/sim.70426","url":null,"abstract":"<p><p>Cluster randomized trials (CRTs), in which entire clusters of subjects are randomized to treatment arms, are widely used in pragmatic trials to evaluate interventions under real-world conditions. However, CRTs are particularly vulnerable to treatment non-adherence, especially when cluster-level preferences lead subjects in clusters to deviate from their assigned treatment. Such deviations can reduce power, introduce bias, and compromise generalizability if not properly addressed. This research is directly motivated by a planned multi-center trial in Kawasaki Disease patients with high risk for coronary artery abnormalities, in which institutional treatment preferences influence both willingness to participate and adhere. To address this issue, we propose a Bayesian hierarchical model under a Preference-Informed Cluster Randomized Design (PICRD). This model explicitly incorporates cluster-level treatment switching into the analysis rather than excluding non-willing or non-adherent clusters. We conduct a simulation study to evaluate the performance of the PICRD model across a range of treatment effect sizes and switching proportions. Results demonstrate that the PICRD model consistently outperforms per-protocol analyses by maintaining higher power for the main treatment effect, producing narrower 95% credible intervals, and yielding more stable bias and root mean square error in the presence of substantial non-adherence. By explicitly modeling preference within a Bayesian hierarchical framework, the PICRD approach provides a flexible and robust solution for CRTs conducted in pragmatic settings when willingness to accept randomization assignment or adherence to randomization is often unrealistic.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70426"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146126462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Concave Pairwise Fusion Approach to Heterogeneous Q-Learning for Dynamic Treatment Regimes
Jubo Sun, Wensheng Zhu, Guozhe Sun

A dynamic treatment regime is a sequence of decision rules that map available history information to a treatment option at each decision point. The optimal dynamic treatment regime seeks to make these decisions so as to maximize the expected outcome of interest. Most existing methods assume population homogeneity. In many complex applications, ignoring latent heterogeneous structures may compromise estimation, highlighting the need to account for such structures when estimating optimal treatment regimes. We propose a heterogeneous Q-learning method that estimates optimal dynamic treatment regimes using a concave pairwise fusion penalized approach. The proposed method employs an alternating direction method of multipliers (ADMM) algorithm to solve the concave pairwise fusion penalized least squares problem at each stage. Simulation studies demonstrate that our proposed method outperforms the standard Q-learning method, and it is further illustrated through a real data analysis from the China Rural Hypertension Control Project (CRHCP) study group.
{"title":"A Concave Pairwise Fusion Approach to Heterogeneous Q-Learning for Dynamic Treatment Regimes.","authors":"Jubo Sun, Wensheng Zhu, Guozhe Sun","doi":"10.1002/sim.70415","DOIUrl":"https://doi.org/10.1002/sim.70415","url":null,"abstract":"<p><p>A dynamic treatment regime is a sequence of decision rules that map available history information to a treatment option at each decision point. The optimal dynamic treatment regime seeks to make these decisions to maximize the expected outcome of interest. Most existing methods assume population homogeneity. In many complex applications, ignoring latent heterogeneous structures may compromise estimation, highlighting the necessity of exploring heterogeneous structures during the estimation of optimal treatment regimes. We propose heterogeneous Q-learning that facilitates the estimation of optimal dynamic treatment regimes using a concave pairwise fusion penalized approach. The proposed method employs an alternating direction method of multipliers algorithm to solve the concave pairwise fusion penalized least squares problem in each stage. Simulation studies demonstrate that our proposed method outperforms the standard Q-learning method, and it is further illustrated through a real data analysis from the China Rural Hypertension Control Project (CRHCP) study group.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70415"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patient Retreat in Dose Escalation for Phase I Clinical Trials With Rare Diseases
Jialu Fang, Guosheng Yin

Phase I clinical trials aim to identify the maximum tolerated dose (MTD), a task that becomes challenging in rare diseases due to limited patient recruitment. Traditional dose-finding designs, which assign one dose per patient, require sample sizes that may be infeasible for rare-disease trials. To address these limitations, we propose the patient retreat in dose escalation (PRIDE) scheme, which integrates intra-patient dose escalation and accounts for intra-patient correlation by incorporating random effects into a Bayesian hierarchical framework. We further introduce PRIDE-FA (flexible allocation), an extension of PRIDE with a flexible allocation strategy. By allowing retreated patients to be assigned to any dose level based on trial needs, PRIDE-FA improves resource efficiency, leading to greater reductions in required sample size and trial duration. This paper incorporates random effects into established dose-finding designs, including the calibration-free odds (CFO) design, the Bayesian optimal interval (BOIN) design, and the continual reassessment method (CRM), to account for intra-patient correlation when each patient may receive multiple doses. Simulation studies demonstrate that PRIDE and PRIDE-FA significantly improve the accuracy of MTD selection, reduce the required sample size, and shorten trial duration compared to existing dose-finding methods. Together, PRIDE and PRIDE-FA provide a robust and efficient framework for phase I clinical trials in rare diseases.
{"title":"Patient Retreat in Dose Escalation for Phase I Clinical Trials With Rare Diseases.","authors":"Jialu Fang, Guosheng Yin","doi":"10.1002/sim.70409","DOIUrl":"10.1002/sim.70409","url":null,"abstract":"<p><p>Phase I clinical trials aim to identify the maximum tolerated dose (MTD), a task that becomes challenging in rare disease due to limited patient recruitment. Traditional dose-finding designs, which assign one dose per patient, require a sufficient sample size that may be infeasible for rare disease trials. To address these limitations, we propose the patient retreat in dose escalation (PRIDE) scheme, which integrates intra-patient dose escalation and considers intra-patient correlations by incorporating random effects into a Bayesian hierarchical framework. We further introduce PRIDE-FA (flexible allocation), an extension of PRIDE with a flexible allocation strategy. By allowing retreated patients to be assigned to any dose level based on trial needs, PRIDE-FA improves resource efficiency, leading to greater reductions in required sample size and trial duration. This paper incorporates random effects into established dose-finding designs, including the calibration-free odds (CFO) design, the Bayesian optimal interval (BOIN) design, and the continual reassessment method (CRM) to account for intra-patient correlations when each patient may receive multiple doses. Simulation studies demonstrate that PRIDE and PRIDE-FA significantly improve the accuracy of MTD selection, reduce required sample size, and shorten trial duration compared to existing dose-finding methods. Together, PRIDE and PRIDE-FA provide a robust and efficient framework for phase I clinical trials with rare diseases.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70409"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873649/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Sample Size Calculations for External Validation Studies of Risk Prediction Models
Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard D Riley
Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, because previous studies are based on finite samples, our knowledge of true model performance in the target population is uncertain, and choosing fixed values therefore paints an incomplete picture. Moreover, for net benefit (NB) as a measure of clinical utility, the relevance of conventional precision-based inference is doubtful. In this work, we propose a general Bayesian framework for multi-criteria sample size determination for the validation of prediction models for binary outcomes. For statistical metrics of performance (e.g., discrimination and calibration), we propose sample size rules that target a desired expected precision or a desired assurance probability that the precision criteria will be satisfied. For NB, we propose rules based on Optimality Assurance (the probability that the planned study correctly identifies the optimal strategy) and Value of Information (VoI) analysis, which quantifies the expected gain in NB from learning about model performance in a validation study of a given size. We showcase these developments in a case study on the validation of a risk prediction model for deterioration among hospitalized COVID-19 patients. Compared to conventional sample size calculation methods, a Bayesian approach requires explicit quantification of uncertainty around model performance, and thereby enables flexible sample size rules based on expected precision, assurance probabilities, and VoI. In our case study, calculations based on VoI for NB suggest that considerably smaller sample sizes are required than when focusing on the precision of calibration metrics. This approach is implemented in the accompanying software.
{"title":"Bayesian Sample Size Calculations for External Validation Studies of Risk Prediction Models.","authors":"Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard D Riley","doi":"10.1002/sim.70389","DOIUrl":"10.1002/sim.70389","url":null,"abstract":"<p><p>Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, due to the finite samples of previous studies, our knowledge of true model performance in the target population is uncertain, and so choosing fixed values represents an incomplete picture. As well, for net benefit (NB) as a measure of clinical utility, the relevance of conventional precision-based inference is doubtful. In this work, we propose a general Bayesian framework for multi-criteria sample size considerations for prediction models for binary outcomes. For statistical metrics of performance (e.g., discrimination and calibration), we propose sample size rules that target desired expected precision or desired assurance probability that the precision criteria will be satisfied. For NB, we propose rules based on Optimality Assurance (the probability that the planned study correctly identifies the optimal strategy) and Value of Information (VoI) analysis, which quantifies the expected gain in NB by learning about model performance from a validation study of a given size. We showcase these developments in a case study on the validation of a risk prediction model for deterioration among hospitalized COVID-19 patients. Compared to conventional sample size calculation methods, a Bayesian approach requires explicit quantification of uncertainty around model performance, and thereby enables flexible sample size rules based on expected precision, assurance probabilities, and VoI. In our case study, calculations based on VoI for NB suggest considerably lower sample sizes are required than when focusing on the precision of calibration metrics. This approach is implemented in the accompanying software.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70389"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12894519/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146166819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploratory and Confirmatory Empirical Research on Algorithms: Implications for Methodological Practice and Education-A Comment on \"On 'Confirmatory' Methodological Research in Statistics and Related Fields\".","authors":"Ulrich Mansmann","doi":"10.1002/sim.70388","DOIUrl":"10.1002/sim.70388","url":null,"abstract":"","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70388"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146221460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}