The mean survival is the key ingredient of the decision process in several applications, notably in health economic evaluations. It is defined as the area under the complete survival curve, thus necessitating extrapolation of the observed data. This may be achieved in a more stable manner by borrowing long-term evidence from registry and demographic data. In this article, we employ a Bayesian mortality model and transfer its projections to construct the baseline population that acts as an anchor for the survival model. We then propose extrapolation methods based on flexible parametric poly-hazard models, which can naturally accommodate diverse shapes, including non-proportional hazards and crossing survival curves, while typically maintaining a natural interpretation as a data-generating mechanism. We estimate the mean survival and related estimands in three case studies: breast cancer, advanced melanoma, and cardiac arrhythmia. Specifically, we evaluate the survival disadvantage of triple-negative breast cancer cases, the efficacy of combining immunotherapy with mRNA cancer therapeutics for melanoma treatment, and the suitability of implantable cardioverter defibrillators for cardiac arrhythmia. The last case is conducted in a competing risks context, illustrating how working on the cause-specific hazard alone minimizes potential instability. The results suggest that the proposed methods offer a flexible, interpretable, and robust approach when survival extrapolation is required.
{"title":"Stable survival extrapolation using mortality projections.","authors":"Anastasios Apsemidis, Nikolaos Demiris","doi":"10.1093/biomtc/ujaf159","DOIUrl":"10.1093/biomtc/ujaf159","url":null,"abstract":"<p><p>The mean survival is the key ingredient of the decision process in several applications, notably in health economic evaluations. It is defined as the area under the complete survival curve, thus necessitating extrapolation of the observed data. This may be achieved in a more stable manner by borrowing long term evidence from registry and demographic data. In this article, we employ a Bayesian mortality model and transfer its projections in order to construct the baseline population that acts as an anchor of the survival model. We then propose extrapolation methods based on flexible parametric poly-hazard models which can naturally accommodate diverse shapes, including non-proportional hazards and crossing survival curves, while typically maintaining a natural interpretation as a data generating mechanism. We estimate the mean survival and related estimands in 3 cases, namely breast cancer, advanced melanoma, and cardiac arrhythmia. Specifically, we evaluate the survival disadvantage of triple-negative breast cancer cases, the efficacy of combining immunotherapy with mRNA cancer therapeutics for melanoma treatment and the suitability of implantable cardioverter defibrilators for cardiac arrhythmia. The last is conducted in a competing risks context illustrating how working on the cause-specific hazard alone minimizes potential instability. The results suggest that the proposed approach offers a flexible, interpretable, and robust approach when survival extrapolation is required.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145720509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The heterogeneous treatment effect plays a crucial role in precision medicine. There is evidence that real-world data, even when subject to biases, can be employed as supplementary evidence for randomized clinical trials to improve the statistical efficiency of heterogeneous treatment effect estimation. In this paper, for survival data with right censoring, we consider estimating the heterogeneous treatment effect, defined as the difference of the treatment-specific conditional restricted mean survival times given covariates, by synthesizing evidence from randomized clinical trials and real-world data with possible biases. We define an omnibus bias function to characterize the effect of biases caused by unmeasured confounders, censoring, and outcome heterogeneity, and identify it by combining the trial and real-world data. We propose a penalized sieve method to estimate the heterogeneous treatment effect and the bias function, and we study the theoretical properties of the proposed integrative estimators using the theory of reproducing kernel Hilbert spaces and empirical processes. Simulation studies and an integrative analysis of data from a randomized trial and a real-world registry on early-stage non-small-cell lung cancer show that the proposed methodology outperforms the approach based solely on the trial data.
{"title":"Statistical inference for heterogeneous treatment effect with right-censored data from synthesizing randomized clinical trials and real-world data.","authors":"Guangcai Mao, Shu Yang, Xiaofei Wang","doi":"10.1093/biomtc/ujaf131","DOIUrl":"10.1093/biomtc/ujaf131","url":null,"abstract":"<p><p>The heterogeneous treatment effect plays a crucial role in precision medicine. There is evidence that real-world data, even subject to biases, can be employed as supplementary evidence for randomized clinical trials to improve the statistical efficiency of the heterogeneous treatment effect estimation. In this paper, for survival data with right censoring, we consider estimating the heterogeneous treatment effect, defined as the difference of the treatment-specific conditional restricted mean survival times given covariates, by synthesizing evidence from randomized clinical trials and the real-world data with possible biases. We define an omnibus bias function to characterize the effect of biases caused by unmeasured confounders, censoring, and outcome heterogeneity, and further, identify it by combining the trial and real-world data. We propose a penalized sieve method to estimate the heterogeneous treatment effect and the bias function. We further study the theoretical properties of the proposed integrative estimators based on the theory of reproducing kernel Hilbert space and empirical process. The proposed methodology outperforms the approach solely based on the trial data through simulation studies and an integrative analysis of the data from a randomized trial and a real-world registry on early-stage non-small-cell lung cancer.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12505326/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145249501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ordinary differential equations (ODEs) are widely used for modeling the dynamics of complex systems across various scientific areas. To identify the structure of high-dimensional sparse ODEs from noisy time-course data, most existing methods adopt a frequentist perspective, while uncertainty quantification in parameter estimation remains challenging. Under an additive ODE model assumption, we present a Bayesian hierarchical collocation method to provide better quantification of uncertainty. Our framework unifies the likelihood, integrated ODE constraints, and a group-wise sparse penalty, allowing for simultaneous system identification and trajectory estimation. We demonstrate the favorable performance of the proposed method through simulation studies, in which the recovered system trajectories and estimated additive components are compared with those of other recent methods. A real data example of gene regulatory networks illustrates the methodology.
{"title":"A Bayesian collocation integral method for system identification of ordinary differential equations.","authors":"Mingwei Xu, Samuel W K Wong, Peijun Sang","doi":"10.1093/biomtc/ujaf141","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf141","url":null,"abstract":"<p><p>Ordinary differential equations (ODEs) are widely considered for modeling the dynamics of complex systems across various scientific areas. To identify the structure of high-dimensional sparse ODEs from noisy time-course data, most existing methods adopt a frequentist perspective, while uncertainty quantification in parameter estimation remains challenging. Under an additive ODE model assumption, we present a Bayesian hierarchical collocation method to provide better quantification of uncertainty. Our framework unifies the likelihood, integrated ODE constraints and a group-wise sparse penalty, allowing for simultaneous system identification and trajectory estimation. We demonstrate the favorable performance of the proposed method through simulation studies, where the recovered system trajectories and estimated additive components are compared with other recent methods. A real data example of gene regulatory networks is provided to illustrate the methodology.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145372174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online health communities (OHCs) provide a platform where patients and those close to them can share and communicate, making complex medical information more digestible and actionable. Health communication within OHCs can be impacted by other information sources. This study examines cross-platform health communication by mining Breastcancer.org (the largest online breast cancer community) and Twitter (now X). Early analyses of OHCs, Twitter, and other online platforms often adopted simple measures such as word frequency, and more recent research has shifted toward word co-occurrence network analysis. In comparison, analysis of cross-platform communication remains limited, and the techniques adopted to date have drawbacks. We propose a new cross-platform communication model that jointly analyzes word co-occurrence networks and word frequency vectors, where the former describe the structural contents of health communication and the latter describe its volumes. This model offers a nuanced perspective, accommodates temporal variations, and is examined for its theoretical and numerical properties. Collected from January 2010 to December 2020, the analyzed data contain over 1 395 000 tweets and 517 000 posts. Our analysis suggests that Twitter's breast cancer topics significantly impact the contents and volumes of communication in the OHC. Distinct time phases are observed, with notable peaks during 2012-2013 and 2015-2018. This study provides an avenue for better understanding health communication and offers new insights into two highly important online platforms.
{"title":"Analysis of cross-platform health communication with a network approach.","authors":"Xinyan Fan, Mengque Liu, Shuangge Ma","doi":"10.1093/biomtc/ujaf154","DOIUrl":"10.1093/biomtc/ujaf154","url":null,"abstract":"<p><p>Online health communities (OHCs) provide a platform for patients and those related to share and communicate, making complex medical information more digestible and actionable. Health communication within OHCs can be impacted by other information sources. This study examines cross-platform health communication by mining Breastcancer.org (the largest online breast cancer community) and Twitter (now X). Early analyses of OHCs, Twitter, and other online platforms often adopt simple measures like word frequency, and more recent research has shifted towards word co-occurrence network analysis. Relatively, cross-platform communication analysis is limited, and the adopted techniques have drawbacks. We propose a new cross-platform communication model that collectively analyzes word co-occurrence networks and word frequency vectors. Here, the former describe the structural contents of health communication, and the latter describe the volumes. This model offers a nuanced perspective, accommodates temporal variations, and is examined for its theoretical and numerical properties. Collected from January 2010 to December 2020, the analyzed data contains over 1 395 000 tweets and 517 000 posts. Our analysis suggests that the Twitter's topics on breast cancer significantly impact the contents and volumes in the OHC. Distinct time phases are observed, with notable peaks during 2012-2013 and 2015-2018. This study can provide a venue for better understanding health communication and new insights into two highly important online platforms.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hidden Markov models (HMMs) are widely used to characterize latent state transition patterns in substance use. However, traditional HMM frameworks are inadequate for handling the complexities introduced by high-dimensional risk factors and varying time intervals, particularly in determining the number of hidden states and selecting variables for the state transition parameters. To tackle the analytical challenges in the Population Assessment of Tobacco and Health (PATH) Study, a nationally representative longitudinal cohort study on tobacco use, we propose a continuous-time HMM framework with a regularization algorithm to identify multi-dimensional risk factors underlying complex poly-tobacco use transitions. We develop an elastic-net regularization on the transition covariates to identify informative covariates and improve model estimation accuracy. The inclusion of key covariates enables accurate determination of the number of hidden states. We incorporate survey weights and information on strata and clustering throughout the modeling framework. We demonstrate the validity of our approach in determining the number of states, identifying informative covariates, and estimating model parameters through a series of simulations. Application of the proposed approach to the PATH data revealed several demographic, behavioral, and psychosocial factors that contribute to the differential risks of transition between tobacco-use states among youth and young adults. The model's capacity to identify high-dimensional risk factors for underlying hidden variables substantiates its potential for enhancing public health research and informing interventions.
{"title":"A regularized continuous-time hidden Markov model for identifying latent state transition patterns of poly-tobacco use.","authors":"Xinyu Yan, Ji-Hyun Lee, Xiang-Yang Lou","doi":"10.1093/biomtc/ujaf138","DOIUrl":"https://doi.org/10.1093/biomtc/ujaf138","url":null,"abstract":"<p><p>Hidden Markov models (HMMs) are widely used to characterize latent state transition patterns in substance use. However, traditional HMM frameworks are incompetent when dealing with the complexities introduced by high-dimensional risk factors and varying time intervals, particularly in determining the number of hidden states and selecting variables for state transition parameters. To tackle the analytical challenges in the Population Assessment of Tobacco and Health (PATH) Study, a nationally representative longitudinal cohort study on tobacco use, we propose a continuous-time HMM framework with a regularization algorithm to identify multi-dimensional risk factors underlying complex poly-tobacco use transitions. We develop an elastic-net regularization on the transition covariates to identify informative covariates and improve model estimation accuracy. The inclusion of key covariates enables accurate determination of the number of hidden states. We incorporate survey weights and information on strata and clustering throughout the modeling framework. We demonstrate the validity of our approach in determining state numbers, identifying informative covariates, and estimating model parameters through a series of simulations. Application of the proposed approach to PATH data analysis revealed several demographic, behavioral, and psychosocial factors that contribute to the differential risks of transition between tobacco-use states among youth and young adults. The model's capacity in identifying high-dimensional risk factors for underlying hidden variables substantiates its potential for enhancing public health research and informing interventions.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145298373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional independence is a foundational concept for understanding probabilistic relationships among variables, with broad applications in fields such as causal inference and machine learning. This study focuses on testing conditional independence, $T \perp X \mid Z$, where $T$ represents survival data possibly subject to right censoring, $Z$ represents established risk factors for $T$, and $X$ represents potential novel biomarkers. The goal is to identify novel biomarkers that offer additional value for risk assessment and prediction. This can be achieved by using either the partial or the parametric likelihood ratio statistic to evaluate whether the coefficient vector of $X$ in the conditional model of $T$ given $(X^{\top}, Z^{\top})^{\top}$ is equal to zero. Traditional tests that directly compare likelihood ratios to chi-squared distributions may produce erroneous type-I error rates under model misspecification. As an alternative, we propose a resampling-based method to approximate the distribution of the likelihood ratios. A key advantage of the proposed test is its double robustness: it achieves approximately correct type-I error rates when either the conditional outcome model or the working model of $\mathrm{pr}(X \mid Z)$ is correctly specified. Additionally, machine learning techniques can be incorporated to improve test performance. Simulation studies and an application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) data demonstrate the finite-sample performance of the proposed tests.
{"title":"Double robust conditional independence test for novel biomarkers given established risk factors with survival data.","authors":"Baoying Yang, Jing Qin, Jing Ning, Yukun Liu","doi":"10.1093/biomtc/ujaf133","DOIUrl":"10.1093/biomtc/ujaf133","url":null,"abstract":"<p><p>Conditional independence is a foundational concept for understanding probabilistic relationships among variables, with broad applications in fields such as causal inference and machine learning. This study focuses on testing conditional independence, $Tperp X|Z$, where T represents survival data possibly subject to right censoring, Z represents established risk factors for T, and X represents potential novel biomarkers. The goal is to identify novel biomarkers that offer additional merits for further risk assessment and prediction. This can be achieved by using either the partial or parametric likelihood ratio statistic to evaluate whether the coefficient vector of X in the conditional model of T given $(X^{ mathrm{scriptscriptstyle top } }, Z^{ mathrm{scriptscriptstyle top } })^{ mathrm{scriptscriptstyle top } }$ is equal to zero. Traditional tests such as directly comparing likelihood ratios to chi-squared distributions may produce erroneous type-I error rates under model misspecification. As an alternative, we propose a resampling-based method to approximate the distribution of the likelihood ratios. A key advantage of the proposed test is its double robustness: it achieves approximately correct type-I error rates when either the conditional outcome model or the working model of ${rm pr} (X|Z)$ is correctly specified. Additionally, machine learning techniques can be incorporated to improve test performance. Simulation studies and the application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) data demonstrate the finite-sample performance of the proposed tests.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145336170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Motivated by a malaria vaccine efficacy trial, this paper investigates generalized nonparametric temporal models of intensity processes with multiple time scales. Through the choice of link functions, the proposed models encompass a wide range of models, such as the multiplicative temporal intensity model and the additive temporal intensity model. A maximum likelihood estimation procedure is developed to estimate the effects of two time scales via local linear smoothing with double kernels. Computational algorithms are developed to facilitate applications of the proposed method, and an adaptive algorithm is developed to overcome the challenges of overlapping covariates. A cross-validation bandwidth selection procedure based on a log-likelihood criterion is discussed. The asymptotic properties of the proposed estimators are investigated. Our simulation study shows that the proposed methods have satisfactory finite-sample performance for both the multiplicative and additive temporal intensity models. The proposed methods are applied to the MAL-094/MAL-095 malaria vaccine efficacy trial data to investigate how the risk of new malaria infection changes over time and how a prior infection or vaccination changes the future infection risk. The proposed method provides new insight into the protective effects of the malaria vaccine against new malaria infections and into how vaccine efficacy is modified by the history of prior malaria infection over time.
{"title":"Generalized nonparametric temporal modeling of recurrent events with application to a malaria vaccine trial.","authors":"Fei Heng, Yanqing Sun, Jing Xu, Peter B Gilbert","doi":"10.1093/biomtc/ujaf146","DOIUrl":"10.1093/biomtc/ujaf146","url":null,"abstract":"<p><p>Motivated by a malaria vaccine efficacy trial, this paper investigates generalized nonparametric temporal models of intensity processes with multiple time scales. Through the choice of link functions, the proposed models encompass a wide range of models such as the multiplicative temporal intensity model and the additive temporal intensity model. A maximum likelihood estimation procedure is developed to estimate the effects of two time-scales via the local linear smoothing with double kernels. Computational algorithms are developed to facilitate applications of the proposed method. An adaptive algorithm is developed to overcome the challenges of overlapping covariates. A cross-validation bandwidth selection procedure based on the logarithm of likelihood criteria is discussed. The asymptotic properties of the proposed estimators are investigated. Our simulation study shows that the proposed methods have satisfactory finite sample performance for both the multiplicative temporal intensity model and additive temporal intensity model. The proposed methods are applied to analyze the MAL-094/MAL-095 malaria vaccine efficacy trial data to investigate how the new malaria infection risk changes over time and how a prior infection or vaccination changes the future infection risk. The proposed method provides new insight into the protective effects of the malaria vaccine against new malaria infections and how the vaccine efficacy is modified by the history of prior malaria infection over time.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12635532/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145562433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although the Cox proportional hazards (PH) model is well established and extensively used in the analysis of survival data, the PH assumption may not always hold in practical scenarios. The class of semiparametric transformation models extends the Cox model and also includes many other survival models as special cases. This paper introduces a deep partially linear transformation model as a general and flexible regression framework for right-censored data. The proposed method is capable of avoiding the curse of dimensionality while still retaining the interpretability of some covariates of interest. We derive the overall convergence rate of the maximum likelihood estimators, the minimax lower bound of the nonparametric deep neural network estimator, and the asymptotic normality and the semiparametric efficiency of the parametric estimator. Comprehensive simulation studies demonstrate the impressive performance of the proposed estimation procedure in terms of both the estimation accuracy and the predictive power, which is further validated by an application to a real-world dataset.
{"title":"Deep partially linear transformation model for right-censored survival data.","authors":"Junkai Yin, Yue Zhang, Zhangsheng Yu","doi":"10.1093/biomtc/ujaf126","DOIUrl":"10.1093/biomtc/ujaf126","url":null,"abstract":"<p><p>Although the Cox proportional hazards (PH) model is well established and extensively used in the analysis of survival data, the PH assumption may not always hold in practical scenarios. The class of semiparametric transformation models extends the Cox model and also includes many other survival models as special cases. This paper introduces a deep partially linear transformation model as a general and flexible regression framework for right-censored data. The proposed method is capable of avoiding the curse of dimensionality while still retaining the interpretability of some covariates of interest. We derive the overall convergence rate of the maximum likelihood estimators, the minimax lower bound of the nonparametric deep neural network estimator, and the asymptotic normality and the semiparametric efficiency of the parametric estimator. Comprehensive simulation studies demonstrate the impressive performance of the proposed estimation procedure in terms of both the estimation accuracy and the predictive power, which is further validated by an application to a real-world dataset.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The US Food and Drug Administration (FDA) launched Project Optimus to shift the objective of dose selection from the maximum tolerated dose to the optimal biological dose (OBD), optimizing the benefit-risk tradeoff. One approach recommended by the FDA's guidance is to conduct randomized trials comparing multiple doses. In this paper, using the selection design framework, we propose a Randomized Optimal SElection (ROSE) design, which minimizes the sample size while ensuring the probability of correct selection of the OBD at pre-specified accuracy levels. The ROSE design is simple to implement, involving a straightforward comparison of the difference in response rates between two dose arms against a predetermined decision boundary. We further consider a two-stage ROSE design that allows for early selection of the OBD at the interim analysis when there is sufficient evidence, further reducing the sample size. Simulation studies demonstrate that the ROSE design exhibits desirable operating characteristics in correctly identifying the OBD: a sample size of 15-40 patients per dose arm typically yields a probability of correct selection of the OBD ranging from 60% to 70%.
{"title":"Randomized optimal selection design for dose optimization.","authors":"Shuqi Wang, Ying Yuan, Suyu Liu","doi":"10.1093/biomtc/ujaf124","DOIUrl":"10.1093/biomtc/ujaf124","url":null,"abstract":"<p><p>The US Food and Drug Administration (FDA) launched Project Optimus to shift the objective of dose selection from the maximum tolerated dose to the optimal biological dose (OBD), optimizing the benefit-risk tradeoff. One approach recommended by the FDA's guidance is to conduct randomized trials comparing multiple doses. In this paper, using the selection design framework, we propose a Randomized Optimal SElection (ROSE) design, which minimizes sample size while ensuring the probability of correct selection of the OBD at pre-specified accuracy levels. The ROSE design is simple to implement, involving a straightforward comparison of the difference in response rates between two dose arms against a predetermined decision boundary. We further consider a two-stage ROSE design that allows for early selection of the OBD at the interim when there is sufficient evidence, further reducing the sample size. Simulation studies demonstrate that the ROSE design exhibits desirable operating characteristics in correctly identifying the OBD. A sample size of 15-40 patients per dosage arm typically results in a percentage of correct selection of the optimal dose ranging from 60% to 70%.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":"81 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12505323/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145249541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}