Are pragmatism and ethical protections in clinical trials a zero-sum game?
Pub Date : 2024-10-15 DOI: 10.1177/17407745241284798
Hayden P Nix, Charles Weijer, Monica Taljaard
Background: Randomized controlled trials with pragmatic intent aim to generate evidence that directly informs clinical decisions. Some have argued that the ethical protection of informed consent can be in tension with the goals of pragmatism. But the impact of other ethical protections on trial pragmatism has yet to be explored.
Purpose: In this article, we analyze the relationship between additional ethical protections for vulnerable participants and the degree of pragmatism within the PRagmatic Explanatory Continuum Indicator Summary-2 (PRECIS-2) domains of trial design.
Methods: We analyze three example trials with pragmatic intent that include vulnerable participants.
Conclusion: The relationship between ethical protections and trial pragmatism is complex. In some cases, additional ethical protections for vulnerable participants can promote the pragmatism of some of the PRECIS-2 domains of trial design. When designing trials with pragmatic intent, researchers ought to look for opportunities wherein ethical protections enhance the degree of pragmatism.
{"title":"Are pragmatism and ethical protections in clinical trials a zero-sum game?","authors":"Hayden P Nix, Charles Weijer, Monica Taljaard","doi":"10.1177/17407745241284798","DOIUrl":"10.1177/17407745241284798","url":null,"abstract":"<p><strong>Background: </strong>Randomized controlled trials with pragmatic intent aim to generate evidence that directly informs clinical decisions. Some have argued that the ethical protection of informed consent can be in tension with the goals of pragmatism. But the impact of other ethical protections on trial pragmatism has yet to be explored.</p><p><strong>Purpose: </strong>In this article, we analyze the relationship between additional ethical protections for vulnerable participants and the degree of pragmatism within the PRagmatic Explanatory Continuum Indicator Summary-2 (PRECIS-2) domains of trial design.</p><p><strong>Methods: </strong>We analyze three example trials with pragmatic intent that include vulnerable participants.</p><p><strong>Conclusion: </strong>The relationship between ethical protections and trial pragmatism is complex. In some cases, additional ethical protections for vulnerable participants can promote the pragmatism of some of the PRECIS-2 domains of trial design. When designing trials with pragmatic intent, researchers ought to look for opportunities wherein ethical protections enhance the degree of pragmatism.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"17407745241284798"},"PeriodicalIF":2.2,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142459901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using non-inferiority test of proportions in design of randomized non-inferiority trials with time-to-event endpoint with a focus on low-event-rate setting.
Pub Date : 2024-10-12 DOI: 10.1177/17407745241284786
Lingyun Ji, Todd A Alonzo
Background/aims: For cancers with low incidence, low event rates, and a time-to-event endpoint, a randomized non-inferiority trial based on the logrank test can require a large sample size with significantly prolonged enrollment duration, making such a non-inferiority trial infeasible. This article evaluates a design based on a non-inferiority test of proportions, compares its required sample size to that of the non-inferiority logrank test, assesses whether there are scenarios in which a non-inferiority test of proportions can be more efficient, and provides guidelines for the use of a non-inferiority test of proportions.
Methods: This article describes the sample size calculation for a randomized non-inferiority trial based on a non-inferiority logrank test or a non-inferiority test of proportions. The sample sizes required by the two design methods are compared across a wide range of scenarios, varying the underlying Weibull survival functions, the non-inferiority margin, and the loss to follow-up rate.
Results: Our results show that there are scenarios in which the non-inferiority test of proportions requires a substantially smaller sample size. Specifically, the non-inferiority test of proportions can be considered for cancers with more than an 80% long-term survival rate. We provide guidance on the choice of this design approach based on the parameters of the Weibull survival functions, the non-inferiority margin, and the loss to follow-up rate.
Conclusion: For cancers with low incidence and low event rates, a non-inferiority trial based on the logrank test is not feasible due to its large required sample size and prolonged enrollment duration. The use of a non-inferiority test of proportions can make a randomized non-inferiority Phase III trial feasible.
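As a rough illustration of the sample-size arithmetic involved, the sketch below computes a per-arm sample size for a non-inferiority test of proportions, with the landmark survival probability taken from an assumed Weibull curve. This is a generic normal-approximation calculation with illustrative parameter values, not the authors' code:

```r
# Sketch: per-arm sample size for a non-inferiority test of two proportions,
# where the "proportion" is the probability of surviving past a landmark time
# t0 under an assumed Weibull curve S(t) = exp(-(t/scale)^shape).
# All parameter values below are illustrative assumptions.

weibull_surv <- function(t, shape, scale) exp(-(t / scale)^shape)

ni_n_per_arm <- function(p_ctrl, p_exp, margin, alpha = 0.025, power = 0.9) {
  # H0: p_exp <= p_ctrl - margin; designed under H1: p_exp = p_ctrl
  z <- qnorm(1 - alpha) + qnorm(power)
  num <- p_ctrl * (1 - p_ctrl) + p_exp * (1 - p_exp)
  den <- (p_exp - p_ctrl + margin)^2
  ceiling(z^2 * num / den)
}

# Example: ~90% 3-year survival in both arms, 5% absolute NI margin
p0 <- weibull_surv(3, shape = 1.2, scale = 20)   # about 0.90
ni_n_per_arm(p_ctrl = p0, p_exp = p0, margin = 0.05)
# Loss to follow-up can be accommodated crudely by dividing n by (1 - dropout rate).
```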
{"title":"Using non-inferiority test of proportions in design of randomized non-inferiority trials with time-to-event endpoint with a focus on low-event-rate setting.","authors":"Lingyun Ji, Todd A Alonzo","doi":"10.1177/17407745241284786","DOIUrl":"10.1177/17407745241284786","url":null,"abstract":"<p><strong>Background/aims: </strong>For cancers with low incidence, low event rates, and a time-to-event endpoint, a randomized non-inferiority trial designed based on the logrank test can require a large sample size with significantly prolonged enrollment duration, making such a non-inferiority trial not feasible. This article evaluates a design based on a non-inferiority test of proportions, compares its required sample size to the non-inferiority logrank test, assesses whether there are scenarios for which a non-inferiority test of proportions can be more efficient, and provides guidelines in usage of a non-inferiority test of proportions.</p><p><strong>Methods: </strong>This article describes the sample size calculation for a randomized non-inferiority trial based on a non-inferiority logrank test or a non-inferiority test of proportions. The sample size required by the two design methods are compared for a wide range of scenarios, varying the underlying Weibull survival functions, the non-inferiority margin, and loss to follow-up rate.</p><p><strong>Results: </strong>Our results showed that there are scenarios for which the non-inferiority test of proportions can have significantly reduced sample size. Specifically, the non-inferiority test of proportions can be considered for cancers with more than 80% long-term survival rate. We provide guidance in choice of this design approach based on parameters of the Weibull survival functions, the non-inferiority margin, and loss to follow-up rate.</p><p><strong>Conclusion: </strong>For cancers with low incidence and low event rates, a non-inferiority trial based on the logrank test is not feasible due to its large required sample size and prolonged enrollment duration. The use of a non-inferiority test of proportions can make a randomized non-inferiority Phase III trial feasible.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"17407745241284786"},"PeriodicalIF":2.2,"publicationDate":"2024-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142459904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Composite endpoints in COVID-19 randomized controlled trials: a systematic review.
Pub Date : 2024-10-10 DOI: 10.1177/17407745241276130
Pedro Nascimento Martins, Mateus Henrique Toledo Lourenço, Gabriel Paz Souza Mota, Alexandre Biasi Cavalcanti, Ana Carolina Peçanha Antonio, Fredi Alexander Diaz-Quijano
Background/aims: This study aimed to determine the prevalence of ordinal, binary, and numerical composite endpoints among coronavirus disease 2019 trials and the potential bias attributable to their use.
Methods: We systematically reviewed the Cochrane COVID-19 Study Register to assess the prevalence, characteristics, and bias associated with using composite endpoints in coronavirus disease 2019 randomized clinical trials. We compared the effect measure (relative risk) of composite outcomes and that of its most critical component (i.e. death) by estimating the Bias Attributable to Composite Outcomes index [ln(relative risk for the composite outcome)/ln(relative risk for death)].
Results: Composite endpoints accounted for 152 out of 417 primary endpoints in coronavirus disease 2019 randomized trials, being more frequent among studies published in high-impact journals. Ordinal endpoints were the most common (54% of all composites), followed by binary or time-to-event (34%), numerical (11%), and hierarchical (1%). Composites predominated among trials enrolling patients with severe disease when compared to trials with a mild or moderate case mix (odds ratio = 1.72). Adaptations of the seven-point World Health Organization scale occurred in 40% of the ordinal primary endpoints, which frequently underwent dichotomization for the statistical analyses. Mortality accounted for a median of 24% (interquartile range: 6%-48%) of all events when included in the composite. The median point estimate of the Bias Attributable to Composite Outcomes index was 0.3 (interquartile range: -0.1 to 0.7), being significantly lower than 1 in 5 of 24 comparisons.
Discussion: Composite endpoints were used in a substantial proportion of coronavirus disease 2019 trials, especially those involving severely ill patients. This is likely due to the higher anticipated rates of competing events, such as death, in such studies. Ordinal composites were common but were often dichotomized for analysis, forgoing the potential gains in information and statistical efficiency. For studies with binary composites, death was the most frequent component, and, unexpectedly, composite outcome estimates were often closer to the null than those for mortality. Numerical composites were less common, and only two trials used hierarchical endpoints. These newer approaches may offer advantages over traditional binary and ordinal composites; however, their potential benefits warrant further scrutiny.
Conclusion: Composite endpoints accounted for more than a third of coronavirus disease 2019 trials' primary endpoints; their use was more common among studies that included patients with severe disease, and their point effect estimates tended to underestimate those for mortality.
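The Bias Attributable to Composite Outcomes index is straightforward to compute once the two relative risks are in hand; the sketch below uses hypothetical counts purely to show the arithmetic:

```r
# Sketch of the BACO index with hypothetical 2x2 counts.
# BACO = ln(RR for the composite outcome) / ln(RR for death).
# Values near 1 suggest the composite tracks the mortality effect;
# values below 1 suggest the composite estimate is closer to the null.

rr <- function(events_trt, n_trt, events_ctl, n_ctl) {
  (events_trt / n_trt) / (events_ctl / n_ctl)
}

rr_composite <- rr(events_trt = 60, n_trt = 500, events_ctl = 75, n_ctl = 500)
rr_death     <- rr(events_trt = 20, n_trt = 500, events_ctl = 35, n_ctl = 500)

baco <- log(rr_composite) / log(rr_death)
round(c(RR_composite = rr_composite, RR_death = rr_death, BACO = baco), 2)
# Here RR_composite = 0.80, RR_death = 0.57, BACO = 0.40: the composite
# effect estimate understates the mortality effect.
```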
{"title":"Composite endpoints in COVID-19 randomized controlled trials: a systematic review.","authors":"Pedro Nascimento Martins, Mateus Henrique Toledo Lourenço, Gabriel Paz Souza Mota, Alexandre Biasi Cavalcanti, Ana Carolina Peçanha Antonio, Fredi Alexander Diaz-Quijano","doi":"10.1177/17407745241276130","DOIUrl":"https://doi.org/10.1177/17407745241276130","url":null,"abstract":"<p><strong>Background/aims: </strong>This study aimed to determine the prevalence of ordinal, binary, and numerical composite endpoints among coronavirus disease 2019 trials and the potential bias attributable to their use.</p><p><strong>Methods: </strong>We systematically reviewed the Cochrane COVID-19 Study Register to assess the prevalence, characteristics, and bias associated with using composite endpoints in coronavirus disease 2019 randomized clinical trials. We compared the effect measure (relative risk) of composite outcomes and that of its most critical component (i.e. death) by estimating the Bias Attributable to Composite Outcomes index [ln(relative risk for the composite outcome)/ln(relative risk for death)].</p><p><strong>Results: </strong>Composite endpoints accounted for 152 out of 417 primary endpoints in coronavirus disease 2019 randomized trials, being more frequent among studies published in high-impact journals. Ordinal endpoints were the most common (54% of all composites), followed by binary or time-to-event (34%), numerical (11%), and hierarchical (1%). Composites predominated among trials enrolling patients with severe disease when compared to trials with a mild or moderate case mix (odds ratio = 1.72). Adaptations of the seven-point World Health Organization scale occurred in 40% of the ordinal primary endpoints, which frequently underwent dichotomization for the statistical analyses. Mortality accounted for a median of 24% (interquartile range: 6%-48%) of all events when included in the composite. The median point estimate of the Bias Attributable to Composite Outcomes index was 0.3 (interquartile range: -0.1 to 0.7), being significantly lower than 1 in 5 of 24 comparisons.</p><p><strong>Discussion: </strong>Composite endpoints were used in a significant proportion of coronavirus disease 2019 trials, especially those involving severely ill patients. This is likely due to the higher anticipated rates of competing events, such as death, in such studies. Ordinal composites were common but often not fully appreciated, reducing the potential gains in information and statistical efficiency. For studies with binary composites, death was the most frequent component, and, unexpectedly, composite outcome estimates were often closer to the null when compared to those for mortality death. Numerical composites were less common, and only two trials used hierarchical endpoints. 
These newer approaches may offer advantages over traditional binary and ordinal composites; however, their potential benefits warrant further scrutiny.</p><p><strong>Conclusion: </strong>Composite endpoints accounted for more than a third of coronavirus disease 2019 trials' primary endpoints; their use was more common among studies that included patients with severe disease and their point effect estimates tended to underestimate those for mortality.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"17407745241276130"},"PeriodicalIF":2.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142399650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A review of current practice in the design and analysis of extremely small stepped-wedge cluster randomized trials.
Pub Date : 2024-10-08 DOI: 10.1177/17407745241276137
Guangyu Tong, Pascale Nevins, Mary Ryan, Kendra Davis-Plourde, Yongdong Ouyang, Jules Antoine Pereira Macedo, Can Meng, Xueqi Wang, Agnès Caille, Fan Li, Monica Taljaard
Background/aims: Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justification for small numbers of clusters is often unclear and appropriate analysis is often lacking. In addition, stepped-wedge cluster randomized trials are methodologically more complex due to their longitudinal correlation structure, and ignoring the distinct within- and between-period intracluster correlations can underestimate the sample size in small stepped-wedge cluster randomized trials. We conducted a review of published small stepped-wedge cluster randomized trials to understand how and why they are used, and to characterize approaches used in their design and analysis.
Methods: Electronic searches were used to identify primary reports of full-scale stepped-wedge cluster randomized trials published during the period 2016-2022; the subset that randomized two to six clusters was identified. Two reviewers independently extracted information from each report and any available protocol. Disagreements were resolved through discussion.
Results: We identified 61 stepped-wedge cluster randomized trials that randomized two to six clusters: median sample size (Q1-Q3) 1426 (420-7553) participants. Twelve (19.7%) gave some indication that the evaluation was considered a "preliminary" evaluation and 16 (26.2%) recognized the small number of clusters as a limitation. Sixteen (26.2%) provided an explanation for the limited number of clusters: the need to minimize contamination (e.g. by merging adjacent units), limited availability of clusters, and logistical considerations were common explanations. The majority (51, 83.6%) presented sample size or power calculations, but only one assumed distinct within- and between-period intracluster correlations. Few (10, 16.4%) utilized restricted randomization methods; more than half (34, 55.7%) identified baseline imbalances. The most common statistical method for analysis was the generalized linear mixed model (44, 72.1%). Only four trials (6.6%) reported statistical analyses considering small numbers of clusters: one used generalized estimating equations with small-sample correction, two used generalized linear mixed models with small-sample correction, and one used Bayesian analysis. Another eight (13.1%) used fixed-effects regression, the performance of which requires further evaluation in stepped-wedge cluster randomized trials with small numbers of clusters. None used permutation tests or cluster-period-level analysis.
Conclusion: Methods appropriate for the design and analysis of small stepped-wedge cluster randomized trials have not been widely adopted in practice. Greater awareness
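The distinct within- and between-period intracluster correlations discussed above are commonly represented by adding a cluster-period random effect alongside the usual cluster random effect. The sketch below simulates a small stepped-wedge trial under that model and fits it with lme4; all parameter values are assumptions chosen for illustration:

```r
# Sketch: a stepped-wedge dataset with separate cluster and cluster-period
# random effects, so the within-period ICC exceeds the between-period ICC,
# fitted with a linear mixed model. Variance components are assumed values.
library(lme4)

set.seed(1)
I <- 6; nper <- 4; m <- 25                       # clusters, periods, subjects per cluster-period
d <- expand.grid(cluster = 1:I, period = 1:nper)
# 3 steps of 2 clusters each, crossing over at periods 2, 3, 4
d$treat <- as.numeric(d$period >= ceiling(d$cluster / 2) + 1)
d <- d[rep(seq_len(nrow(d)), each = m), ]

u  <- rnorm(I, 0, 0.4)          # cluster random effects
v  <- rnorm(I * nper, 0, 0.3)   # cluster-period random effects
cp <- (d$cluster - 1) * nper + d$period
d$y <- 0.5 + 0.3 * d$treat + 0.1 * d$period +    # treatment effect 0.3, secular trend
       u[d$cluster] + v[cp] + rnorm(nrow(d), 0, 1)

d$cluster <- factor(d$cluster); d$period <- factor(d$period)
fit <- lmer(y ~ treat + period + (1 | cluster) + (1 | cluster:period), data = d)
summary(fit)

# Implied ICCs under these assumed variances:
# within-period  ICC = (0.4^2 + 0.3^2) / (0.4^2 + 0.3^2 + 1^2) = 0.20
# between-period ICC =  0.4^2          / (0.4^2 + 0.3^2 + 1^2) = 0.128
```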
{"title":"A review of current practice in the design and analysis of extremely small stepped-wedge cluster randomized trials.","authors":"Guangyu Tong, Pascale Nevins, Mary Ryan, Kendra Davis-Plourde, Yongdong Ouyang, Jules Antoine Pereira Macedo, Can Meng, Xueqi Wang, Agnès Caille, Fan Li, Monica Taljaard","doi":"10.1177/17407745241276137","DOIUrl":"10.1177/17407745241276137","url":null,"abstract":"<p><strong>Background/aims: </strong>Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justification for small numbers of clusters is often unclear and appropriate analysis is often lacking. In addition, stepped-wedge cluster randomized trials are methodologically more complex due to their longitudinal correlation structure, and ignoring the distinct within- and between-period intracluster correlations can underestimate the sample size in small stepped-wedge cluster randomized trials. We conducted a review of published small stepped-wedge cluster randomized trials to understand how and why they are used, and to characterize approaches used in their design and analysis.</p><p><strong>Methods: </strong>Electronic searches were used to identify primary reports of full-scale stepped-wedge cluster randomized trials published during the period 2016-2022; the subset that randomized two to six clusters was identified. Two reviewers independently extracted information from each report and any available protocol. Disagreements were resolved through discussion.</p><p><strong>Results: </strong>We identified 61 stepped-wedge cluster randomized trials that randomized two to six clusters: median sample size (Q1-Q3) 1426 (420-7553) participants. Twelve (19.7%) gave some indication that the evaluation was considered a \"preliminary\" evaluation and 16 (26.2%) recognized the small number of clusters as a limitation. Sixteen (26.2%) provided an explanation for the limited number of clusters: the need to minimize contamination (e.g. by merging adjacent units), limited availability of clusters, and logistical considerations were common explanations. Majority (51, 83.6%) presented sample size or power calculations, but only one assumed distinct within- and between-period intracluster correlations. Few (10, 16.4%) utilized restricted randomization methods; more than half (34, 55.7%) identified baseline imbalances. The most common statistical method for analysis was the generalized linear mixed model (44, 72.1%). Only four trials (6.6%) reported statistical analyses considering small numbers of clusters: one used generalized estimating equations with small-sample correction, two used generalized linear mixed model with small-sample correction, and one used Bayesian analysis. Another eight (13.1%) used fixed-effects regression, the performance of which requires further evaluation under stepped-wedge cluster randomized trials with small numbers of clusters. None used permutation tests or cluster-period level analysis.</p><p><strong>Conclusion: </strong>Methods appropriate for the design and analysis of small stepped-wedge cluster randomized trials have not been widely adopted in practice. 
Greater awareness","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"17407745241276137"},"PeriodicalIF":2.2,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
15th Annual University of Pennsylvania conference on statistical issues in clinical trial/advances in time-to-event analyses in clinical trials (afternoon panel discussion).
Pub Date : 2024-10-01 Epub Date: 2024-10-08 DOI: 10.1177/17407745241271939
Ionut Bebu, Rebecca A Betensky, Michael P Fay
{"title":"15th Annual University of Pennsylvania conference on statistical issues in clinical trial/advances in time-to-event analyses in clinical trials (afternoon panel discussion).","authors":"Ionut Bebu, Rebecca A Betensky, Michael P Fay","doi":"10.1177/17407745241271939","DOIUrl":"10.1177/17407745241271939","url":null,"abstract":"","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"612-622"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142388715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of composite time-to-event endpoints in cardiovascular outcome trials.
Pub Date : 2024-10-01 Epub Date: 2024-08-08 DOI: 10.1177/17407745241267999
Rachel Marceau West, Gregory Golm, Devan V Mehrotra
Composite time-to-event endpoints are commonly used in cardiovascular outcome trials. For example, the IMPROVE-IT trial comparing ezetimibe+simvastatin to placebo+simvastatin in 18,144 patients with acute coronary syndrome used a primary composite endpoint with five component outcomes: (1) cardiovascular death, (2) non-fatal stroke, (3) non-fatal myocardial infarction, (4) coronary revascularization ≥30 days after randomization, and (5) unstable angina requiring hospitalization. In such settings, the traditional analysis compares treatments using the observed time to the occurrence of the first (i.e. earliest) component outcome for each patient. This approach ignores information for subsequent outcome(s), possibly leading to reduced power to demonstrate the benefit of the test versus the control treatment. We use real data examples and simulations to contrast the traditional approach with several alternative approaches that use data for all the intra-patient component outcomes, not just the first.
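One widely used alternative that uses all intra-patient events is the Wei-Lin-Weissfeld marginal model. The sketch below contrasts it with the traditional first-event analysis, using the bladder cancer recurrence data shipped with R's survival package as a stand-in; it illustrates the general idea rather than the specific analyses in the article:

```r
# Sketch: time-to-first-event analysis vs. a marginal all-events analysis
# (Wei-Lin-Weissfeld), using the survival package's bladder data, which has
# up to four recurrence records per patient in counting-process form.
library(survival)

# Traditional approach: first event per patient only
first <- subset(bladder, enum == 1)
fit_first <- coxph(Surv(stop, event) ~ rx, data = first)

# WLW marginal approach: all events, stratified by event number,
# with a robust (sandwich) standard error for within-patient correlation
fit_wlw <- coxph(Surv(stop, event) ~ rx + strata(enum) + cluster(id),
                 data = bladder)

summary(fit_first)$coefficients
summary(fit_wlw)$coefficients
```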
{"title":"Analysis of composite time-to-event endpoints in cardiovascular outcome trials.","authors":"Rachel Marceau West, Gregory Golm, Devan V Mehrotra","doi":"10.1177/17407745241267999","DOIUrl":"10.1177/17407745241267999","url":null,"abstract":"<p><p>Composite time-to-event endpoints are commonly used in cardiovascular outcome trials. For example, the IMPROVE-IT trial comparing ezetimibe+simvastatin to placebo+simvastatin in 18,144 patients with acute coronary syndrome used a primary composite endpoint with five component outcomes: (1) cardiovascular death, (2) non-fatal stroke, (3) non-fatal myocardial infarction, (4) coronary revascularization ≥30 days after randomization, and (5) unstable angina requiring hospitalization. In such settings, the traditional analysis compares treatments using the observed time to the occurrence of the first (i.e. earliest) component outcome for each patient. This approach ignores information for subsequent outcome(s), possibly leading to reduced power to demonstrate the benefit of the test versus the control treatment. We use real data examples and simulations to contrast the traditional approach with several alternative approaches that use data for all the intra-patient component outcomes, not just the first.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"576-583"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141906134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Is inadequate risk stratification diluting hazard ratio estimates in randomized clinical trials?
Pub Date : 2024-10-01 Epub Date: 2024-02-02 DOI: 10.1177/17407745231222448
Devan V Mehrotra, Rachel Marceau West
In randomized clinical trials, analyses of time-to-event data without risk stratification, or with stratification based on pre-selected factors revealed at the end of the trial to be at most weakly associated with risk, are quite common. We caution that such analyses are likely delivering hazard ratio estimates that unwittingly dilute the evidence of benefit for the test relative to the control treatment. To make our case, first, we use a hypothetical scenario to contrast risk-unstratified and risk-stratified hazard ratios. Thereafter, we draw attention to the previously published 5-step stratified testing and amalgamation routine (5-STAR) approach in which a pre-specified treatment-blinded algorithm is applied to survival times from the trial to partition patients into well-separated risk strata using baseline covariates determined to be jointly strongly prognostic for event risk. After treatment unblinding, a treatment comparison is done within each risk stratum and stratum-level results are averaged for overall inference. For illustration, we use 5-STAR to reanalyze data for the primary and key secondary time-to-event endpoints from three published cardiovascular outcomes trials. The results show that the 5-STAR estimate is typically smaller (i.e. more in favor of the test treatment) than the originally reported (traditional) estimate. This is not surprising because 5-STAR mitigates the presumed dilution bias in the traditional hazard ratio estimate caused by no or inadequate risk stratification, as evidenced by two detailed examples. Pre-selection of stratification factors at the trial design stage to achieve adequate risk stratification for the analysis will often be challenging. In such settings, an objective risk stratification approach such as 5-STAR, which is partly aligned with guidance from the US Food and Drug Administration on covariate-adjustment in clinical trials, is worthy of consideration.
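The final averaging step of such a risk-stratified analysis can be sketched as follows: fit a Cox model within each risk stratum, then combine the stratum-level log hazard ratios by inverse-variance weighting. The strata and data below are simulated stand-ins; in 5-STAR the strata come from a pre-specified, treatment-blinded algorithm, and the exact weighting scheme may differ:

```r
# Sketch: per-stratum Cox fits with inverse-variance averaging of log HRs.
# Risk strata are taken as given here (in 5-STAR they are formed by a
# pre-specified treatment-blinded algorithm from baseline covariates).
library(survival)

set.seed(2)
n <- 900
stratum <- sample(1:3, n, replace = TRUE)               # 3 assumed risk strata
trt  <- rbinom(n, 1, 0.5)
rate <- c(0.02, 0.05, 0.12)[stratum] * exp(-0.3 * trt)  # true HR = exp(-0.3)
time <- rexp(n, rate)
status <- as.numeric(time < 36); time <- pmin(time, 36) # censor at 36 months

per_stratum <- lapply(split(data.frame(time, status, trt), stratum), function(d) {
  f <- coxph(Surv(time, status) ~ trt, data = d)
  c(loghr = unname(coef(f)), var = vcov(f)[1, 1])
})
est <- do.call(rbind, per_stratum)

w <- 1 / est[, "var"]                                   # inverse-variance weights
loghr_avg <- sum(w * est[, "loghr"]) / sum(w)
exp(loghr_avg)                                          # combined hazard ratio
```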
{"title":"Is inadequate risk stratification diluting hazard ratio estimates in randomized clinical trials?","authors":"Devan V Mehrotra, Rachel Marceau West","doi":"10.1177/17407745231222448","DOIUrl":"10.1177/17407745231222448","url":null,"abstract":"<p><p>In randomized clinical trials, analyses of time-to-event data without risk stratification, or with stratification based on pre-selected factors revealed at the end of the trial to be at most weakly associated with risk, are quite common. We caution that such analyses are likely delivering hazard ratio estimates that unwittingly dilute the evidence of benefit for the test relative to the control treatment. To make our case, first, we use a hypothetical scenario to contrast risk-unstratified and risk-stratified hazard ratios. Thereafter, we draw attention to the previously published 5-step stratified testing and amalgamation routine (5-STAR) approach in which a pre-specified treatment-blinded algorithm is applied to survival times from the trial to partition patients into well-separated risk strata using baseline covariates determined to be jointly strongly prognostic for event risk. After treatment unblinding, a treatment comparison is done within each risk stratum and stratum-level results are averaged for overall inference. For illustration, we use 5-STAR to reanalyze data for the primary and key secondary time-to-event endpoints from three published cardiovascular outcomes trials. The results show that the 5-STAR estimate is typically smaller (i.e. more in favor of the test treatment) than the originally reported (traditional) estimate. This is not surprising because 5-STAR mitigates the presumed dilution bias in the traditional hazard ratio estimate caused by no or inadequate risk stratification, as evidenced by two detailed examples. Pre-selection of stratification factors at the trial design stage to achieve adequate risk stratification for the analysis will often be challenging. In such settings, an objective risk stratification approach such as 5-STAR, which is partly aligned with guidance from the US Food and Drug Administration on covariate-adjustment in clinical trials, is worthy of consideration.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"571-575"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139671450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using multistate models with clinical trial data for a deeper understanding of complex disease processes.
Pub Date : 2024-10-01 Epub Date: 2024-08-02 DOI: 10.1177/17407745241267862
Terry M Therneau, Fang-Shu Ou
A clinical trial represents a large commitment from everyone involved and a substantial financial investment; it is therefore wise to make the most of all collected data by learning as much as possible from them. A multistate model is a generalized framework for describing longitudinal events; multistate hazards models can treat multiple intermediate/final clinical endpoints as outcomes and estimate the impact of covariates simultaneously. A proportional hazards model is fitted for each transition; together, the fitted models can be used to calculate absolute risks, that is, the probability of being in a state at a given time, the expected number of visits to a state, and the expected amount of time spent in a state. Three publicly available clinical trial datasets (colon, myeloid, and rhDNase) from the survival package in R were used to showcase the utility of multistate hazards models. In the colon dataset, a very well-known and widely used dataset, we found that the levamisole+fluorouracil treatment extended time in the recurrence-free state more than it extended overall survival, which resulted in less time in the recurrence state, an example of the classic "compression of morbidity." In the myeloid dataset, we found that complete response (CR) is durable: patients who received treatment B had a longer sojourn time in CR than patients who received treatment A, while mutation status did not affect the transition rate to CR but strongly influenced the sojourn time in CR. We also found that more patients in treatment A received transplants without CR, and more patients in treatment B received transplants after CR. In addition, mutation status strongly influenced the CR-to-transplant transition rate. The observations we made on these three datasets would not have been possible without multistate models. We encourage readers to spend more time looking deeper into clinical trial data; they have far more to offer than a simple yes/no answer, if only we, the statisticians, are willing to look.
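A minimal sketch of this workflow for the colon data (states: on study, recurrence, death) uses the survival package's tmerge() to build counting-process intervals, survfit() for Aalen-Johansen probability-in-state curves, and a multi-state coxph() fit (survival version 3.0 or later). Handling of tied recurrence/death times is simplified relative to the package vignettes:

```r
# Sketch: an illness-death multistate setup for the colon data.
library(survival)

dth <- subset(colon, etype == 2)   # one row per subject: death/censoring
rec <- subset(colon, etype == 1)   # one row per subject: recurrence/censoring

base <- dth[, c("id", "rx", "sex", "age")]
mdat <- tmerge(base, dth, id = id, death = event(time, status))
mdat <- tmerge(mdat, rec, id = id, recur = event(time, status))

# State at the end of each interval; a tied recurrence/death is coded as death
mdat$state <- factor(with(mdat, ifelse(death == 1, 2, ifelse(recur == 1, 1, 0))),
                     levels = 0:2, labels = c("censor", "recurrence", "death"))

# Aalen-Johansen probability-in-state curves, by treatment arm
aj <- survfit(Surv(tstart, tstop, state) ~ rx, data = mdat, id = id)
plot(aj, xlab = "Days since randomization", ylab = "Probability in state")

# One proportional hazards model per observed transition
mfit <- coxph(Surv(tstart, tstop, state) ~ rx, data = mdat, id = id)
print(mfit)
```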
{"title":"Using multistate models with clinical trial data for a deeper understanding of complex disease processes.","authors":"Terry M Therneau, Fang-Shu Ou","doi":"10.1177/17407745241267862","DOIUrl":"10.1177/17407745241267862","url":null,"abstract":"<p><p>A clinical trial represents a large commitment from all individuals involved and a huge financial obligation given its high cost; therefore, it is wise to make the most of all collected data by learning as much as possible. A multistate model is a generalized framework to describe longitudinal events; multistate hazards models can treat multiple intermediate/final clinical endpoints as outcomes and estimate the impact of covariates simultaneously. Proportional hazards models are fitted (one per transition), which can be used to calculate the absolute risks, that is, the probability of being in a state at a given time, the expected number of visits to a state, and the expected amount of time spent in a state. Three publicly available clinical trial datasets, colon, myeloid, and rhDNase, in the survival package in R were used to showcase the utility of multistate hazards models. In the colon dataset, a very well-known and well-used dataset, we found that the levamisole+fluorouracil treatment extended time in the recurrence-free state more than it extended overall survival, which resulted in less time in the recurrence state, an example of the classic \"compression of morbidity.\" In the myeloid dataset, we found that complete response (CR) is durable, patients who received treatment B have longer sojourn time in CR than patients who received treatment A, while the mutation status does not impact the transition rate to CR but is highly influential on the sojourn time in CR. We also found that more patients in treatment A received transplants without CR, and more patients in treatment B received transplants after CR. In addition, the mutation status is highly influential on the CR to transplant transition rate. The observations that we made on these three datasets would not be possible without multistate models. We want to encourage readers to spend more time to look deeper into clinical trial data. It has a lot more to offer than a simple yes/no answer if only we, the statisticians, are willing to look for it.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"531-540"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141878507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statistical approaches for component-wise censored composite endpoints.
Pub Date : 2024-10-01 Epub Date: 2024-08-08 DOI: 10.1177/17407745241265628
Anne Eaton
Composite endpoints defined as the time to the earliest of two or more events are often used as primary endpoints in clinical trials. Component-wise censoring arises when different components of the composite endpoint are censored differently. We focus on a composite of death and a non-fatal event where death time is right censored and the non-fatal event time is interval censored because the event can only be detected during study visits. Such data are most often analysed using methods for right censored data, treating the time the non-fatal event was first detected as the time it occurred. This can lead to bias, particularly when the time between assessments is long. We describe several approaches for estimating the event-free survival curve and the effect of treatment on event-free survival via the hazard ratio that are specifically designed to handle component-wise censoring. We apply the methods to a randomized study of breastfeeding versus formula feeding for infants of mothers infected with human immunodeficiency virus.
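The bias from treating the visit at which a non-fatal event is first detected as the event time can be seen in a small simulation: the naive Kaplan-Meier curve built from detection times sits above the curve built from the true (simulated) event times. All parameters below are illustrative:

```r
# Sketch: bias from treating a visit-detected (interval-censored) event
# as if it occurred exactly at the detection visit. True event times are
# simulated; the event can only be "seen" at visits every 6 months.
library(survival)

set.seed(3)
n <- 2000
true_evt  <- rexp(n, rate = 0.15)             # true event time (years)
admin_end <- 3                                # administrative censoring at 3 years
visits    <- seq(0.5, admin_end, by = 0.5)    # assessments every 6 months

detect <- sapply(true_evt, function(t) {
  v <- visits[visits >= t]
  if (length(v)) v[1] else NA                 # first visit at or after the event
})
status   <- !is.na(detect)
obs_time <- ifelse(status, detect, admin_end)

km_naive <- survfit(Surv(obs_time, as.numeric(status)) ~ 1)
km_true  <- survfit(Surv(pmin(true_evt, admin_end),
                         as.numeric(true_evt <= admin_end)) ~ 1)

# The naive curve records events late, so event-free survival is
# overestimated between visits; the gap widens with longer visit spacing.
plot(km_true, conf.int = FALSE, xlab = "Years", ylab = "Event-free probability")
lines(km_naive, conf.int = FALSE, col = 2)
```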
{"title":"Statistical approaches for component-wise censored composite endpoints.","authors":"Anne Eaton","doi":"10.1177/17407745241265628","DOIUrl":"10.1177/17407745241265628","url":null,"abstract":"<p><p>Composite endpoints defined as the time to the earliest of two or more events are often used as primary endpoints in clinical trials. Component-wise censoring arises when different components of the composite endpoint are censored differently. We focus on a composite of death and a non-fatal event where death time is right censored and the non-fatal event time is interval censored because the event can only be detected during study visits. Such data are most often analysed using methods for right censored data, treating the time the non-fatal event was first detected as the time it occurred. This can lead to bias, particularly when the time between assessments is long. We describe several approaches for estimating the event-free survival curve and the effect of treatment on event-free survival via the hazard ratio that are specifically designed to handle component-wise censoring. We apply the methods to a randomized study of breastfeeding versus formula feeding for infants of mothers infected with human immunodeficiency virus.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"595-603"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11533687/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141901175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimands in clinical trials of complex disease processes.
Pub Date : 2024-10-01 Epub Date: 2024-08-24 DOI: 10.1177/17407745241268054
Richard J Cook, Jerald F Lawless
Clinical trials with random assignment of treatment provide evidence about causal effects of an experimental treatment compared to standard care. However, when disease processes involve multiple types of possibly semi-competing events, specification of target estimands and causal inferences can be challenging. Intercurrent events such as study withdrawal, the introduction of rescue medication, and death further complicate matters. There has been much discussion about these issues in recent years, but guidance remains ambiguous. Some recommended approaches are formulated in terms of hypothetical settings that have little bearing on the real world. We discuss issues in formulating estimands, beginning with intercurrent events in the context of a linear model and then moving on to more complex disease history processes amenable to multistate modeling. We elucidate the meaning of estimands implicit in some recommended approaches for dealing with intercurrent events and highlight the disconnect between estimands formulated in terms of potential outcomes and the real world.
{"title":"Estimands in clinical trials of complex disease processes.","authors":"Richard J Cook, Jerald F Lawless","doi":"10.1177/17407745241268054","DOIUrl":"10.1177/17407745241268054","url":null,"abstract":"<p><p>Clinical trials with random assignment of treatment provide evidence about causal effects of an experimental treatment compared to standard care. However, when disease processes involve multiple types of possibly semi-competing events, specification of target estimands and causal inferences can be challenging. Intercurrent events such as study withdrawal, the introduction of rescue medication, and death further complicate matters. There has been much discussion about these issues in recent years, but guidance remains ambiguous. Some recommended approaches are formulated in terms of hypothetical settings that have little bearing in the real world. We discuss issues in formulating estimands, beginning with intercurrent events in the context of a linear model and then move on to more complex disease history processes amenable to multistate modeling. We elucidate the meaning of estimands implicit in some recommended approaches for dealing with intercurrent events and highlight the disconnect between estimands formulated in terms of potential outcomes and the real world.</p>","PeriodicalId":10685,"journal":{"name":"Clinical Trials","volume":" ","pages":"604-611"},"PeriodicalIF":2.2,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11528884/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142046433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}