Background: Partially clustered trials are trials that, by design, include a mixture of independent and clustered observations. For example, neonatal trials may include infants from a single, twin or triplet birth. The clustering of observations in partially clustered trials should be accounted for when determining the target sample size to avoid treatment arm comparisons being over- or underpowered. Limited tools are currently available for calculating the sample size for partially clustered trials, particularly when the maximum cluster size is greater than 2. The aim of this article is to introduce a new online application to calculate the target sample size for partially clustered trials covering a broad range of scenarios.
Methods: The target sample size is calculated using design effects recently derived for two-arm partially clustered trials when the clusters exist prior to randomisation and the outcome of interest is continuous or binary. Both cluster and individual randomisation are considered for the clustered observations (resulting in nested and crossed designs, respectively). The sample size depends on quantities needed for typical sample size calculations, such as the effect size of interest, and the desired significance level and power. In addition, the sample size for partially clustered trials also depends on the range of cluster sizes, the proportion of observations that belong to clusters of each size, the intracluster correlation coefficient, the method of randomisation for the clustered observations, and the model that will be used for analysis. We developed an R Shiny web application that implements these methods in an easy-to-use sample size calculator that is freely available online.
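The general logic of inflating an independent-design sample size by a design effect can be sketched as follows. This is a minimal illustration, not the calculator's implementation: the function names are hypothetical, the independent-design formula is the standard normal-approximation formula for comparing two means with 1:1 allocation, and the design effect is treated as a user-supplied number because the design effects derived for partially clustered trials are not reproduced in this abstract.

```python
from math import ceil
from statistics import NormalDist


def independent_total_n(delta, sd, alpha=0.05, power=0.9):
    """Total sample size (both arms, 1:1 allocation) for a two-sided
    z-test comparing two means, assuming independent observations.

    delta: difference in means to detect; sd: common standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    per_arm = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return 2 * per_arm


def inflate_for_clustering(n_independent, design_effect):
    """Multiply the independent-design sample size by a design effect
    and round up. The design effect is an input here (assumption): in
    practice it depends on the cluster-size distribution, the intracluster
    correlation, the randomisation method and the analysis model."""
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return ceil(n_independent * design_effect)


if __name__ == "__main__":
    n_ind = independent_total_n(delta=0.5, sd=1.0, alpha=0.05, power=0.9)
    # 1.1 is a purely hypothetical design effect chosen for illustration.
    print(inflate_for_clustering(n_ind, design_effect=1.1))
```

A design effect below 1 (possible in some crossed designs, where clustering can increase precision) would reduce the target sample size instead of increasing it, which is why the sketch only requires the value to be positive.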
Results: The sample size calculator is free to access and provides trialists with the ability to determine the target sample size for different types of partially clustered trials. Step-by-step instructions are provided to illustrate the use of the calculator for designing two hypothetical trials. The target sample size that accounts for partial clustering can be quite different to the sample size that is calculated by methods for an independent design that ignore the clustering.
Conclusion: Partial clustering affects the power and sample size requirements of clinical trials. The calculator presented in this article allows trialists to account for the clustering that occurs in two-arm partially clustered trials for binary and continuous outcomes and ensure their trials are appropriately powered.
Background: Intrapartum research (occurring during labour and birth) presents challenges to successful recruitment to clinical trials. These include limited time for discussion, decision-making for two (mother and baby), heightened emotional states (pain, anxiety and/or fatigue) and clinician hesitancy to discuss research in this setting. In the context of the Baby head ElevAtion Device Feasibility Study, where the event of interest (caesarean section at full dilatation) is both rare (fewer than 3% of all births) and unpredictable, we undertook a mixed-methods evaluation of the two-stage consent process: (1) abbreviated intrapartum consent and (2) full postpartum consent. The aim was to explore whether abbreviated intrapartum consent was acceptable to patients and clinicians.
Methods: Eligible patients approached at full cervical dilatation (10 cm) to take part in the Baby head ElevAtion Device Feasibility Study were invited to complete a face-to-face survey of their experience of consent within 3 days after birth. We sampled those who consented and those who declined the study. Clinicians working at recruitment sites were invited to an individual semi-structured interview. Qualitative data were analysed using reflexive thematic analysis.
Results: Over 12 months, 69% (128/186) of eligible patients consented to the Baby head ElevAtion Device Feasibility Study; 87% of consenters and 66% of decliners completed a follow-up survey. Most survey responders (78%) and clinicians found abbreviated intrapartum consent acceptable. Three themes shaped patient decision-making: perceived benefits, trust in healthcare, and feeling overwhelmed. Those who declined often wished they'd had more time or earlier information. Clinicians found the two-stage consent process feasible and appropriate for low-risk interventions, although time pressures and communication challenges affected consent quality. Many saw the model as respectful of autonomy and potentially useful for future intrapartum research.
Conclusion: Our findings suggest that the two-stage consent process for this intrapartum study was acceptable to both patients and clinicians. We propose this as a useful consent model for peripartum studies where the clinical situation occurs infrequently, the intervention being studied is low risk, and where opt-out or deferred consent is not available.
Introduction: Conducting systematic reviews of clinical trials is time-consuming and resource-intensive. One potential solution is to design databases that are continuously and automatically populated with clinical trial data from harmonised and structured datasets. This scoping review aimed to identify and map publicly available, continuously updated, topic-specific databases of clinical trials.
Methods: We systematically searched PubMed, Embase, the preprint servers medRxiv, arXiv, Open Science Framework, and Google. We characterised each database using seven predefined features (access model, database type, data input sources, retrieval methods, data-extraction methods, trial presentation, and export options) and narratively summarised the results.
Results: We identified 14 continuously updated databases of clinical trials, seven related to COVID-19 (initiated in 2020) and seven non-COVID-19 databases (initiated as early as 2009). All databases, except one, were publicly funded and accessible without restrictions. Most relied on traditional methods used in static article-based systematic reviews, sourcing data from journal publications and trial registries. The COVID-19 databases and some non-COVID-19 databases implemented semi-automated features of data import, which combined automated and manual data curation, whereas the non-COVID-19 databases mainly relied on manual workflows. Most reported information was metadata, such as author names, years of publication, and links to the publication or trial registry. Only two databases included trial appraisal information (such as risk of bias assessments). Six databases reported aggregate group-level results, but only one database provided individual participant data on request.
Discussion: Continuously updated topic-specific databases of clinical trials remain limited in number, and existing initiatives mainly employ traditional static systematic review methodologies. A key barrier to developing truly living platforms is the lack of accessible, machine-readable, and standardised clinical trial data.
Background: There is growing recognition of the importance of patient-reported tolerability in complementing traditional clinician-reported safety evaluation of cancer therapies. Recent regulatory guidance listed the evaluation of overall side effect impact as a core patient-reported outcome in oncology clinical trials. A single item ('GP5') that asks about side effect bother is included in the Functional Assessment of Chronic Illness Therapy and has been used to capture overall side effect impact. This paper sought to expand the evidence base for GP5 by examining its association with clinician-reported treatment-emergent adverse events and patient-reported global health.
Methods: We examined six commercial cancer clinical trials that collected GP5. The patient population was drawn from the safety population and the analysis focused on the first on-treatment assessment. Clinician-reported adverse events were classified as symptomatic if such adverse events were considered amenable to patient self-reporting (e.g. nausea). Chi-square tests and Pearson's correlation were used to examine associations. We considered adverse event grade and frequency, both for symptomatic adverse events and any type of adverse events. Global health was measured using the visual analogue scale of the EuroQol-5 Dimensions-3 Levels measure. 'Moderate-severe' bother was characterised as scores of 2-4 on a 0-4 point scale for GP5, and 'severe' bother was characterised as scores of 3-4. Analyses were conducted separately for each trial.
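The GP5 cut-offs and the correlation step described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the scores in the usage example are made-up values, not trial data.

```python
from math import sqrt


def bother_category(gp5_score):
    """Classify a GP5 response (0-4 scale) using the cut-offs stated in
    the text: 2-4 is 'moderate-severe' bother, 3-4 is 'severe' bother."""
    if gp5_score not in (0, 1, 2, 3, 4):
        raise ValueError("GP5 is scored 0-4")
    return {"moderate_severe": gp5_score >= 2, "severe": gp5_score >= 3}


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical GP5 scores and visual analogue scale values.
    gp5 = [0, 1, 2, 3, 4, 1, 0]
    vas = [85, 80, 60, 55, 40, 75, 90]
    print(sum(bother_category(s)["moderate_severe"] for s in gp5))
    print(round(pearson_r(gp5, vas), 2))
```

A negative coefficient is the expected direction here, since higher GP5 scores indicate more bother and higher visual analogue scale values indicate better global health.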
Results: Data from 3,557 patients were included. Across the trials, most (71.7%-94.2%) patients had an adverse event of some kind, but fewer (17.1%-44.4%) had an adverse event of grade 3 or higher. In general, fewer than 50% of patients (20.6%-44.2%) reported moderate-severe bother and 5.8%-17.% reported severe bother. There were consistent, albeit not always statistically significant, associations between GP5 and adverse events, and GP5/global health correlations ranged from -0.17 to -0.41.
Discussion: GP5 is associated with both clinician- and patient-reported symptoms, suggesting its validity and usefulness as part of comprehensive tolerability assessment of cancer trials.
Background: In randomized trials where some standard-treatment arm patients cross to the experimental treatment, it is frequently of interest to estimate the between-arm survival difference as if no patients on the standard-treatment arm had crossed over to the experimental treatment. Rank-preserving structural failure time models, an extension of semiparametric accelerated-failure-time models, are a popular method for accomplishing this because they do not require modeling which patients will cross over.
Methods: In trying to apply the rank-preserving structural failure time model in practice, we noted some unusual behavior of the estimated acceleration parameter (differential treatment effect). Simple examples and limited simulations are provided to examine and understand this behavior.
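To make the role of the acceleration parameter concrete, the sketch below computes the counterfactual "never treated" survival time under one common parameterisation of the rank-preserving structural failure time model, in which time spent on the experimental treatment is rescaled by exp(psi). The parameterisation, sign convention and function name are assumptions made for illustration; the abstract does not state which form the authors used, and the sketch does not implement estimation of psi.

```python
from math import exp, log


def counterfactual_untreated_time(time_off, time_on, psi):
    """Counterfactual survival time had the patient never received the
    experimental treatment, assuming U = T_off + exp(psi) * T_on.

    time_off: observed time spent off the experimental treatment
    time_on:  observed time spent on the experimental treatment
    psi:      acceleration parameter; under this convention psi < 0
              means the treatment extends survival.
    """
    if time_off < 0 or time_on < 0:
        raise ValueError("times must be non-negative")
    return time_off + exp(psi) * time_on


if __name__ == "__main__":
    # Hypothetical standard-arm patient who crossed over at 4 months and
    # died at 10 months; psi = -log(2) means treated time counts half.
    print(counterfactual_untreated_time(4.0, 6.0, psi=-log(2)))
```

Because exp(psi) multiplies the treated time, extreme estimates of psi translate into very large or near-zero rescaling of the post-crossover period.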
Results: The simulations show that the rank-preserving structural failure time model estimator of the acceleration parameter can take on extreme values, especially when the intent-to-treat analysis favors the standard-treatment arm. Furthermore, the addition of censoring is paradoxically shown to reduce the estimator's variability compared to the uncensored data when the underlying observations are exponentially distributed. Use of a Weibull distribution with short tails for the survival times eliminates this unusual behavior.
Conclusion: The rank-preserving structural failure time model estimators of the acceleration parameter are not based on the joint ranks of the original data, and it is suggested that this makes the acceleration-parameter estimator unstable with long-tailed survival distributions.
In May 2023, the US Food and Drug Administration released a guidance document on adjusting for covariates in randomized clinical trials for drugs and biological products. This article provides a summary of motivations for the US Food and Drug Administration guidance document, recommendations in the guidance document, considerations for covariate adjustment in large trials and small trials, and additional topics beyond the scope of the guidance document that may benefit from greater consensus on best practices. A covariate-adjusted prespecified primary analysis can have advantages over an unadjusted analysis and is generally acceptable to the US Food and Drug Administration.
Background/aims: To assess pre- and postnatal factors associated with participation in a randomized clinical trial of daily docosahexaenoic acid supplementation in toddlers born preterm. We hypothesized that enrolled families would not differ from those who did not participate.
Method: Children eligible for the Omega Tots trial were born at <35 completed weeks' gestation and were 10-16 months of age at recruitment. Eligibility data abstracted from the medical record were linked with the child's birth certificate. The primary outcome was whether the family enrolled, declined, or was non-responsive to recruitment efforts. Log-binomial regression was used to calculate risk ratios (RR).
Results: 316 families enrolled, 1089 declined, and 1081 were non-responsive. Enrolling, rather than not enrolling, was negatively associated with caregivers being married (RR = 0.76, 95% CI: 0.62, 0.94), identifying as White (RR = 0.76, 95% CI: 0.60, 0.94), and children being born at later gestational ages (RR per 1-week increase = 0.96, 95% CI: 0.92, 0.99); it was positively associated with children weighing <1500 g at birth (RR = 1.26, 95% CI: 1.01, 1.55), attending a neonatology specialty clinic (RR = 1.46, 95% CI: 1.19, 1.80), family participation in WIC (RR = 1.39, 95% CI: 1.13, 1.72), and living in an urban zip code (RR = 1.68, 95% CI: 1.30, 2.17). Varied associations with enrolling rather than declining, enrolling rather than being non-responsive, and declining rather than being non-responsive were identified.
Conclusions: Maternal, child, and socioeconomic characteristics were different for families who enrolled, relative to families who did not enroll. Factors associated with enrollment differed between families who were non-responsive to recruitment attempts and those who declined enrollment, with additional differences identified between families who declined participation and those who were non-responsive. Recruitment initiatives tailored to ensuring enrollees reflect the source population may improve generalizability.
Background: Desirability of outcome ranking (DOOR) is a paradigm for the design, monitoring, analysis, interpretation, and reporting of clinical trials based on patient-centric benefit-risk evaluation, developed to address limitations of existing approaches and advance clinical trial science. The first step in implementing DOOR is defining an ordinal DOOR outcome representing a global patient-centric response, a cumulative summary of the benefits and harms for an individual patient. This article aims to develop an analysis methodology for the setting where the DOOR outcome is a progressive time-varying state, and there is interest in event times and times that patients spend in more and less desirable states.
Methods: We develop methods to estimate and make inferences about the temporal treatment effects. If the k levels of the DOOR outcome are monotone, then k - 1 non-overlapping Kaplan-Meier survival curves can be estimated and plotted. The areas under the curves asymptotically follow a multivariate Gaussian distribution. We apply restricted mean survival time (RMST) concepts to the ordinal Kaplan-Meier curves and provide steps for estimating the covariance structure.
Results: Simulation studies demonstrate that the proposed methods perform well in practical settings. We generate censoring times under a uniform distribution and event times under a multi-state structure. The proposed estimators have small biases, the 95% confidence intervals have correct coverage probabilities, and the proposed tests accurately control the type I error rate under the null hypothesis. We illustrate the methods using data from the Adaptive COVID-19 Treatment Trial (ACTT-1), a clinical trial that compared remdesivir vs placebo for the treatment of COVID-19 infection.
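The restricted mean survival time for a single curve is the area under the Kaplan-Meier estimate up to a truncation time tau. The sketch below shows that calculation for one curve only; it is a minimal illustration with hypothetical function names, and it does not implement the joint inference or covariance estimation across the k - 1 ordinal curves described above.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event_time, survival) steps.

    times:  observed follow-up times
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            steps.append((t, survival))
        at_risk -= n_leaving
    return steps


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_time, prev_surv = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_surv * (t - prev_time)
        prev_time, prev_surv = t, s
    # Final rectangle from the last step before tau up to tau itself.
    return area + prev_surv * (tau - prev_time)


if __name__ == "__main__":
    # Hypothetical follow-up data: the second subject is censored at time 2.
    print(rmst([1, 2, 3], [1, 0, 1], tau=3))
```

In the ordinal setting, the same calculation could be applied to each of the k - 1 curves, with differences between adjacent areas interpreted as mean time spent in a given state up to tau (an interpretation assumed here rather than taken from the abstract).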
Discussion: Ordinal DOOR outcomes, which incorporate benefits and harms and represent an overall patient response, have recently been recommended by the Council for International Organizations of Medical Sciences (CIOMS) as a standard approach to benefit:risk analysis. Such endpoints recognize the cumulative nature of outcomes on patients, account for correlations between efficacy and safety, incorporate multivariate survival outcomes, offer generalizability to inform clinical practice, and recognize finer gradations of patient response and binary outcomes. Robust and interpretable analysis methodologies for ordinal outcomes are needed.
Conclusion: Restricted mean survival time is a useful nonparametric approach for robust treatment effect estimation. We provide a framework for inference using multiple RMSTs to analyze DOOR and other ordinal outcomes using an interpretable time metric.

