Clinical Trials: Latest Articles

Comment on: Shaping the future of clinical trials through strategic foresight.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-26; DOI: 10.1177/17407745251414681
Brian L Wiens
Citations: 0
Response-adaptive randomization with imperfect intermediate endpoints.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-25; DOI: 10.1177/17407745261415792
Yousra Kherabi, Michael A Proschan, Lori E Dodd

Background: Response-adaptive randomization is controversial even in the best circumstances when based on a quickly determined primary outcome. In disease settings in which the primary outcome requires long follow-up, an intermediate endpoint may be chosen to update randomization allocations. The aim of our study is to evaluate the impact of response-adaptive randomization applied to an imperfect intermediate endpoint. We use tuberculosis trials as the motivating example.

Methods: We simulated a response-adaptive randomization design, adapting randomization allocations using an imperfect intermediate endpoint, in a superiority trial of two experimental regimens and one control arm. The primary study outcome was treatment success after 73 weeks from randomization; the intermediate endpoint was culture conversion at 8 weeks. We compared different sensitivity (Se) and specificity (Spe) scenarios for the intermediate endpoint, while varying the true treatment efficacy. We evaluated the performance of response-adaptive randomization to achieve its primary goal of allocating more participants to the better arm and the impact of time-trends on type I error rate.

Results: Even in an ideal state of perfect accuracy (i.e. intermediate endpoint with Se = 100% and Spe = 100%), response-adaptive randomization did not always live up to its main purpose of allocating more patients to the better arm. Lower accuracy of the intermediate endpoint leads to greater divergence from the goal of more allocations to the better arm. The larger the difference in treatment efficacy between the arms, the more striking the impact of an intermediate endpoint with poor diagnostic accuracy. Time-trends inflate the type I error rate, and while stratified tests can correct this, they do so at the cost of a power loss. Allocating more patients to the worst arm increases power for comparisons with this arm but reduces power for comparisons of the best arm to control.

Conclusion: Given the objective of evaluating several new therapeutic regimens in a timely manner, response-adaptive randomization is tempting. However, it requires at least reliance on highly accurate intermediate endpoints, which are still no guarantee of response-adaptive randomization's trustworthiness.
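
One way to see why the accuracy of the intermediate endpoint matters is sketched below. It assumes the intermediate endpoint behaves like a binary test for eventual treatment success with a given Se and Spe; the abstract does not describe the authors' simulation model, so the function names and the numbers are illustrative only. Under that assumption the apparent difference between two arms shrinks by the factor Se + Spe - 1, which leaves the adaptive allocation with a weaker signal to act on.

```python
def apparent_response_rate(p_success: float, se: float, spe: float) -> float:
    """Probability that the intermediate endpoint is positive.

    Assumption (not from the abstract): the intermediate endpoint is an
    imperfect binary indicator of eventual treatment success, with
    sensitivity `se` and specificity `spe`.
    """
    true_positives = se * p_success
    false_positives = (1.0 - spe) * (1.0 - p_success)
    return true_positives + false_positives


def apparent_difference(p_a: float, p_b: float, se: float, spe: float) -> float:
    """Difference between two arms as seen through the intermediate endpoint."""
    return apparent_response_rate(p_a, se, spe) - apparent_response_rate(p_b, se, spe)


# Hypothetical success probabilities for two arms.
perfect = apparent_difference(0.9, 0.7, se=1.0, spe=1.0)
imperfect = apparent_difference(0.9, 0.7, se=0.8, spe=0.7)
print(f"perfect endpoint: {perfect:.2f}, imperfect endpoint: {imperfect:.2f}")
```

With these hypothetical inputs the true difference of 0.20 is seen as 0.10 through the imperfect endpoint, since Se + Spe - 1 = 0.5.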

Citations: 0
Stakeholder views about the responsibilities of principal investigators in multicenter randomized controlled trials.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-23; DOI: 10.1177/17407745261417337
Steven Joffe, Elizabeth F Bair, Katharine A Gleason, Deborah E Sellers, Sarah A McGraw, Cary P Gross, Donna T Chen, Eric G Campbell, Michelle M Mello

Background/aims: Clinical trials are commonly believed to benefit from the involvement of an academic principal investigator who accepts responsibility for design, conduct, and reporting. Little evidence exists, however, about the importance that diverse stakeholders assign to the principal investigator's role in leadership of trials. Furthermore, few studies have examined whether and how beliefs about the role of the principal investigator might vary by funding source.

Methods: We conducted parallel Delphi panel surveys with seven stakeholder groups (principal investigators, patient advocates, journal editors, public funders, industry representatives, United States Food and Drug Administration officials, and clinical trial cooperative-group chairs) to assess the extent to which respondents believed leadership of a multicenter randomized controlled trial by an academic principal investigator to be important, considering publicly and industry-funded trials separately. We then surveyed an international sample of principal investigators (N = 92) who had recently published a multicenter randomized controlled trial in a high-impact general medical, oncology, cardiovascular, or psychiatry journal to assess their normative views on the importance of the academic principal investigator in leading both publicly and industry-funded trials.

Results: Several patterns emerged from the Delphi panel surveys. First, panelists viewed involvement of an identified academic principal investigator as most important at the design and planning and the interpretation and dissemination phases of a trial, as compared with the implementation and data collection phase. Second, panelists generally viewed involvement of an identified academic principal investigator as more important in publicly funded than in industry-funded trials. Finally, panelists representing industry stakeholders and United States Food and Drug Administration officials viewed involvement of an identified academic principal investigator as less important, especially for industry-funded trials, than did other groups. Respondents to the normative principal investigator survey generally endorsed the importance of academic principal investigators in leading multicenter randomized controlled trials, both overall (median rating 6 on the 0-6 point scale) and for trial-specific tasks. Both overall and with respect to specific tasks, however, respondents viewed an academic principal investigator's leadership as more important when considering publicly funded as compared with industry-funded trials.

Conclusion: Although members of most stakeholder groups participating in Delphi surveys view involvement of an academic principal investigator with overall responsibility for a multicenter randomized controlled trial as very important, there are notable differences depending on the respondent's perspective, the specific trial-related task, and the trial's funding source. In addition, principal investigators generally view an academic principal investigator as more important to the validity of publicly funded multicenter randomized controlled trials than to that of industry-funded trials. These findings highlight the need to clarify real-world practices of clinical trial leadership across settings and to assess how those practices align with widely shared norms.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12944536/pdf/
Citations: 0
Confidence interval estimation for the win probability in cluster randomized trials with hierarchical composite endpoints using win fractions.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-23; DOI: 10.1177/17407745261417308
Emma Davies Smith, Yun-Hee Choi, Vipul Jairath, Guangyong Zou

Background/Aims: Cluster randomized trials with multiple endpoints feature complex correlation structures. Estimating treatment effects in a meaningful way that respects differences in scale and clinical importance is challenging. Pairwise comparison methods address these challenges by constructing all pairs featuring one treatment and one control participant, then evaluating endpoints in hierarchical order. For cluster randomized trials featuring such a "hierarchical composite endpoint," we develop large-sample confidence interval estimators and hypothesis tests for the nonparametric treatment effect referred to here as the "win probability."

Methods: For each pair of participants (one treated and one control), responses on each endpoint are compared in order of descending clinical importance until it can be determined which participant responded better ("won") or all endpoints are exhausted. Dividing the number of wins attributed to the treatment arm by the total number of "pairwise comparisons" yields a point estimate of the win probability. The win probability can be transformed into alternative effect measures, including the "win difference" and "win odds." A two-stage procedure, or "win fraction" approach, is used to obtain variance estimators for the win probability. Each participant's multivariate response is transformed into a univariate "win fraction," which quantifies the proportion of times they won when compared to all participants in the comparison arm. A working linear mixed model is applied to the win fractions to obtain cluster-adjusted point estimates of the win probability and its variance. Inference proceeds by the central limit theorem. Simulation is used to assess the performance of the proposed estimators for a hierarchical composite endpoint comprised of one binary component (more important) and one continuous component (less important) across a range of cluster trial designs. Performance of an empirical bootstrap estimator is also investigated. A case study using data from the REACT cluster trial demonstrates application of the methods, and corresponding SAS and R code is provided.

Results: Simulation suggests that the nominal 95% coverage probability is well maintained and type I error is controlled. Due to the large-sample nature of our method, confidence intervals may be conservative (over coverage) for fewer than 30 clusters. In comparison, the empirical bootstrap estimator is liberal (under coverage) for all numbers of randomized clusters (up to 50).

Conclusion: Our win fraction method uses a working linear mixed model to obtain confidence intervals and hypothesis tests which respect coverage and type I error. It is faster than the bootstrap, applicable to multiple components on different scales, bypasses specification of complex correlation matrices, permits adjustment, and can be implemented in existing software.
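
The pairwise comparison and win fraction steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' SAS or R code: it ignores clustering entirely (no working linear mixed model, no variance estimate), assumes larger values are better on every endpoint, and counts a pair tied on all endpoints as half a win, which is a common convention but an assumption here. The data are made up.

```python
def compare(a, b):
    """Score participant a against participant b.

    Each participant is a tuple of endpoint values ordered from most to
    least clinically important; larger is assumed to be better.
    Returns 1.0 for a win, 0.0 for a loss, 0.5 if tied on every endpoint
    (the tie convention is an assumption of this sketch).
    """
    for x, y in zip(a, b):
        if x > y:
            return 1.0
        if x < y:
            return 0.0
    return 0.5


def win_fractions(arm, comparison_arm):
    """Proportion of the comparison arm that each participant in `arm` beats."""
    return [
        sum(compare(p, q) for q in comparison_arm) / len(comparison_arm)
        for p in arm
    ]


def win_probability(treated, control):
    """Mean win fraction in the treated arm (all treated-control pairs)."""
    fractions = win_fractions(treated, control)
    return sum(fractions) / len(fractions)


def win_odds(wp):
    """Win odds as the transformation wp / (1 - wp)."""
    return wp / (1.0 - wp)


# Hypothetical data: (binary component, continuous component).
treated = [(1, 5.0), (1, 3.0), (0, 4.0)]
control = [(1, 4.0), (0, 6.0), (0, 4.0)]

wp = win_probability(treated, control)
print(f"win probability = {wp:.3f}, win odds = {win_odds(wp):.3f}")
```

In a cluster randomized trial the per-participant win fractions, rather than their simple mean, would be the input to the cluster-adjusted model the abstract describes.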

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12931659/pdf/
Citations: 0
Use of estimands in cluster randomised trials: A review.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-17; DOI: 10.1177/17407745251415538
Dongquan Bi, Andrew Copas, Brennan C Kahan

Background: An estimand is a clear description of the treatment effect a study aims to quantify. The ICH E9(R1) addendum lists five attributes that should be described as part of the estimand definition. However, the addendum was primarily developed for individually randomised trials. Cluster randomised trials, in which groups of individuals are randomised, have additional considerations for defining estimands (e.g. how individuals and clusters are weighted, how cluster-level intercurrent events are handled). However, it is currently unknown if estimands are being used in cluster randomised trials, or whether the considerations specific to cluster randomised trials are being described.

Methods: We reviewed 73 cluster randomised trials published between October 2023 and January 2024 that were indexed in MEDLINE. For each trial, we assessed whether the estimand for the primary outcome was described, or if not, whether it could be inferred from the statistical methods. We also assessed whether considerations specific to cluster randomised trials were described or inferable, how trials were analysed and whether key assumptions being made in the analysis (e.g. 'no informative cluster size') could be identified.

Results: No trials attempted to describe the estimand for their primary outcome. We were able to infer the five attributes outlined in ICH E9(R1) in only 49% of trials, and when including additional considerations specific to cluster randomised trials, this figure dropped to 21%. Key drivers of this ambiguity were lack of clarity around whether individual- or cluster-average effects were of interest (unclear in 63% of trials), and how cluster-level intercurrent events were handled (unclear in 21% of trials for which this was applicable). Over half of trials used mixed-effects models or generalised estimating equations with an exchangeable correlation structure, which assume that there is no informative cluster size; however, only one of these trials performed sensitivity analyses to evaluate robustness of results to deviations from this assumption. A further 14% of trials used independence estimating equations or the analysis of cluster-level summaries; however, because no trials stated whether they were targeting the individual- or cluster-average effect, it was impossible to determine whether these methods implemented the appropriate weighting scheme and were thus unbiased.

Conclusion: The uptake of estimands in published cluster randomised trial articles is low, making it difficult to ascertain which questions were being investigated or whether statistical estimators were appropriate for those questions. This highlights an urgent need to develop guidelines on defining estimands that cover unique aspects of cluster randomised trials to ensure clarity of research questions in these trials.
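
The distinction between individual-average and cluster-average effects can be made concrete with a small sketch. The cluster sizes and cluster-specific effects below are invented for illustration, and the function names are hypothetical; the point is only that the two weightings give different answers when cluster size is informative, that is, when the effect varies with cluster size.

```python
def individual_average(cluster_sizes, cluster_effects):
    """Weight each cluster's effect by its number of participants."""
    total = sum(cluster_sizes)
    weighted = sum(n * e for n, e in zip(cluster_sizes, cluster_effects))
    return weighted / total


def cluster_average(cluster_effects):
    """Weight every cluster equally, regardless of size."""
    return sum(cluster_effects) / len(cluster_effects)


# Hypothetical trial: two small clusters with a large effect,
# one large cluster with a small effect.
sizes = [10, 10, 80]
effects = [0.30, 0.30, 0.05]

print(f"individual-average effect: {individual_average(sizes, effects):.3f}")
print(f"cluster-average effect:    {cluster_average(effects):.3f}")
```

If every cluster had the same effect, or if effect and size were unrelated on average, the two estimands would coincide, which is why the choice only becomes visible when cluster size is informative.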

Citations: 0
Event-driven planning of two-armed trials with a binary endpoint.
IF 2.2; Zone 3, Medicine; Q3 MEDICINE, RESEARCH & EXPERIMENTAL; Pub Date: 2026-02-16; DOI: 10.1177/17407745251415535
Erica H Brittain, Raphaël N Morsomme, Michael A Proschan

Background/aims: In randomized two-armed clinical trials with binary endpoints, there may be uncertainty about the event probability, which is needed for sample size calculation. Survival trials are powered based on number of events rather than people, and this is advantageous because the number of events needed to achieve a desired power is less sensitive to an unknown parameter than is the number of people needed. We investigate and quantify this relative stability of number of events compared to number of people in the context of a randomized two-armed trial with equal sample sizes and a binary endpoint. In binary endpoint settings with such relative stability, we consider (1) enhancement of traditional adaptive trial design and (2) potential benefits of a simple event-driven strategy.

Methods: Using sample size formulas, we determine the relative stability of the expected number of events compared to the sample size for binary outcome trials using the relative risk, odds ratio, or risk difference. Simulations consider a simple event-driven design when there is relative stability; we evaluate type I error rate and power under various analysis methods and approaches to halting the trial.

Results: We find that the number of events is at least three times more stable than the sample size to achieve a specified power for the relative risk when the overall event probability is less than 1/3, and for the odds ratio when the overall event probability is less than 0.20. We show that this relative stability is independent of the type I and type II error rates and magnitude of the treatment effect. In a setting where the overall event probability is consistent with relative stability, simulations of an event-driven design show that asymptotic methods may have modestly high type I error rates, but that other approaches appear to have good operating characteristics.

Conclusion: In settings with moderately low event probabilities, thinking in terms of the number of events instead of sample size may (1) facilitate the planning of clinical trials and help determine whether a trial is futile, and (2) lead to a simple event-driven design for binary endpoints that may be feasible and appealing.
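
The stability described above can be illustrated with a minimal sketch. It assumes the textbook unpooled-variance sample size formula for a two-sided two-proportion z-test, which is not necessarily the formula the authors used; the function names, the relative risk of 0.7, and the candidate control-arm probabilities are hypothetical planning values chosen only for illustration.

```python
from statistics import NormalDist


def per_arm_sample_size(p_control, relative_risk, alpha=0.05, power=0.90):
    """Approximate per-arm n for a two-sided two-proportion z-test (unpooled variance)."""
    p_treat = relative_risk * p_control
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return z_total ** 2 * variance / (p_control - p_treat) ** 2


def expected_events(p_control, relative_risk, alpha=0.05, power=0.90):
    """Expected total events across both arms at the planned per-arm sample size."""
    n = per_arm_sample_size(p_control, relative_risk, alpha, power)
    return n * (p_control + relative_risk * p_control)


# Hypothetical planning values: fixed relative risk, uncertain control-arm probability.
for p_control in (0.05, 0.10, 0.20):
    n = per_arm_sample_size(p_control, 0.7)
    events = expected_events(p_control, 0.7)
    print(f"p_control={p_control:.2f}  n per arm={n:8.0f}  expected events={events:6.0f}")
```

Under this assumed formula, moving the control-arm event probability from 0.05 to 0.10 roughly halves the required per-arm sample size while changing the expected number of events by only about 5%, which is the qualitative pattern the abstract describes.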

Citations: 0
A modular pipeline for natural language processing-screened human abstraction of a pragmatic trial outcome from electronic health records.
IF 2.2 Zone 3 Medicine Q3 MEDICINE, RESEARCH & EXPERIMENTAL Pub Date: 2026-02-12 DOI: 10.1177/17407745251405386
Robert Y Lee, Kevin S Li, James Sibley, Trevor Cohen, William B Lober, Janaki O'Brien, Nicole LeDuc, Kasey Mallon Andrews, Anna Ungar, Jessica Walsh, Elizabeth L Nielsen, Danae G Dotolo, Erin K Kross

Background: Natural language processing allows efficient extraction of clinical variables and outcomes from electronic health records (EHRs). However, measuring pragmatic clinical trial outcomes may demand accuracy that exceeds natural language processing performance. Combining natural language processing with human adjudication can address this gap, yet few software solutions support such workflows. We developed a modular, scalable system for natural language processing-screened human abstraction to measure the primary outcomes of two clinical trials.

Methods: In two clinical trials of hospitalized patients with serious illness, a deep-learning natural language processing model screened electronic health record passages for documented goals-of-care discussions. Screen-positive passages were referred for human adjudication using a REDCap-based system to measure the trial outcomes. Dynamic pooling of passages using structured query language within the REDCap database reduced unnecessary abstraction while ensuring data completeness.

Results: In the first trial (N = 2512), natural language processing identified 22,187 screen-positive passages (0.8%) from 2.6 million electronic health record passages. Human reviewers adjudicated 7494 passages over 34.3 abstractor-hours to measure the cumulative incidence and time to first documented goals-of-care discussion for all patients, with 92.6% patient-level sensitivity. In the second trial (N = 617), natural language processing identified 8952 screen-positive passages (1.6%) from 559,596 passages at a threshold with near-100% sensitivity. Human reviewers adjudicated 3509 passages over 27.9 abstractor-hours to measure the same outcome for all patients.

Discussion: We present the design and source code for a scalable and efficient pipeline for measuring complex electronic health record-derived outcomes using natural language processing-screened human abstraction. This implementation is adaptable to diverse research needs, and its modular pipeline represents a practical middle ground between custom software and commercial platforms.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12912770/pdf/
Citations: 0
Premarket and postmarket real-world evidence studies supporting U.S. Food and Drug Administration regulatory decision-making, 2016-2024.
IF 2.2 Zone 3 Medicine Q3 MEDICINE, RESEARCH & EXPERIMENTAL Pub Date: 2026-02-05 DOI: 10.1177/17407745251415190
Louis Y Li, Reshma Ramachandran, Joseph S Ross, Joshua D Wallach

Background/aims: There is growing interest in leveraging real-world data, such as electronic health records, administrative claims data, and patient registries, to generate real-world evidence studies that support the U.S. Food and Drug Administration's premarket and postmarket regulatory determinations of effectiveness and/or safety for novel therapeutics. We examined the frequency and characteristics of real-world evidence studies used by the U.S. Food and Drug Administration to support premarket determinations of effectiveness and/or safety, as well as those required or requested by the U.S. Food and Drug Administration to be conducted postmarket after approval.

Methods: We identified all novel therapeutics approved by the U.S. Food and Drug Administration between 2016 and 2024, using action packages from the Drugs@FDA database. Product labels, approval letters, and review documents were used to identify real-world evidence studies supporting premarket determinations of effectiveness and/or safety, as well as all postmarketing requirements or commitments outlined at the time of approval. Outcomes included the number of novel therapeutics approved with premarket and/or postmarket real-world evidence studies and characteristics of these studies, including study design, data source, and primary objectives.

Results: From 2016 to 2024, the U.S. Food and Drug Administration approved 400 novel therapeutics for 543 indications, of which 43 (10.8%) had at least one real-world evidence study that supported premarket determinations of effectiveness and/or safety (64 unique studies), and 138 (34.5%) had at least one real-world evidence study required or requested by the U.S. Food and Drug Administration to be conducted postmarket after approval (208 unique studies). Among the 64 unique premarket real-world evidence studies, the most common study designs were non-interventional (observational) studies (35, 54.7%) and externally controlled trials (17, 26.6%); 38 (59.4%) studies utilized electronic health or medical records, and 47 (73.4%) provided evidence on effectiveness. Among the 208 unique postmarket real-world evidence studies, the most common study design was non-interventional (observational) studies (159, 76.4%); 61 (29.3%) studies identified registries as the proposed data source, and 197 (94.7%) were designed to provide evidence on safety alone. The proportion of therapeutics approved with at least one postmarket real-world evidence study increased over time from 2 of 20 (10.0%) in 2016 to 23 of 47 (48.9%) in 2024; however, only 7 (3.4%) of these studies were classified by the U.S. Food and Drug Administration as fulfilled or submitted as of May 2025.

Conclusions: Real-world evidence studies are infrequently used to support the U.S. Food and Drug Administration's premarket determinations of effectiveness and/or safety but have been increasingly required or requested by the U.S. Food and Drug Administration to be conducted postmarket after approval; however, delays in completing postmarket real-world evidence studies may limit their regulatory impact.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12900038/pdf/
Citations: 0
Practical inference for a complier average causal effect in cluster randomised trials with a binary outcome.
IF 2.2 Zone 3 Medicine Q3 MEDICINE, RESEARCH & EXPERIMENTAL Pub Date: 2026-02-01 Epub Date: 2025-10-16 DOI: 10.1177/17407745251378407
Tansy Edwards, Jennifer Thompson, Charles Opondo, Elizabeth Allen

Background: Individual non-compliance with an intervention can occur in cluster randomised trials, and estimating the intervention effect by intention-to-treat (ITT) ignores non-compliance and underestimates efficacy. The effect of the intervention among compliers (the complier average causal effect, CACE) provides an unbiased estimate of efficacy, but inference can be complex in cluster randomised trials.

Methods: We evaluated the performance of a pragmatic bootstrapping approach accounting for clustering to obtain a 95% confidence interval (CI) for a CACE for cluster randomised trials with monotonicity and one-sided non-compliance. We investigated a variety of scenarios for correlated cluster-level prevalence of a binary outcome and non-compliance (5%, 10%, 20%, 30%, 40%). Cluster randomised trials were simulated with the minimum number of clusters to provide at least 80% and at least 90% power, to detect an ITT odds ratio (OR) of 0.5 with 100 individuals per cluster.

Results: Under all non-compliance scenarios (5%-40%), there was negligible bias for the CACE. In the worst case of bias, a true OR of 0.18 was estimated as 0.15 for the rarest outcome (5%) and highest non-compliance (40%). There was no under-coverage of bootstrap CIs. CIs were the correct width for an outcome prevalence of 20%-40% but too wide for a less common outcome. Loss of power for a CACE bootstrap analysis versus ITT regression analysis increased as the prevalence of the outcome decreased across all non-compliance scenarios, particularly for an outcome prevalence of less than 20%.

Conclusions: Our bootstrapping approach provides an accessible and computationally simple method to evaluate efficacy in support of ITT analyses in cluster randomised trials.
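
The general idea of a cluster-level bootstrap for a CACE can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: it works on the risk-difference scale (the abstract reports odds ratios), uses the simple ratio estimator that applies under one-sided non-compliance, monotonicity, and the exclusion restriction, and takes a percentile interval. The toy data generator, all function names, and all parameter values are hypothetical.

```python
import random


def simulate_clusters(n_clusters=20, cluster_size=100, compliance=0.8, seed=1):
    """Toy data: each cluster is (arm, size, n_received, n_events). All values hypothetical."""
    rng = random.Random(seed)
    clusters = []
    for i in range(n_clusters):
        arm = i % 2  # alternate arms for a balanced toy design
        base_risk = min(max(rng.gauss(0.30, 0.05), 0.05), 0.95)  # cluster-level outcome risk
        n_received = n_events = 0
        for _ in range(cluster_size):
            # One-sided non-compliance: only the intervention arm can receive the intervention.
            received = arm == 1 and rng.random() < compliance
            risk = base_risk * (0.5 if received else 1.0)
            n_received += received
            n_events += rng.random() < risk
        clusters.append((arm, cluster_size, n_received, n_events))
    return clusters


def cace_risk_difference(clusters):
    """ITT risk difference divided by the proportion of compliers in the intervention arm."""
    totals = {0: [0, 0, 0], 1: [0, 0, 0]}  # per arm: individuals, received, events
    for arm, size, n_received, n_events in clusters:
        totals[arm][0] += size
        totals[arm][1] += n_received
        totals[arm][2] += n_events
    itt = totals[1][2] / totals[1][0] - totals[0][2] / totals[0][0]
    return itt / (totals[1][1] / totals[1][0])


def cluster_bootstrap_ci(clusters, n_boot=1000, level=0.95, seed=0):
    """Percentile interval from resampling whole clusters with replacement within each arm."""
    rng = random.Random(seed)
    by_arm = {arm: [c for c in clusters if c[0] == arm] for arm in (0, 1)}
    estimates = sorted(
        cace_risk_difference(
            [rng.choice(by_arm[arm]) for arm in (0, 1) for _ in by_arm[arm]]
        )
        for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    return estimates[int(tail * n_boot)], estimates[int((1 - tail) * n_boot) - 1]


clusters = simulate_clusters()
estimate = cace_risk_difference(clusters)
lower, upper = cluster_bootstrap_ci(clusters)
print(f"CACE (risk difference) = {estimate:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Resampling whole clusters rather than individuals is what lets the interval reflect within-cluster correlation; resampling within each arm keeps the arm sizes fixed, which is one reasonable design choice among several.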

Pages: 33-42. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12909608/pdf/
Citations: 0
Charting the content of data monitoring committee charters for clinical trials.
IF 2.2 Zone 3 Medicine Q3 MEDICINE, RESEARCH & EXPERIMENTAL Pub Date: 2026-02-01 Epub Date: 2025-12-10 DOI: 10.1177/17407745251389185
Lisa Eckstein, Akram Ibrahim, Olivia Orr, Annette Rid, Seema K Shah

Background: Data monitoring committees play a critical role in ensuring the ethical conduct of clinical trials. Data monitoring committee charters set out the role and processes for data monitoring committees in monitoring clinical trials; however, little is known about the information charters contain.

Methods: We conducted a summative content analysis of a convenience sample of data monitoring committee charters based on the criteria set out for charters by the DAMOCLES Study Group in 2005. Thirteen charters from public and commercially sponsored clinical trials were obtained for review.

Results: Although the data monitoring committee charters we analyzed broadly satisfied the criteria set out by the DAMOCLES Study Group, some issues warrant further attention. These included variability in the availability of unmasked data for review, communication across data monitoring committees for related trials, post-trial data monitoring committee responsibilities, and a need for more explicit decision-making processes and conflict resolution procedures. Moreover, few of the data monitoring committee charters we were able to analyze included legal protection for members.

Conclusion: Despite limitations due to the difficulties in obtaining data monitoring committee charters, the convenience sample reviewed suggests variability, including in the implementation of some best-practice recommendations. There is a need for further exploration of these issues in a larger sample. Undertaking such research would be assisted by requiring or incentivizing public access to data monitoring committee charters.

Pages: 121-126. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12841360/pdf/
Citations: 0