Statistical noise in PD-(L)1 inhibitor trials: Unravelling the durable-responder effect.

IF 7.3 | Zone 2, Medicine | Q1, Health Care Sciences & Services | Journal of Clinical Epidemiology | Pub Date: 2024-11-04 | DOI: 10.1016/j.jclinepi.2024.111589
Michael Coory, Susan J Jordan

Abstract

Background: Programmed-death-1/ligand-1 inhibitors (PD-1/L1i's) have emerged as pivotal treatments for many cancers. A notable feature of this class of medicines is the dichotomous response pattern: a small (but clinically relevant) percentage of patients (5% to 20%) benefit from deep and durable responses resembling functional cures (durable responders), while most patients experience only a modest or negligible response. Accurately predicting durable responders remains elusive due to the lack of a reliable biomarker. Another notable feature of these medicines is that different PD-1/L1i's have obtained statistically significant results, leading to marketing approval, for some cancer indications but not for others, with no discernible pattern. These puzzling inconsistencies have generated extensive discussion among oncologists. Proposed (but not entirely convincing) explanations include true underlying differences in efficacy for some types of cancer but not others, or subtle differences in trial design.

Objective: To investigate a less-explored hypothesis, the durable-responder effect: an initially unidentified group of durable responders generates more statistical noise than anticipated, leading to low-powered randomised controlled trials (RCTs) that report randomly variable results.
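One way to see why a hidden subgroup can unsettle a trial is to write the treatment arm as a two-component mixture. The block below is only a sketch under assumptions that do not come from the abstract: exponential survival with control hazard $\lambda$, a durable-responder fraction $p$, and constant within-group hazard ratios $\theta_D$ (durable) and $\theta_M$ (modest), with $\theta_D < \theta_M$. All of these symbols are introduced here purely for illustration.

```latex
\begin{align*}
S_T(t)  &= p\,e^{-\theta_D \lambda t} + (1-p)\,e^{-\theta_M \lambda t}, \\
\mathrm{HR}(t) &= \frac{h_T(t)}{\lambda}
   = \frac{p\,\theta_D\,e^{-\theta_D \lambda t} + (1-p)\,\theta_M\,e^{-\theta_M \lambda t}}{S_T(t)}, \\
\mathrm{HR}(0) &= p\,\theta_D + (1-p)\,\theta_M,
\qquad \lim_{t\to\infty}\mathrm{HR}(t) = \theta_D .
\end{align*}
```

Under these assumptions the arm-level hazard ratio is not constant: it starts close to the modest-responder value and drifts towards $\theta_D$ as modest responders die and durable responders make up more of the surviving population. A single reported HR(OS) is therefore an average that depends on follow-up and event timing.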

Study design: Employing simulation, this investigation divides participants in PD-(L)1i RCTs into two groups: durable responders and patients with a more modest response. Drawing on published data for melanoma, lung and urothelial cancers, multiple pre-specified scenarios are each replicated 50,000 times, systematically varying the durable-responder percentage from 5% to 20% and the modest-response hazard ratio for overall survival [HR(OS)] from 0.8 to 1.0. This allows evaluation of the effect of durable responders on power, on point estimates of the treatment effect for OS, and on the probability of a misleading signal for harm.
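The two-group design can be sketched as a small Monte Carlo experiment. This is not the authors' code, and the abstract does not give their survival model, sample sizes or test. The sketch assumes exponential survival, 300 patients per arm, a 12-month control median, a durable-responder hazard ratio of 0.1, administrative censoring at 36 months, and a log-rank test with the Peto one-step estimate of the hazard ratio. Every function name and default value below is hypothetical.

```python
import numpy as np

def logrank(time, event, arm):
    """Return (Peto one-step log HR, log-rank z), treated vs control."""
    order = np.argsort(time)
    event, arm = event[order], arm[order]
    n_at_risk = np.arange(len(time), 0, -1)      # all patients still at risk
    n1_at_risk = np.cumsum(arm[::-1])[::-1]      # treated patients still at risk
    e1 = n1_at_risk / n_at_risk                  # expected treated share per event
    o_minus_e = np.sum((arm - e1)[event])        # observed minus expected
    var = np.sum((e1 * (1 - e1))[event])         # variance (no tied event times)
    return o_minus_e / var, o_minus_e / np.sqrt(var)

def simulate_trial(rng, n_per_arm=300, p_durable=0.05, hr_modest=1.0,
                   median_control=12.0, hr_durable=0.1, followup=36.0):
    """Simulate one trial; all defaults are illustrative assumptions."""
    lam = np.log(2) / median_control             # control hazard per month
    t_ctrl = rng.exponential(1 / lam, n_per_arm)
    durable = rng.random(n_per_arm) < p_durable  # hidden durable responders
    haz_trt = lam * np.where(durable, hr_durable, hr_modest)
    t_trt = rng.exponential(1 / haz_trt)
    time = np.concatenate([t_ctrl, t_trt])
    arm = np.concatenate([np.zeros(n_per_arm), np.ones(n_per_arm)])
    event = time <= followup                     # deaths observed before cut-off
    return logrank(np.minimum(time, followup), event, arm)

def run_scenario(n_reps=2000, seed=0, **kwargs):
    """Replicate one scenario and summarise power, HR estimates and harm signals."""
    rng = np.random.default_rng(seed)
    res = np.array([simulate_trial(rng, **kwargs) for _ in range(n_reps)])
    log_hr, z = res[:, 0], res[:, 1]
    sig = z < -1.96                              # significant in favour of treatment
    return {
        "power": sig.mean(),
        "mean_hr": np.exp(log_hr.mean()),        # geometric mean of HR estimates
        "mean_hr_if_significant":
            np.exp(log_hr[sig].mean()) if sig.any() else float("nan"),
        "prob_hr_above_1": (log_hr > 0).mean(),  # apparent signal for harm
    }

if __name__ == "__main__":
    print(run_scenario(n_reps=2000, p_durable=0.05, hr_modest=1.0))
```

Looping `run_scenario` over `p_durable` values from 0.05 to 0.20 and `hr_modest` values from 0.8 to 1.0 would mirror the grid described above. The numbers it produces depend on the assumed parameters and should not be expected to match the published results.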

Results: When the treatment effect for the modest responders is similar to the comparator arm, statistical power remains below 80%, limiting the ability to reliably detect the benefit contributed by durable responders. Conversely, there is a material probability of obtaining a statistically significant result that exaggerates the treatment effect by chance. For instance, with an average HR(OS) of 0.93 (corresponding to 5% durable responders), the 7.2% of trials that reach statistical significance show an average HR(OS) of 0.77. Additionally, when 5% are durable responders, there is a 20% probability that the HR(OS) will exceed 1.0, suggesting potential harm when none exists.

Conclusion: This paper adds to the possible explanations for the puzzlingly inconsistent results from PD-(L)1i RCTs. Initially unidentified durable responders introduce features typical of imprecise, low-powered studies: a propensity for false-negative results; estimates of benefit that might not replicate; and misleading signals for harm.
