Tanujit Dey, Stuart Lipsitz, Zara Cooper, Debajyoti Sinha, Quoc-Dien Trinh, Alexander Cole, Timothy N. Clinton
<p>In many clinical, biomedical, and epidemiologic cancer studies, the outcome of interest is the time until an event (or failure) occurs, and “survival analysis” is performed to estimate the effect of covariates on survival. Most clinical investigators are familiar with the term “right censoring”. The type of censoring we focus on in this paper is “interval-censoring” [<span>1-3</span>]. Interval-censoring occurs when the event of interest cannot be easily observed but is determined by imaging and/or blood biomarker tests at clinic visits. For example, suppose a patient comes in for imaging every 3 months to detect the event (cancer recurrence). If recurrence is found at a visit, it occurred sometime between that visit and the previous visit at which no recurrence was detected; this is “interval-censoring”. Researchers often code the event date as the visit at which recurrence is detected, but this overestimates time to recurrence, since the event occurred earlier within the interval. Modern statistical software supports proper interval-censoring methods, making it unnecessary to use visit dates as event dates. Since most cancer recurrences are identified through imaging or biomarker tests at clinic visits, interval-censoring methods should be used, at least as a sensitivity analysis to assess the impact of coding the event date as the visit date when recurrence is detected.</p><p>Consider a study of biochemical recurrence in 260 patients with localized prostate cancer who underwent radical prostatectomy [<span>4</span>]. After surgery, prostate-specific antigen (PSA) levels drop to 0 ng/mL, and recurrence is defined as PSA exceeding 0.1 ng/mL. Since PSA is measured only at regular intervals (every 3 or 6 months), the recurrence time is unknown and interval-censored between visits. Clinicians often use the date of the first elevated PSA test as the recurrence date, though the true event occurred between the previous and current visits.
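An interval-censored Cox model assessed the impact of Gleason score, surgical margin, preoperative PSA, and pathologic stage on recurrence time.</p><p>Parameters of survival models are estimated by maximum likelihood (ML), that is, by maximizing a “likelihood”. A patient with an exact failure time <i>t</i> contributes the probability of failure at <i>t</i> to the likelihood. A patient right-censored at time <i>t</i> contributes the probability of surviving beyond <i>t</i>. An interval-censored patient whose event occurs in the interval (<i>L</i>, <i>U</i>) contributes the probability of failure within that interval. After assuming a survival model (e.g., proportional hazards), the probabilities of failure at a given time (no censoring), between two time points (interval-censoring), or beyond a time point (right-censoring) are easily calculated and used in ML.</p><p>Despite the long history of interval-censoring statistical methodology [<span>1-3</span>], until recently, few software packages were available to analyze interval-censored data in a statistically valid manner. As a result, the event date was often assigned to the date of the clinic visit at which the event was identified, allowing the use of survival software that supports only right-censoring. Interval-censoring methods are now readily available in SAS version 9.4 (SAS Institute Inc), R, and Stata version 19 (StataCorp). In SAS, Proc ICLIFETEST is used for Kaplan-Meier analysis of interval-censored data, and Proc ICPHREG for interval-censored proportional hazards models. Kaplan-Meier curves for interval-censored data can be fit using the R package Icens (version 1.81.0). Cox proportional hazards models for interval-censored data can be estimated with the R package icenReg (version 2.0.16). In Stata, the intcens and string modules support estimation. In the simulation below, we used the SAS procedures.</p>
<p>When analyzing interval-censored data with any software program, the dataset needs two variables corresponding to the lower and upper endpoints (<i>L</i>, <i>U</i>) described above. For example, if the event is found on imaging at 6 months and no event was found at the previous visit at 3 months, then you give the software program the endpoints (3, 6).</p>
<p>We conducted a small simulation to show the potential bias in hazard ratios and survival percentages estimated from a Cox model under different approaches to handling interval-censoring. We simulated datasets based on the model and parameter estimates reported in a randomized clinical trial of bladder cancer recurrence (Southwest Oncology Group [SWOG] S0337 [<span>5</span>]). Patients with low-grade non-muscle-invasive bladder cancer typically undergo transurethral resection of bladder tumor (TURBT). In the referenced trial, the intervention group received intravesical instillation of gemcitabine immediately after TURBT, and the control group received saline. The 4-year recurrence rate was significantly lower in the gemcitabine group (35%) than in the control group (47%). Patients underwent cystoscopy and cytology every 3 months for the first 2 years and every 6 months thereafter, resulting in interval-censored recurrence times. For simplicity, we simulated time-to-recurrence data from a piecewise exponential distribution, calibrated to approximately match the 1-, 2-, 3-, and 4-year recurrence rates, with a hazard ratio of 0.68 between the two groups, consistent with the trial results. Additional simulation details are provided in the caption of Table 1.</p>
<p>The simulation results in Table 1 show that interval-censored Cox regression yields unbiased estimates of all quantities. However, using the upper endpoint of the interval as the event time can overestimate the time to recurrence and correspondingly underestimate the percentage of patients with recurrence by a given time. This is especially true when estimating the percentage with recurrence at earlier time points (1 year). The underestimation worsens as the interval between visits increases. Although unrealistic in practice, if follow-up visits were scheduled only annually, the 1-year recurrence rate would be severely underestimated when the upper endpoint of the interval is used as the recurrence time (approximately 70% downward bias in both groups). Even with 6-month visit intervals (more realistic), the 1-year recurrence rate is substantially underestimated when the upper endpoint is used (approximately 25% downward bias in both groups).</p>
<p>Notably, the 4-year recurrence rates estimated by all methods are nearly unbiased. Similarly, the hazard ratio estimates are unbiased for all methods. This is expected, because the Cox model depends only on the ordering (ranks) of the event times, not on their exact values. Hazard ratio estimates from the Cox model are therefore less sensitive to use of the upper endpoint than survival percentage estimates, which also require estimating the baseline survival function on the correct time scale. Thus, if the goal is to estimate the hazard ratio, using the upper endpoint in a Cox model appears to introduce little bias. In addition, estimates at later time points (e.g., year 4), after most of the follow-up data have been collected, also show minimal bias.</p>
<p>Because of the wide range of possible interval-censoring configurations (e.g., the time between follow-up visits) and of underlying survival distributions generating the data, it is difficult to draw definitive conclusions from a small simulation study. We can offer only general recommendations. Based on our simulation, the interval-censored Cox model performed best. Although using the upper endpoint of the visit interval is acceptable when follow-up visits are frequent, it can introduce bias when the intervals between visits are long, particularly at earlier time points. Given that interval-censoring methods are widely implemented in statistical software, we recommend using them, at least as a sensitivity analysis to assess the impact of a simplified approach (e.g., using the upper endpoint of the interval), especially if early failures are of interest. This simulation focuses on estimation bias rather than on determining the optimal follow-up schedule, which should be guided by recurrence risk and disease severity.</p>
<p>Tanujit Dey: performed the statistical data analysis; developed the research concept and focus; led manuscript editing; directed the project and served as project co-lead. Stuart Lipsitz: performed the statistical data analysis; developed the research concept and focus; led manuscript editing; directed the project and served as project co-lead. Zara Cooper: participated in manuscript editing and provided consultation. Debajyoti Sinha: participated in manuscript editing and provided consultation. Quoc-Dien Trinh: participated in manuscript editing; provided consultation and guided the project. Alexander Cole: participated in manuscript editing; provided consultation and guided the project. Timothy N. Clinton: participated in manuscript editing; provided consultation and guided the direction of the project. All authors read and approved the final manuscript.</p>
<p>The authors declare that they have no conflicts of interest related to this work.</p>
<p>The authors declare that they received no funding related to this work.</p>
<p>This study was conducted entirely with simulated data; no patient data were used, generated, or analyzed.</p>
<p>The simulated datasets can be reproduced following the methods described in the text. The simulation code used to generate the data is available from the corresponding author upon reasonable request.</p>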
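<p>As an illustration only, the likelihood contributions and the (<i>L</i>, <i>U</i>) data layout described in the text can be sketched in a few lines of Python. This is not the authors' SAS code. It assumes a simple exponential survival model with a single rate parameter (a stand-in for the proportional hazards and piecewise exponential models in the paper), and the function names, toy data, and grid search are hypothetical choices made for this sketch.</p>

```python
import math

def survival(t, rate):
    """S(t) = exp(-rate * t): exponential survival (illustrative assumption)."""
    return math.exp(-rate * t)

def likelihood_contribution(lower, upper, rate):
    """Likelihood contribution of one patient coded as (lower, upper).

    lower == upper      -> exact failure time: density f(t) = rate * S(t)
    upper == math.inf   -> right-censored at lower: S(lower)
    otherwise           -> interval-censored: S(lower) - S(upper)
    """
    if lower == upper:
        return rate * survival(lower, rate)
    if math.isinf(upper):
        return survival(lower, rate)
    return survival(lower, rate) - survival(upper, rate)

def log_likelihood(intervals, rate):
    """Sum of log contributions over all patients."""
    return sum(math.log(likelihood_contribution(l, u, rate)) for l, u in intervals)

# Hypothetical toy data, times in months.
patients = [
    (3.0, 6.0),        # event found at 6 months, none at 3 months
    (0.0, 3.0),        # event found at the first visit
    (12.0, math.inf),  # still event-free at the last visit (right-censored)
    (9.0, 9.0),        # exact event time known
]

# Crude grid search for the maximum likelihood estimate of the rate.
grid = [k / 1000 for k in range(1, 501)]
best_rate = max(grid, key=lambda r: log_likelihood(patients, r))
```

<p>The patient coded (3, 6) contributes S(3) − S(6), the probability of failing between the two visits, rather than the density at 6 months that would be used if the visit date were treated as the exact event time. A real analysis would rely on the dedicated procedures named in the text rather than a grid search.</p>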
<p>A practical overview and statistical analysis of interval-censored data in cancer. Cancer Communications. 2025;45(12):1666-1669. Published 2025-11-02. doi:10.1002/cac2.70073. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728475/pdf/</p>