Validation of an Updated Algorithm to Identify Patients With Incident Non-Small Cell Lung Cancer in Administrative Claims Databases.

JCO Clinical Cancer Informatics (IF 2.8, Q2 Oncology), vol. 8, e2300165. Pub Date: 2024-03-01. DOI: 10.1200/CCI.23.00165

Sandip Pravin Patel, Rongrong Wang, Summera Qiheng Zhou, Daniel Sheinson, Ann Johnson, Janet Shin Lee



Abstract

Purpose: Real-world lung cancer data in administrative claims databases often lack staging information and specific diagnostic codes for lung cancer histology subtypes. This study updates and validates Turner's 2017 treatment-based algorithm using more recent claims and electronic health record (EHR) data.

Methods: This study used Optum's deidentified Market Clarity Data, which links medical and pharmacy claims with EHR data. Eligible patients had an incident lung cancer diagnosis (January 2014-December 2020) and at least one valid histology code for lung cancer from 30 days before to 60 days after diagnosis. Histology and stage information from the EHR were used to evaluate the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). We evaluated the Turner algorithm using cohort 1 patients diagnosed between June 2014 and October 2015 (step 1) and between November 2015 and December 2020, after approval of immunotherapies (step 2). Next, we evaluated cohort 2 patients diagnosed between November 2015 and December 2020 using an updated algorithm incorporating the latest US treatment guidelines (step 3), and compared the results for cohort 2 (Turner algorithm, step 2 patients). Furthermore, an algorithm to distinguish early non-small cell lung cancer (eNSCLC; stage I-III) from advanced/metastatic NSCLC (stage IV) was evaluated among patients with available histology and stage information.
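The eligibility window and the two diagnosis periods described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and it assumes that the window bounds are inclusive and that each month range runs from the first to the last calendar day, neither of which the abstract specifies.

```python
from datetime import date, timedelta
from typing import Iterable, Optional

# Assumption: both window bounds are inclusive.
WINDOW_BEFORE = timedelta(days=30)
WINDOW_AFTER = timedelta(days=60)


def has_histology_in_window(diagnosis: date, histology_dates: Iterable[date]) -> bool:
    """True if any histology code date falls 30 days before to 60 days after diagnosis."""
    start = diagnosis - WINDOW_BEFORE
    end = diagnosis + WINDOW_AFTER
    return any(start <= d <= end for d in histology_dates)


def diagnosis_period(diagnosis: date) -> Optional[str]:
    """Label the diagnosis date with the evaluation period it falls in (hypothetical labels)."""
    # June 2014 to October 2015: period used for step 1.
    if date(2014, 6, 1) <= diagnosis <= date(2015, 10, 31):
        return "pre_immunotherapy"
    # November 2015 to December 2020: period used for steps 2 and 3.
    if date(2015, 11, 1) <= diagnosis <= date(2020, 12, 31):
        return "post_immunotherapy"
    return None
```

A patient diagnosed on 2016-03-01 with a histology code dated 2016-04-30 would pass the window check under these assumptions and fall in the later period.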

Results: A total of 5,012 patients were included (cohort 1, step 1: n = 406; cohort 1, step 2: n = 2,573; cohort 2, step 3: n = 2,744). The updated algorithm showed improved performance relative to the previous Turner algorithm for sensitivity (0.920-0.932), specificity (0.865-0.923), PPV (0.976-0.988), and NPV (0.640-0.673). The eNSCLC algorithm showed high specificity (0.874) and relatively low sensitivity (0.539).
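For readers less familiar with the four validation metrics reported above, the following Python sketch shows how each is derived from a 2x2 table that compares the algorithm's classification with the EHR reference. The function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
from typing import Dict


def validation_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV, and NPV from 2x2 table counts.

    tp: algorithm says NSCLC, EHR confirms NSCLC
    fp: algorithm says NSCLC, EHR says not NSCLC
    fn: algorithm says not NSCLC, EHR confirms NSCLC
    tn: algorithm says not NSCLC, EHR says not NSCLC
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("each row and column of the 2x2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # share of true NSCLC cases that were found
        "specificity": tn / (tn + fp),  # share of non-NSCLC cases correctly excluded
        "ppv": tp / (tp + fp),          # share of algorithm positives that are correct
        "npv": tn / (tn + fn),          # share of algorithm negatives that are correct
    }
```

With illustrative counts `validation_metrics(tp=90, fp=5, fn=10, tn=45)`, sensitivity and specificity are both 0.9, PPV is 90/95, and NPV is 45/55. PPV and NPV depend on how common NSCLC is in the evaluated cohort, which is one general reason NPV can be noticeably lower than sensitivity when most patients truly have NSCLC.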

Conclusion: An updated treatment-based algorithm identifying patients with incident NSCLC was validated using EHR data and distinguished lung cancer subtypes in claims databases when EHR data were not available.
