Nicole Princic, Donna McMorrow, Philip Chan, Lisa Hess
Clinical Sarcoma Research, volume 10, page 8. Journal Article. Published 2020-05-05. DOI: https://doi.org/10.1186/s13569-020-00130-y
Citations: 1
Abstract
Evaluation of the accuracy of algorithms to identify soft tissue sarcoma (STS) in administrative claims.
Background: Failure to use a validated algorithm to select patients is a source of selection bias in oncology studies using administrative claims. The objective of this study was to evaluate published algorithms to identify patients with soft tissue sarcoma (STS) in administrative claims and to evaluate new algorithms to improve performance.
Methods: Two cancer populations, STS cases and non-STS controls, were selected from the MarketScan Explorys Linked Claims-Electronic Medical Record (EMR) Database between January 1, 2000 and July 31, 2018. Eligible cases had a diagnosis of STS on a clinical record in the EMR, while controls had no evidence of STS on any EMR record. Both cases and controls were enrolled in administrative claims during a period of observation and were aged ≥ 18 years. A split sample was used to test and validate algorithms using data from administrative claims. Values for sensitivity, specificity, and positive predictive value (PPV) were calculated for 14 algorithms. Prior literature validating algorithms in administrative claims across other cancer types reports both sensitivity and specificity ranging from as low as 73% to as high as 95%. This range was used as a benchmark for defining algorithm success.
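The three performance measures named in the Methods follow from a 2x2 comparison of the claims-based algorithm against the EMR reference standard. The sketch below is illustrative only: the function name, the `Metrics` container, and the example counts are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    sensitivity: float  # share of true cases the algorithm flags
    specificity: float  # share of true controls the algorithm leaves unflagged
    ppv: float          # share of flagged patients who are true cases


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Metrics:
    """Compute sensitivity, specificity, and PPV from confusion-matrix counts.

    tp/fn: EMR-confirmed cases flagged / not flagged by the algorithm.
    fp/tn: EMR controls flagged / not flagged by the algorithm.
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return Metrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
    )


# Hypothetical counts, chosen only to exercise the formulas.
example = classification_metrics(tp=80, fp=30, tn=270, fn=20)
print(example)
```

Because PPV depends on how many controls are flagged in absolute terms, it can be low even when specificity looks acceptable if controls greatly outnumber cases.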
Results: There were 784 STS cases and 249,062 non-STS cancer controls eligible for analysis. Requiring at least two claims with an ICD-CM diagnosis code for STS achieved a sensitivity of 67% and a specificity of 72%. Algorithms that required NCCN-recommended systemic treatment for STS improved the specificity to over 90% but dropped the sensitivity to below 20%. Other combinations of diagnostic tests, symptoms, and procedures did not improve performance.
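A rule of the form "at least two claims with an STS diagnosis code" can be sketched as follows. This is a minimal illustration, not the study's implementation: the claim tuple layout, the function name, and the code prefixes `171` (ICD-9-CM) and `C49` (ICD-10-CM) are assumptions made for the example, and the abstract does not state which codes the authors used.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable

# Assumed, illustrative code prefixes; the abstract does not list the study's codes.
EXAMPLE_STS_PREFIXES = ("171", "C49")


def flag_patients(
    claims: Iterable[tuple[str, str, str]],
    code_prefixes: tuple[str, ...] = EXAMPLE_STS_PREFIXES,
    min_claims: int = 2,
) -> set[str]:
    """Return patient IDs with at least `min_claims` distinct matching claims.

    Each claim is a hypothetical (patient_id, claim_id, diagnosis_code) tuple.
    """
    matching: dict[str, set[str]] = defaultdict(set)
    for patient_id, claim_id, code in claims:
        normalized = code.replace(".", "").upper()
        if normalized.startswith(code_prefixes):
            # A set of claim IDs keeps repeated rows of one claim from counting twice.
            matching[patient_id].add(claim_id)
    return {pid for pid, ids in matching.items() if len(ids) >= min_claims}


# Hypothetical claims, for illustration only.
sample_claims = [
    ("p1", "c1", "C49.2"),
    ("p1", "c2", "C49.2"),
    ("p2", "c3", "C49.9"),
    ("p2", "c3", "C49.9"),  # duplicate row of the same claim
    ("p3", "c4", "C50.9"),
    ("p3", "c5", "C50.1"),
]
flagged = flag_patients(sample_claims)
print(flagged)
```

Raising `min_claims` or adding a treatment requirement makes the rule stricter, which is the direction of the sensitivity/specificity trade-off reported in the Results.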
Conclusions: The algorithms tested in this study sample did not achieve sufficient performance, suggesting that accurately identifying the STS population in administrative data is problematic. Difficulties are likely due to the origin of STS in a variety of locations, the non-specific symptoms of STS, and the common diagnostic tests recommended to diagnose the disease. Future research applying machine learning to examine the timing and patterns of variables that comprise the diagnostic process may help determine whether STS cases can be accurately identified in claims databases.
About the journal:
Clinical Sarcoma Research considers for publication articles related to research on sarcomas, including both soft tissue and bone. The journal publishes original articles and review articles on the diagnosis and treatment of sarcomas along with new insights in sarcoma research, which may be of immediate or future interest for diagnosis and treatment. The journal also considers negative results, especially those from studies on new agents, as it is vital for the medical community to learn whether new agents have been proven effective or ineffective within subtypes of sarcomas. The journal also aims to offer a forum for active discussion on topics of major interest for the sarcoma community, which may be related to both research results and methodological topics.