Accuracy of Asthma Computable Phenotypes to Identify Pediatric Asthma at an Academic Institution.

Methods of Information in Medicine (IF 1.3; Zone 4, Medicine; JCR Q3, Computer Science, Information Systems). Pub Date: 2020-12-01. Epub Date: 2021-07-14. DOI: 10.1055/s-0041-1729951

Mindy K Ross, Henry Zheng, Bing Zhu, Ailina Lao, Hyejin Hong, Alamelu Natesan, Melina Radparvar, Alex A T Bui

Methods of Information in Medicine 59(6): 219-226. Journal Article. PMC full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113735/pdf/nihms-1774084.pdf

Citations: 0

Abstract

Objectives: Asthma is a heterogeneous condition with significant diagnostic complexity, including variations in symptoms and temporal criteria. The disease can be difficult for clinicians to diagnose accurately. Properly identifying asthma patients from the electronic health record is consequently challenging, because current algorithms (computable phenotypes) rely on diagnostic codes (e.g., International Classification of Diseases, ICD) in addition to other criteria (e.g., inhaler medications) but presume an accurate diagnosis. As such, there is no universally accepted or rigorously tested computable phenotype for asthma.

Methods: We compared two established asthma computable phenotypes: the Chicago Area Patient-Outcomes Research Network (CAPriCORN) and Phenotype KnowledgeBase (PheKB). We established a large-scale, consensus gold standard (n = 1,365) from the University of California, Los Angeles Health System's clinical data warehouse for patients 5 to 17 years old. Algorithm results were manually reviewed, and predictive performance (positive predictive value [PPV], sensitivity/specificity, F1-score) was determined. We then examined the classification errors to gain insight for future algorithm optimizations.
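The performance measures named in the Methods can all be derived from a 2x2 comparison of algorithm output against the gold standard. The Python sketch below is illustrative only: the class name, function name, and example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a phenotype algorithm with a gold standard."""
    tp: int  # algorithm says asthma, gold standard says asthma
    fp: int  # algorithm says asthma, gold standard says no asthma
    fn: int  # algorithm says no asthma, gold standard says asthma
    tn: int  # algorithm says no asthma, gold standard says no asthma


def phenotype_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return PPV, sensitivity, specificity, and F1 for the given counts.

    Assumes every denominator is non-zero, i.e. the algorithm flagged at
    least one patient and the gold standard contains both classes.
    """
    ppv = c.tp / (c.tp + c.fp)            # positive predictive value (precision)
    sensitivity = c.tp / (c.tp + c.fn)    # recall
    specificity = c.tn / (c.tn + c.fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration, not study data.
    example = ConfusionCounts(tp=80, fp=20, fn=20, tn=80)
    for name, value in phenotype_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because F1 depends only on PPV and sensitivity, it ignores true negatives; specificity is the measure that captures how well non-asthma patients are excluded.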

Results: Applied to our final cohort of 1,365 expert-defined gold standard patients, the CAPriCORN algorithm achieved a balanced PPV of 95.8% (95% CI: 94.4-97.2%), a sensitivity of 85.7% (95% CI: 83.9-87.5%), and a harmonized F1 of 90.4% (95% CI: 89.2-91.7%). The PheKB algorithm achieved a balanced PPV of 83.1% (95% CI: 80.5-85.7%), a sensitivity of 69.4% (95% CI: 66.3-72.5%), and an F1 of 75.4% (95% CI: 73.1-77.8%). We identified four categories of errors, related to method limitations, disease definition, human error, and design implementation.
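The abstract does not state how the 95% confidence intervals were obtained. One common choice for a proportion such as PPV or sensitivity is the normal-approximation (Wald) interval. The sketch below assumes that method purely for illustration; the function name and the example counts are hypothetical and are not study data.

```python
import math


def wald_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion.

    z = 1.96 corresponds to a two-sided 95% interval. The bounds are
    clipped to [0, 1] because the approximation can overshoot for
    proportions near 0 or 1.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Hypothetical example: 90 confirmed cases among 100 algorithm-positive patients.
    low, high = wald_interval(90, 100)
    print(f"PPV 95% CI: {low:.3f} to {high:.3f}")
```

The Wald interval is symmetric around the point estimate, which makes it easy to read but less reliable for small samples or proportions close to 0 or 1; a Wilson or exact interval is often preferred in those cases.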

Conclusion: When applied to pediatric data, the CAPriCORN and PheKB algorithms performed below previously reported levels (previously reported PPV = 97.7% and 96%, respectively). There is room to improve the performance of current methods, including targeted use of natural language processing and clinical feature engineering.




Source journal

Methods of Information in Medicine (Medicine; Computer Science, Information Systems)

CiteScore: 3.70

Self-citation rate: 11.80%

Review time: 6-12 weeks

Journal description: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal issue.