Using SNOMED to automate clinical concept mapping

Proceedings of the ACM Conference on Health, Inference, and Learning. Pub Date: 2020-04-02. DOI: 10.1145/3368555.3384453

Shaun Gupta, Frederik Dieleman, P. Long, O. Doyle, N. Leavitt


Citations: 1

Abstract

The International Classification of Diseases (ICD) is a widely used diagnostic ontology for the classification of health disorders and a valuable resource for healthcare analytics. However, ICD is an evolving ontology subject to periodic revisions (e.g. ICD-9-CM to ICD-10-CM), which results in the absence of complete cross-walks between versions. While clinical experts can create custom mappings across ICD versions, this process is both time-consuming and costly. We propose an automated solution that facilitates interoperability without sacrificing accuracy. Our solution leverages the SNOMED-CT ontology, in which medical concepts are organised in a directed acyclic graph. We use this graph to map ICD-9-CM to ICD-10-CM by associating codes with clinical concepts in the SNOMED graph, using a nearest-neighbors search in combination with natural language processing. To assess the impact of our method, we compared the performance of a gradient boosted tree (XGBoost), developed to classify patients with Exocrine Pancreatic Insufficiency (EPI), when using features constructed by our solution versus clinically driven methods. The dataset comprised 23,204 EPI patients and 277,324 non-EPI patients, with data spanning October 2011 to April 2017. Our algorithm generated clinical predictors whose stability across the ICD-9-CM to ICD-10-CM transition point was comparable to that of ICD-9-CM/ICD-10-CM mappings generated by clinical experts. Preliminary modeling results showed highly similar performance for models based on the SNOMED mapping and on the clinically defined mapping (71% precision at 20% recall for both models). Overall, the framework does not compromise on accuracy at the individual code level or at the model level, while obviating the need for time-consuming manual mapping.
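The abstract does not spell out the mapping algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a simple text-similarity step (token Jaccard overlap, standing in for the natural language processing component) attaches each code description to a concept, and a breadth-first search over the concept graph picks the nearest target code. The graph contents, the placeholder codes `TGT-1` and `TGT-2`, and every function name are hypothetical illustrations, not real SNOMED-CT or ICD content and not the authors' implementation.

```python
from collections import deque

# Toy "is-a" graph: child concept -> parent concepts.
# Hypothetical content for illustration only, not real SNOMED-CT data.
IS_A = {
    "disorder of digestive system": [],
    "disorder of pancreas": ["disorder of digestive system"],
    "disorder of stomach": ["disorder of digestive system"],
    "exocrine pancreatic insufficiency": ["disorder of pancreas"],
    "chronic pancreatitis": ["disorder of pancreas"],
}


def tokens(text):
    """Lowercased whitespace tokens of a description."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity between two descriptions (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def closest_concept(description):
    """Assumed NLP step: attach a code description to the best-matching concept label."""
    return max(IS_A, key=lambda concept: jaccard(description, concept))


def undirected_neighbors():
    """View the DAG as an undirected graph so siblings are reachable via a shared parent."""
    adj = {concept: set() for concept in IS_A}
    for child, parents in IS_A.items():
        for parent in parents:
            adj[child].add(parent)
            adj[parent].add(child)
    return adj


def graph_distance(src, dst, adj):
    """Number of edges on the shortest path between two concepts (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")


def map_code(source_description, target_codes):
    """Return the target code whose concept lies nearest to the source code's concept."""
    adj = undirected_neighbors()
    src_concept = closest_concept(source_description)
    return min(
        target_codes,
        key=lambda code: graph_distance(
            src_concept, closest_concept(target_codes[code]), adj
        ),
    )


# Hypothetical target-version codes and descriptions.
targets = {
    "TGT-1": "other chronic pancreatitis",
    "TGT-2": "disorder of stomach unspecified",
}
print(map_code("chronic pancreatitis", targets))  # TGT-1
```

In this toy setup the source description lands on the "chronic pancreatitis" concept, `TGT-1` resolves to the same concept (distance 0), and `TGT-2` resolves to "disorder of stomach" (distance 3), so `TGT-1` is chosen. A real system would need the licensed SNOMED-CT release, a stronger text-matching model, and a rule for breaking ties between equally distant codes.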


