Using Machine Learning to Capture Quality Metrics from Natural Language: A Case Study of Diabetic Eye Exams.

Impact factor: 1.3 | Zone 4 (Medicine) | JCR Q3 (Computer Science, Information Systems) | Methods of Information in Medicine | Pub Date: 2021-09-01 | Epub Date: 2021-10-01 | DOI: 10.1055/s-0041-1736311

Allan Fong, Nicholas Scoulios, H Joseph Blumenthal, Ryan E Anderson

Methods of Information in Medicine, volume 60 (3-04), pages 110-115. Journal Article.


Abstract

Background and objective: The prevalence of value-based payment models has led to increased use of the electronic health record to capture quality measures, which imposes additional documentation requirements on providers.

Methods: This case study uses text mining and natural language processing techniques to identify the timely completion of diabetic eye exams (DEEs) from 26,203 unique clinician notes for reporting as an electronic clinical quality measure (eCQM). Logistic regression and support vector machine (SVM) models, trained on unbalanced datasets and on datasets balanced with the synthetic minority over-sampling technique (SMOTE), were evaluated on precision, recall (sensitivity), specificity, and F1-score for classifying records as positive for DEE. We then integrated a high-precision DEE model to evaluate free-text clinical narratives from our clinical EHR system.
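The balancing step mentioned in the Methods can be illustrated with a minimal sketch of the idea behind SMOTE: each synthetic minority sample is placed on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote_sketch`, the parameter values, and the two-dimensional feature vectors are all hypothetical, and a real text pipeline would operate on high-dimensional text features using a maintained library.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical 2-D feature vectors for the minority (DEE-positive) class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), new_points[0])
```

Because every synthetic point is an interpolation of two real minority samples, the oversampled class stays inside the region already covered by the minority data, which is one reason a model can overfit to it.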

Results: Logistic regression and SVM models had comparable F1-score and specificity; models trained and validated without oversampling favored precision over recall. SVM with and without oversampling resulted in the best precision, 0.96, and recall, 0.85, respectively. These two SVM models were applied to the 31,585 unannotated text segments representing 24,823 unique records and 13,714 unique patients. The number of records classified as positive for DEE by the SVM models ranged from 667 to 8,935 (2.7% to 36% of 24,823, respectively). Unique patients classified as positive for DEE ranged from 3.5% to 41.8%, highlighting the potential utility of these models.
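For readers less familiar with the metrics named in the Results, the following sketch shows how they are derived from confusion-matrix counts and reproduces the record-level percentages quoted above. The confusion-matrix counts are hypothetical and are not taken from the paper; only 667, 8,935, and 24,823 come from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)       # share of predicted positives that are correct
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)     # share of true negatives correctly rejected
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts chosen only to illustrate the formulas.
metrics = classification_metrics(tp=85, fp=5, tn=895, fn=15)
print({name: round(value, 3) for name, value in metrics.items()})

# Record-level shares reported in the abstract.
total_records = 24_823
low_share = 100 * 667 / total_records
high_share = 100 * 8_935 / total_records
print(round(low_share, 1), round(high_share, 1))
```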

Discussion: We attribute the effect of oversampling on SVM performance to possible overfitting of the SVM SMOTE model to the synthesized data and to the data synthesis process itself. However, the specificities of SVM with and without SMOTE were comparable, suggesting both models were confident in their negative predictions. By choosing to implement the SVM model that favors precision over recall (sensitivity) for categorizing DEEs, we can provide a highly reliable pool of results that can be documented through automation, reducing the burden of secondary review. Although this work focused on completed DEEs, the method could be applied to completing other necessary documentation by extracting information from natural language in clinician notes.
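The precision-first trade-off described in the Discussion can be sketched as a simple triage rule. Note that this example substitutes a different mechanism from the one in the abstract: the authors select the higher-precision SVM model, whereas the sketch applies a score threshold to a single model's outputs to show the same trade-off. The function `triage`, the record identifiers, the scores, and the threshold of 0.9 are all hypothetical.

```python
def triage(scored_records, threshold):
    """Split records into those documented automatically and those left
    for the existing manual workflow, based on a classifier score."""
    auto_documented, manual_workflow = [], []
    for record_id, score in scored_records:
        if score >= threshold:
            auto_documented.append(record_id)
        else:
            manual_workflow.append(record_id)
    return auto_documented, manual_workflow


# Hypothetical (record id, positive-class score) pairs.
scored = [("note-1", 0.97), ("note-2", 0.55), ("note-3", 0.91), ("note-4", 0.12)]
auto, manual = triage(scored, threshold=0.9)
print(auto, manual)
```

Raising the threshold shrinks the automatically documented pool but makes it more reliable, which mirrors the choice of precision over recall.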

Conclusion: By enabling the capture of data for eCQMs from documentation generated by usual clinical practice, this work represents a case study in how such techniques can be leveraged to drive quality without increasing clinician work.




Source journal

Methods of Information in Medicine (Medicine; Computer Science: Information Systems)

CiteScore: 3.70

Self-citation rate: 11.80%

Review time: 6-12 weeks

Journal description: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of an issue.