An EHR Data Quality Evaluation Approach Based on Medical Knowledge and Text Matching

IRBM · Engineering, Biomedical · IF 5.6 · JCR Q1 · Zone 4 (Medicine) · Pub Date: 2023-10-01 · DOI: 10.1016/j.irbm.2023.100782
Nanya Chen, Jiangtao Ren
IRBM, volume 44, issue 5, Article 100782. Publication type: Journal Article. URL: https://www.sciencedirect.com/science/article/pii/S1959031823000313

Abstract

Introduction: Medical artificial intelligence based on Electronic Health Records (EHRs) has recently become a significant research field, and EHR data are widely used in clinical decision support systems and medical diagnosis systems. However, because EHRs are meant to record patients' disease information and are not primarily designed for research and discovery, their utility for research is hindered by data quality problems. Evaluating the data quality of EHRs before they are used in medical artificial intelligence is therefore a meaningful and challenging task. Most current EHR data quality evaluation methods rely on conventional evaluation indicators and rarely incorporate clinical evidence.

Materials and methods: We propose an EHR data quality evaluation approach based on clinical evidence and a deep text matching model. First, drawing on medical knowledge of a particular disease, we establish a list of standard clinical evidence descriptions, including typical symptoms and special signs. Then we use the text matching model to find the relevant clinical evidence in each EHR, and finally we evaluate the quality of the EHR based on the quantity and quality of the relevant clinical evidence found.
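
The three steps described above (evidence list, matching, scoring) can be illustrated with a short Python sketch. It is not the authors' implementation: the evidence list, the function names (`similarity`, `match_evidence`, `quality_score`), the threshold, and the scoring formula are all assumptions made for illustration. The standard-library `difflib.SequenceMatcher` stands in for the deep text matching model, which the abstract does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical evidence list for illustration only; not taken from the paper.
STANDARD_EVIDENCE = [
    "productive cough",
    "fever above 38 degrees",
    "shortness of breath",
]


def similarity(a: str, b: str) -> float:
    """Stand-in matcher: character-level ratio in [0, 1], not a deep model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_evidence(ehr_sentences, evidence_list, threshold=0.6):
    """For each evidence item, keep its best-matching EHR sentence
    if the match score reaches the (assumed) threshold."""
    matches = {}
    for evidence in evidence_list:
        best_score, best_sentence = 0.0, None
        for sentence in ehr_sentences:
            score = similarity(evidence, sentence)
            if score > best_score:
                best_score, best_sentence = score, sentence
        if best_sentence is not None and best_score >= threshold:
            matches[evidence] = (best_sentence, best_score)
    return matches


def quality_score(matches, evidence_list):
    """Assumed scoring rule: coverage (quantity of evidence found)
    multiplied by the mean match score (quality of evidence found)."""
    if not matches:
        return 0.0
    coverage = len(matches) / len(evidence_list)
    mean_score = sum(score for _, score in matches.values()) / len(matches)
    return coverage * mean_score


if __name__ == "__main__":
    ehr = ["Productive cough", "Shortness of breath", "Patient sleeps well"]
    found = match_evidence(ehr, STANDARD_EVIDENCE)
    print(found)
    print(quality_score(found, STANDARD_EVIDENCE))
```

Multiplying coverage by mean match score is only one way to combine "quantity and quality"; a weighted sum or per-evidence weights would fit the description equally well.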

Results: Experiments on more than 1,000 EHRs covering two types of diseases show that our approach can effectively distinguish high-quality EHRs from low-quality ones, and that the high-quality EHRs it identifies generally contain sufficient and consistent information related to disease diagnosis.

Conclusions: Experimental results on a real-world dataset demonstrate the effectiveness of our EHR data quality evaluation approach based on medical knowledge and text matching.

Source journal: IRBM (Engineering, Biomedical)
CiteScore: 10.30
Self-citation rate: 4.20%
Number of articles published: 81
Review time: 57 days
About the journal: IRBM is the journal of the AGBM (Alliance for engineering in Biology and Medicine / Alliance pour le génie biologique et médical), the SFGBM (BioMedical Engineering French Society / Société française de génie biologique médical), and the AFIB (French Association of Biomedical Engineers / Association française des ingénieurs biomédicaux). As a vehicle of information and knowledge in the field of biomedical technologies, IRBM is devoted to fundamental as well as clinical research. Biomedical engineering and the use of new technologies are the cornerstones of IRBM, providing authors and users with the latest information. Its six issues per year offer reviews (state of the art and current knowledge), original articles directed at fundamental research, and articles focusing on biomedical engineering. All articles are submitted to peer reviewers, who act as guarantors of IRBM's scientific and medical content. The field covered by IRBM includes all disciplines of biomedical engineering. The papers published include those covering technological and methodological developments in:
- Physiological and Biological Signal processing (EEG, MEG, ECG…)
- Medical Image processing
- Biomechanics
- Biomaterials
- Medical Physics
- Biophysics
- Physiological and Biological Sensors
- Information technologies in healthcare
- Disability research
- Computational physiology
- …