The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges

Tabinda Sarwar, S. Seifollahi, Jeffrey A Chan, Xiuzhen Zhang, V. Aksakalli, I. Hudson, Karin M. Verspoor, L. Cavedon
ACM Computing Surveys (CSUR), pp. 1-40. Published 2022-01-18. DOI: https://doi.org/10.1145/3490234
Citations: 18

Abstract

The primary objective of implementing Electronic Health Records (EHRs) is to improve the management of patients' health-related information. However, these records have also been used extensively for the secondary purposes of clinical research and improving healthcare practice. EHRs provide a rich set of information that includes demographics, medical history, medications, laboratory test results, and diagnoses. Data mining and analytics techniques have extensively exploited EHR information to study patient cohorts for various clinical and research applications, such as phenotype extraction, precision medicine, intervention evaluation, and disease prediction, detection, and progression. However, the presence of diverse data types and their associated characteristics poses many challenges to the use of EHR data. In this article, we provide an overview of the information found in EHR systems and the characteristics of that information that could be utilized for secondary applications. We first discuss the different types of data stored in EHRs, followed by the data transformations necessary for data analysis and mining. We then discuss the data quality issues and characteristics of EHRs, along with the relevant methods used to address them. This survey also highlights the usage of various data types for different applications. Hence, this article can serve as a primer for researchers seeking to understand the use of EHRs for data mining and analytics purposes.