Assessing quality and agreement of structured data in automatic versus manual abstraction of the electronic health record for a clinical epidemiology study

J. G. Brazeal, A. Alekseyenko, Hong Li, M. Fugal, K. Kirchoff, Courtney H. Marsh, D. Lewin, Jennifer D. Wu, J. Obeid, Kristin Wallace
Journal: Research methods in medicine & health sciences, 2(1), pp. 168–178
DOI: 10.1177/26320843211061287
Published: 2021-09-01 (Journal Article)
Cited by: 2

Abstract

Objective: We evaluate data agreement between an electronic health record (EHR) sample abstracted by automated characterization and a standard abstracted by manual review.

Study Design and Setting: We obtain data for an epidemiology cohort study using standard manual abstraction of the EHR and automated identification of the same patients using a structured algorithm to query the EHR. Summary measures of agreement (e.g., Cohen's kappa) are reported for 12 variables commonly used in epidemiological studies.

Results: Best agreement between abstraction methods is observed for demographic characteristics such as age, sex, and race, and for positive history of disease. Poor agreement is found for missing data and negative history, suggesting potential impact for researchers using automated EHR characterization. EHR data quality depends upon providers, who may be influenced by both institutional and federal government documentation guidelines.

Conclusion: Discrepancies in automated EHR abstraction may decrease power and increase bias; therefore, caution is warranted when selecting variables from EHRs for an epidemiological study using an automated characterization approach. Validation of automated methods must also continue to advance in sophistication alongside other technologies, such as machine learning and natural language processing, to extract unstructured data from the EHR for application to EHR characterization in clinical epidemiology.
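The abstract names Cohen's kappa as the summary measure of agreement between the two abstraction methods. As a minimal illustrative sketch (not the authors' analysis code; the variable names and toy labels below are invented for illustration), kappa for one categorical variable abstracted by both methods can be computed from observed versus chance agreement:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance from
    each rater's marginal label frequencies.
    """
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the marginal frequency of each label.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: manual vs. automated abstraction of a
# binary history-of-disease flag for eight patients.
manual = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
auto   = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]
print(round(cohens_kappa(manual, auto), 3))  # prints 0.5
```

Here the two methods agree on 6 of 8 patients (p_o = 0.75) while chance alone predicts 0.5 agreement, giving kappa = 0.5; this correction for chance is why kappa, rather than raw percent agreement, is the conventional measure in validation studies like this one.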