Use of Sequential Hot-Deck Imputation for Missing Health Care Systems Data for Population Health Research.

Medical Care (IF 3.3; CAS Zone 2, Medicine; JCR Q1, Health Care Sciences & Services). Pub Date: 2024-05-01. Epub Date: 2024-03-28. DOI: 10.1097/MLR.0000000000001995
Ella A Chrenka, Steven P Dehmer, Michael V Maciosek, Inih J Essien, Bjorn C Westgard
Medical Care, pp. 319-325 (Journal Article). Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10997447/pdf/

Abstract

Electronic medical record (EMR) data present many opportunities for population health research. The use of EMR data for population risk models can be impeded by the high proportion of missingness in key patient variables. Common approaches like complete case analysis and multiple imputation may not be appropriate for some population health initiatives that require a single, complete analytic data set. In this study, we demonstrate a sequential hot-deck imputation (HDI) procedure to address missingness in a set of cardiometabolic measures in an EMR data set. We assessed the performance of sequential HDI within the individual variables and a commonly used composite risk score. A data set of cardiometabolic measures based on EMR data from 2 large urban hospitals was used to create a benchmark data set with simulated missingness. Sequential HDI was applied, and the resulting data were used to calculate atherosclerotic cardiovascular disease risk scores. The performance of the imputation approach was assessed using a set of metrics to evaluate the distribution and validity of the imputed data. Of the 567,841 patients, 65% had at least 1 missing cardiometabolic measure. Sequential HDI resulted in the distribution of variables and risk scores that reflected those in the simulated data while retaining correlation. When stratified by age and sex, risk scores were plausible and captured patterns expected in the general population. The use of sequential HDI was shown to be a suitable approach to multivariate missingness in EMR data. Sequential HDI could benefit population health research by providing a straightforward, computationally nonintensive approach to missing EMR data that results in a single analytic data set.
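
The abstract does not spell out the mechanics of the procedure, so the following is only a minimal sketch of the general idea behind sequential hot-deck imputation, not the authors' implementation. It assumes that each variable is imputed in turn, that a missing value is filled with the observed value of a randomly drawn donor from the same imputation class, and that classes for later variables may use variables completed earlier. The function name `sequential_hot_deck`, the `plan` structure, the class variables, the fallback to all donors, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict


def sequential_hot_deck(records, plan, seed=0):
    """Fill missing (None) values one variable at a time.

    `plan` is an ordered list of (variable, class_keys) pairs. For each
    variable, a recipient receives the observed value of a randomly chosen
    donor that matches it on `class_keys`. Because variables are processed
    in order, class_keys for a later variable may include variables that
    were completed in an earlier step.
    """
    rng = random.Random(seed)
    data = [dict(r) for r in records]  # work on copies; input is untouched

    for var, class_keys in plan:
        # Donor pools hold only values that are non-missing at this step.
        pools = defaultdict(list)
        all_donors = []
        for row in data:
            if row[var] is not None:
                pools[tuple(row[k] for k in class_keys)].append(row[var])
                all_donors.append(row[var])

        if not all_donors:
            continue  # no observed values at all; leave the variable missing

        for row in data:
            if row[var] is None:
                key = tuple(row[k] for k in class_keys)
                # Assumed fallback: use every donor if the class has none.
                pool = pools.get(key) or all_donors
                row[var] = rng.choice(pool)

    return data


# Hypothetical toy records; values are illustrative, not from the study.
records = [
    {"age_group": "40-59", "sex": "F", "smoker": "no", "sbp": 118},
    {"age_group": "40-59", "sex": "F", "smoker": None, "sbp": None},
    {"age_group": "60-79", "sex": "M", "smoker": "yes", "sbp": 142},
    {"age_group": "60-79", "sex": "M", "smoker": "yes", "sbp": None},
]

# Impute smoking status first, then use it to define classes for blood pressure.
plan = [
    ("smoker", ["age_group", "sex"]),
    ("sbp", ["age_group", "sex", "smoker"]),
]

completed = sequential_hot_deck(records, plan)
```

Because every imputed value is copied from a real record, the completed data set contains only values that actually occur in the data, and the result is a single analytic data set rather than several imputed copies. The order of variables and the choice of class variables are design decisions that this sketch leaves to the analyst.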

Source journal: Medical Care (Medicine: Public, Environmental & Occupational Health)
CiteScore: 5.20
Self-citation rate: 3.30%
Articles published: 228
Review time: 3-8 weeks
Journal description: Rated as one of the top ten journals in healthcare administration, Medical Care is devoted to all aspects of the administration and delivery of healthcare. This scholarly journal publishes original, peer-reviewed papers documenting the most current developments in the rapidly changing field of healthcare. This timely journal reports on the findings of original investigations into issues related to the research, planning, organization, financing, provision, and evaluation of health services.