Conceptualizing bias in EHR data: A case study in performance disparities by demographic subgroups for a pediatric obesity incidence classifier.

PLOS digital health Pub Date : 2024-10-23 eCollection Date: 2024-10-01 DOI:10.1371/journal.pdig.0000642
Elizabeth A Campbell, Saurav Bose, Aaron J Masino
PLOS digital health, volume 3, issue 10, e0000642. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11498669/pdf/

Abstract

Electronic Health Records (EHRs) are increasingly used to develop machine learning models in predictive medicine. There has been limited research on using machine learning methods to predict childhood obesity, or on related disparities in classifier performance among vulnerable patient subpopulations. In this work, classification models are developed to recognize pediatric obesity using temporal condition patterns obtained from patient EHR data in a U.S. study population. We trained four machine learning algorithms (Logistic Regression, Random Forest, Gradient Boosted Trees, and Neural Networks) to classify cases and controls as obesity positive or negative, and optimized hyperparameter settings through a bootstrapping methodology. To assess the classifiers for bias, we studied model performance by population subgroup, then used permutation analysis to identify the most predictive features for each model and the demographic characteristics of patients with these features. Mean AUC-ROC values were consistent across classifiers, ranging from 0.72 to 0.80. Some evidence of bias was identified, although it took the form of the models performing better for minority subgroups (African Americans and patients enrolled in Medicaid). Permutation analysis revealed that patients from vulnerable population subgroups were over-represented among patients with the most predictive diagnostic patterns. We hypothesize that our models performed better on under-represented groups because the features more strongly associated with obesity were more commonly observed among minority patients. These findings highlight the complex ways that bias may arise in machine learning models, and they can inform future research toward a thorough analytical approach for identifying and mitigating bias that may arise from features and within EHR datasets when developing more equitable models.
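
The two bias checks named in the abstract, performance by subgroup and permutation analysis of features, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy records, the subgroup labels "A" and "B", and the stand-in scoring function are all hypothetical, and the permutation step is the generic "shuffle one feature and measure the drop in AUC-ROC" technique, which may differ in detail from the procedure used in the paper.

```python
import random
from collections import defaultdict


def auc_roc(labels, scores):
    """AUC-ROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when one class is absent
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Compute AUC-ROC separately within each demographic subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc_roc(ys, ss) for g, (ys, ss) in buckets.items()}


def permutation_importance(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in AUC-ROC when a single feature column is shuffled across patients."""
    rng = random.Random(seed)
    baseline = auc_roc(labels, [predict(r) for r in rows])
    drops = []
    for j in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            total += baseline - auc_roc(labels, [predict(r) for r in shuffled])
        drops.append(total / n_repeats)
    return drops


# Toy records (invented for illustration): two binary features per patient.
rows = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 1), (0, 0), (1, 0), (0, 1)]
labels = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]


def toy_predict(row):
    """Hypothetical stand-in for a trained classifier: scores on feature 0 only."""
    return float(row[0])


scores = [toy_predict(r) for r in rows]
subgroup_auc = auc_by_subgroup(labels, scores, groups)
importances = permutation_importance(toy_predict, rows, labels, n_features=2)
print(subgroup_auc)
print(importances)
```

On this toy data the stand-in model separates subgroup "A" perfectly (AUC-ROC 1.0) and is at chance for subgroup "B" (0.5), which is the kind of gap a subgroup breakdown is meant to surface. Feature 1 is ignored by the stand-in model, so shuffling it leaves the AUC-ROC unchanged and its importance is exactly 0.0.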
