Development and external validation of multimodal postoperative acute kidney injury risk machine learning models

JAMIA Open (IF 2.5; Q2, Health Care Sciences & Services) | Pub Date: 2023-12-01 | DOI: 10.1093/jamiaopen/ooad109
G. Karway, J. Koyner, John Caskey, Alexandra B Spicer, Kyle A. Carey, Emily R. Gilbert, D. Dligach, A. Mayampurath, Majid Afshar, M. Churpek

Abstract

Objectives: To develop and externally validate machine learning models that use structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings.

Materials and Methods: Data for adult postoperative admissions to Loyola University Medical Center (2009-2017) were used for model development, and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System (cTAKES). The primary outcome was the development of Kidney Disease: Improving Global Outcomes (KDIGO) stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data, and multimodal models combining structured data with CUI features. Models were compared using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences.

Results: The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) showed the highest discrimination (AUROC 0.82 [95% CI, 0.81-0.83]), exceeding the unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]).

Discussion: A multimodal approach combining structured data with TF-IDF-weighted CUIs increased model performance over structured data-only models.

Conclusion: These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.
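To make the "CUIs parameterized as TF-IDF" step concrete, the following is a minimal sketch of how per-admission bags of CUIs could be turned into TF-IDF features and concatenated with structured features. It is not the authors' pipeline: the function names `tfidf_cui_features` and `combine_modalities`, the placeholder CUI strings, the example structured values, and the particular smoothed IDF formula are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_cui_features(admission_cuis):
    """Return (vocab, rows) of TF-IDF weights, one row per admission.

    admission_cuis: list of lists of CUI strings (hypothetical input format).
    """
    n_docs = len(admission_cuis)

    # Document frequency: number of admissions in which each CUI appears.
    doc_freq = Counter()
    for cuis in admission_cuis:
        doc_freq.update(set(cuis))
    vocab = sorted(doc_freq)

    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    idf = {c: math.log((1 + n_docs) / (1 + doc_freq[c])) + 1.0 for c in vocab}

    rows = []
    for cuis in admission_cuis:
        counts = Counter(cuis)
        total = sum(counts.values())
        # Term frequency is the CUI's share of all CUIs in the admission.
        rows.append(
            [(counts[c] / total) * idf[c] if total else 0.0 for c in vocab]
        )
    return vocab, rows


def combine_modalities(structured_rows, cui_rows):
    """Concatenate structured features with CUI TF-IDF features row by row."""
    return [s + t for s, t in zip(structured_rows, cui_rows)]


# Placeholder CUIs and made-up structured values (e.g. age, creatinine).
admissions = [["C0000001", "C0000001", "C0000002"], ["C0000002"], []]
structured = [[58.0, 1.1], [71.0, 0.9], [44.0, 1.4]]

vocab, cui_rows = tfidf_cui_features(admissions)
multimodal = combine_modalities(structured, cui_rows)
```

Each row of `multimodal` could then be passed to any tabular classifier. The rarer CUI receives the larger IDF weight, which is the property that lets infrequent but informative concepts stand out.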
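The AUROC used to compare models can be read as the probability that a randomly chosen admission with AKI receives a higher predicted risk than a randomly chosen admission without it. The sketch below computes it from that rank-based definition, using midranks for tied scores. The function name `auroc` and the example labels and scores are illustrative assumptions, not data from the study, and DeLong's test for comparing two AUROCs is not implemented here.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney rank-sum identity, with midranks for ties.

    labels: iterable of 0/1 outcomes; scores: predicted risks (same length).
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, averaging ranks within groups of tied scores.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # Subtract the minimum possible rank sum, then normalise by pair count.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Made-up example: 2 admissions without AKI, 2 with AKI.
example_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this example, three of the four positive-negative pairs are ordered correctly, so `example_auc` is 0.75. With an outcome prevalence near 2%, as reported in the abstract, AUROC is insensitive to class imbalance, which is one reason it is commonly reported alongside other metrics.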