Cracking the “Sepsis” Code: Assessing Time Series Nature of EHR Data, and Using Deep Learning for Early Sepsis Prediction

Soodabeh Sarafrazi, R. Choudhari, Chiral Mehta, H. Mehta, Omid K. Japalaghi, Jie Han, Kinjal A Mehta, H. Han, P. Francis-Lyon
2019 Computing in Cardiology (CinC), pp. 1-4. Published 2019-09-01. DOI: 10.23919/CinC49843.2019.9005940 (https://doi.org/10.23919/CinC49843.2019.9005940)
Citations: 4

Abstract

Sepsis costs US hospitals more each year than any other health condition. Most patients who suffer from sepsis are not diagnosed at the time of admission. Early detection and antibiotic treatment are vital to improving outcomes for these patients, since each hour of delayed treatment is associated with increased mortality. In this study, our goal is to predict sepsis 12 hours before its diagnosis using vitals and blood tests routinely taken in the ICU. We investigated the performance of several machine learning algorithms, including XGBoost, CNN, CNN-LSTM, and CNN-XGBoost. Contrary to our expectations, XGBoost outperformed all of the sequential models and yielded the best hour-by-hour prediction, perhaps because the way we imputed missing values lost signal related to the time-series nature of the EHR data. We added feature engineering to detect change points in tests and vitals, which improved XGBoost by 5%. Our team, USF-Sepsis-Phys, achieved a utility score of 0.22 (untuned threshold) and an average AUC of 0.82 across the three reported test sets (A, B, C). As expected with this AUC, the same model with a tuned threshold (not run in the PhysioNet challenge) performed significantly better, as evaluated with 3-fold cross-validation on the entire PhysioNet training set.
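The abstract does not say how missing values were imputed or how change points were detected, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes forward-fill imputation of an hourly series and adds simple features that keep some of the timing information forward-filling would otherwise hide: whether a value was measured this hour, how many hours have passed since the last measurement, and the change between the two most recent measurements. The function name `change_features`, the feature names, and the sample heart-rate values are all hypothetical.

```python
from typing import Dict, List, Optional


def change_features(values: List[Optional[float]]) -> List[Dict[str, Optional[float]]]:
    """Build per-hour features for one vital or lab series.

    `values` holds one entry per hour; None means "not measured this hour".
    """
    rows = []
    last = None         # most recent observed value
    prev = None         # observed value before `last`
    hours_since = None  # hours elapsed since the most recent observation

    for v in values:
        if v is not None:
            # New measurement: shift the history and reset the clock.
            prev, last = last, v
            hours_since = 0
        elif hours_since is not None:
            # No measurement this hour: the forward-filled value gets staler.
            hours_since += 1

        # Change between the two most recent real measurements.
        # It is carried forward until the next measurement arrives.
        delta = None if last is None or prev is None else last - prev

        rows.append({
            "filled": last,                       # forward-filled value
            "measured": 1 if v is not None else 0,
            "hours_since": hours_since,
            "delta": delta,
        })
    return rows


# Hypothetical hourly heart-rate readings with gaps.
heart_rate = [None, 80.0, None, None, 95.0, 110.0]
features = change_features(heart_rate)
for hour, row in enumerate(features):
    print(hour, row)
```

Rows like these could be flattened into the per-hour feature vector given to a tree-based model such as XGBoost. Hours before the first measurement keep `None`, which the caller must handle explicitly or map to the model's missing-value representation.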