An Ensemble Machine Learning Model for the Early Detection of Sepsis from Clinical Data

Mengsha Fu, Jiabin Yuan, Menglin Lu, Pengfei Hong, M. Zeng
Published in: 2019 Computing in Cardiology Conference (CinC)
Publication date: 2019-12-30
DOI: https://doi.org/10.22489/cinc.2019.317
Citation count: 8

Abstract

Sepsis is a life-threatening disease with high mortality and high treatment costs. To improve patient outcomes, it is important to identify at-risk patients at an early stage. The PhysioNet/Computing in Cardiology Challenge 2019 focused on predicting sepsis six hours before clinical diagnosis, using the latest definition, Sepsis-3. Records from 40,336 ICU patients were provided as public training data, and a hidden test dataset was used for evaluation. We designed an ensemble model that combines boosting and bagging tree models (lightgbm, xgboost, and random forest) to predict sepsis from each patient's hourly records. Offline, we compared the evaluation metrics of the ensemble model and of each single model on a selected internal test set; the best performance was an AUC of 0.792 and an accuracy (ACC) of 0.727. Finally, the proposed model was evaluated on the full test sets and received an official utility score, as defined by the organizers, of 0.087, ranking 75/105 (team name: cinc sepsis pass). By comparison, the single lightgbm model received a utility score of only -0.036. The ensemble model used the preprocessed data and achieved better performance than a single tree-based model.
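The abstract does not say how the outputs of the three tree models were combined. As a minimal sketch, assuming simple probability averaging (soft voting), the comparison between a single model and the ensemble on an internal test set could look like the following. The function names, the 0.5 threshold, and the toy scores are all hypothetical and chosen only for illustration; they are not the authors' data or code, and the official challenge utility score is not reproduced here.

```python
from typing import List, Optional, Sequence


def soft_vote(model_probs: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-model sepsis probabilities (one value per hourly record)."""
    if not model_probs:
        raise ValueError("need at least one model")
    n = len(model_probs[0])
    if any(len(p) != n for p in model_probs):
        raise ValueError("all models must score the same records")
    if weights is None:
        weights = [1.0] * len(model_probs)  # assumption: equal weights
    total = float(sum(weights))
    return [
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total
        for i in range(n)
    ]


def accuracy(labels: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of records whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(y == yhat) for y, yhat in zip(labels, preds)) / len(labels)


def roc_auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up scores for four hourly records (NOT from the paper).
labels = [0, 0, 1, 1]
lightgbm_probs = [0.2, 0.6, 0.4, 0.9]
xgboost_probs = [0.1, 0.3, 0.7, 0.8]
rf_probs = [0.3, 0.3, 0.7, 0.7]

ensemble_probs = soft_vote([lightgbm_probs, xgboost_probs, rf_probs])

single_auc = roc_auc(labels, lightgbm_probs)
ensemble_auc = roc_auc(labels, ensemble_probs)
single_acc = accuracy(labels, lightgbm_probs)
ensemble_acc = accuracy(labels, ensemble_probs)
```

In this toy case the averaged scores rank both positive records above both negative ones, while the single model misorders one pair. This only illustrates why averaging can help; it says nothing about the size of the effect reported in the abstract.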