Feature Selection Strategy for Intrahospital Mortality Prediction after Coronary Artery Bypass Graft Surgery on an Unbalanced Sample

K. Shakhgeldyan, B. Geltser, V. Rublev, Basil Shirobokov, Dan Geltser, A. Kriger
Published in: Proceedings of the 4th International Conference on Computer Science and Application Engineering, 2020-10-20
DOI: 10.1145/3424978.3425090 (https://doi.org/10.1145/3424978.3425090)
Citations: 2

Abstract

The aim of the study is to develop models for predicting intrahospital mortality (IHM) after coronary artery bypass graft (CABG) surgery in an unbalanced sample of patients with coronary artery disease (CAD). Methods. Models for IHM prediction were built from an analysis of 866 electronic case histories of CAD patients revascularized by CABG. The patient cohort consisted of two groups: the first included 35 (4%) patients who died within the first 30 days after CABG, and the second included 831 (96%) patients with a favorable operation outcome. We analyzed 99 factors, including the results of clinical, laboratory and instrumental studies obtained before CABG. For feature selection, classical filter methods and model-based selection (wrapper) methods were used. The primary drawback of the classical approach was the unbalanced sample, as one group comprised only 4% of subjects. As a result, it was not possible to apply a cross-validation procedure with three types of samples, standard quality metrics, or multi-category factors. Results. A feature search approach was proposed that uses a multi-stage selection procedure combining the validation of predefined predictors, filter methods, and the development of multifactor models based on logistic regression, random forest (RF) and artificial neural networks (ANNs). Model accuracy was evaluated with a combined quality metric. The RF- and ANN-based models not only yielded more accurate forecasting tools but also helped verify five additional IHM predictors.
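
The abstract's point that standard quality metrics break down on this cohort can be checked with simple arithmetic. The sketch below uses only the group sizes stated above (35 deaths, 831 survivors) and a hypothetical baseline that predicts "survived" for every patient. The `Confusion` class and the use of balanced accuracy are illustrative assumptions; the abstract does not define its combined quality metric, and this is not a reconstruction of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts, with death as the positive class."""
    tp: int  # died, predicted died
    fn: int  # died, predicted survived
    tn: int  # survived, predicted survived
    fp: int  # survived, predicted died

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def sensitivity(self) -> float:
        # Share of actual deaths that the model flags.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of actual survivors that the model clears.
        return self.tn / (self.tn + self.fp)

    @property
    def balanced_accuracy(self) -> float:
        # Illustrative stand-in for a metric that weights both classes equally.
        return (self.sensitivity + self.specificity) / 2


# Group sizes taken from the abstract.
DEATHS, SURVIVORS = 35, 831

# Hypothetical baseline: always predicts "survived".
always_survive = Confusion(tp=0, fn=DEATHS, tn=SURVIVORS, fp=0)

if __name__ == "__main__":
    print(f"accuracy          = {always_survive.accuracy:.4f}")
    print(f"sensitivity       = {always_survive.sensitivity:.4f}")
    print(f"specificity       = {always_survive.specificity:.4f}")
    print(f"balanced accuracy = {always_survive.balanced_accuracy:.4f}")
```

The baseline is right for 831 of 866 patients while identifying none of the 35 deaths, which is why plain accuracy is uninformative at a 4% event rate and some metric that accounts for both classes is needed.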