Data Modeling Using Vital Sign Dynamics for In-hospital Mortality Classification in Patients with Acute Coronary Syndrome
Authors: Sarawuth Limprasert, Ajchara Phu-Ang
Journal: Healthcare Informatics Research (published 2023-04-01)
DOI: 10.4258/hir.2023.29.2.120 (https://doi.org/10.4258/hir.2023.29.2.120)
Citations: 0
Abstract
Objectives: This study compared feature selection by machine learning with feature selection by expert recommendation, in terms of the performance of classification models for in-hospital mortality among patients with acute coronary syndrome (ACS) who underwent percutaneous coronary intervention (PCI).
Methods: A dataset of 1,123 patients with ACS who underwent PCI was analyzed. After assigning 80% of instances to the training set through random splitting, we performed feature scaling and resampling with the synthetic minority over-sampling technique and Tomek link method. We compared two feature selection methods: recursive feature elimination with cross-validation (RFECV) and selection by interventional cardiologists. We used five simple models: support vector machine (SVM), random forest, decision tree, logistic regression, and artificial neural network. The performance metrics were accuracy, recall, and the false-negative rate, measured with 10-fold cross-validation in the training set and validated in the test set.
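The workflow described in the Methods can be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' code: the data are synthetic (`make_classification`), and the class balance, the stratified split, the logistic-regression estimator inside RFECV, and all hyperparameters are assumptions not stated in the abstract. The SMOTE and Tomek link resampling step is omitted here to keep the example dependent on scikit-learn alone; in practice it would be applied to the scaled training set only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_validate
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (NOT the study's dataset): 1,123 rows, 34 features,
# with an assumed ~10% positive (mortality) class.
X, y = make_classification(n_samples=1123, n_features=34, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)

# 80/20 split; stratification is an assumption to keep the minority class in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# (SMOTE + Tomek link resampling of the training set would go here; omitted.)

# Recursive feature elimination with 10-fold cross-validation.
# The wrapped estimator is an assumption; the abstract does not name it.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=cv, scoring="accuracy")
selector.fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# The five model families named in the abstract, with default-ish settings.
models = {
    "svm": SVC(),
    "random_forest": RandomForestClassifier(random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "neural_network": MLPClassifier(max_iter=500, random_state=0),
}

# 10-fold cross-validated accuracy and recall on the training set (mean, SD).
results = {}
for name, model in models.items():
    scores = cross_validate(model, X_train_sel, y_train, cv=cv,
                            scoring=("accuracy", "recall"))
    results[name] = {
        "accuracy": (scores["test_accuracy"].mean(), scores["test_accuracy"].std()),
        "recall": (scores["test_recall"].mean(), scores["test_recall"].std()),
    }

print("features kept by RFECV:", selector.n_features_)
for name, r in results.items():
    print(name, r)
```

The expert-selected arm of the comparison would replace the `selector` step with a fixed list of column indices chosen by clinicians, leaving the rest of the loop unchanged.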
Results: Patients' mean age was 66.22 ± 12.88 years, and 33.63% had ST-elevation ACS. Fifteen of 34 features were selected as important with the RFECV method, while the experts chose 11 features. All models with feature selection by RFECV had higher accuracy than the models with expert-chosen features. In the training set, the random forest model had the highest accuracy (0.96 ± 0.01) and recall (0.97 ± 0.02). After validation in the test set, the SVM model displayed the highest accuracy (0.81) and a recall of 0.61.
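As a reminder of how the three reported metrics relate, the short sketch below computes accuracy, recall, and the false-negative rate from binary labels. The labels are toy values made up for illustration, not study data, and the function names are hypothetical. Because the false-negative rate equals 1 minus recall, the reported test-set recall of 0.61 for the SVM corresponds to a false-negative rate of about 0.39.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels where 1 is the positive class."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    # Share of all predictions that are correct.
    return (tp + tn) / (tp + fn + fp + tn)


def recall(tp, fn):
    # Share of true positives (e.g., deaths) that the model identifies.
    return tp / (tp + fn)


def false_negative_rate(tp, fn):
    # Share of true positives that the model misses; equals 1 - recall.
    return fn / (tp + fn)


# Toy labels for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, fn, fp, tn))
print("recall:", recall(tp, fn))
print("false-negative rate:", false_negative_rate(tp, fn))

# FNR implied by the reported test-set recall of 0.61.
implied_fnr = round(1 - 0.61, 2)
print("implied FNR for recall 0.61:", implied_fnr)
```

In a mortality-screening setting, a false negative is a high-risk patient classified as low risk, which is why recall and the false-negative rate are reported alongside accuracy.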
Conclusions: Models with feature selection by RFECV had higher accuracy than those with feature selection by experts in identifying patients with ACS at high risk for in-hospital mortality.