Development of decision tree classification algorithms in predicting mortality of COVID-19 patients.

International Journal of Emergency Medicine (IF 2, Q2, Emergency Medicine) | Pub Date: 2024-09-27 | DOI: 10.1186/s12245-024-00681-7
Zahra Mohammadi-Pirouz, Karimollah Hajian-Tilaki, Mahmoud Sadeghi Haddat-Zavareh, Abazar Amoozadeh, Shabnam Bahrami

Abstract

Introduction: Accurate prediction of COVID-19 mortality risk, taking influencing factors into account, is crucial for guiding effective public policies that alleviate strain on the healthcare system. This study therefore aimed to assess the efficacy of decision tree algorithms (CART, C5.0, and CHAID) in predicting COVID-19 mortality risk and to compare their performance with that of a logistic regression model.

Methods: This retrospective cohort study examined 5080 COVID-19 cases in Babol, a city in northern Iran, that tested positive for the virus by PCR between March 2020 and March 2022. To check the validity of the findings, the data were randomly divided into an 80% training set and a 20% testing set. The prediction models (logistic regression and the decision tree algorithms) were trained on the training set and evaluated on the testing set. Their performance on the test samples was assessed using the ROC curve, AUC, sensitivity, and specificity.
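The split-train-evaluate workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the cohort is simulated, the risk figures are invented, and the model is a one-split stump chosen by Gini impurity (the splitting criterion commonly associated with CART) standing in for the full CART, C5.0, and CHAID trees. All function names (`split_80_20`, `fit_stump`, `run_demo`, and so on) are hypothetical.

```python
import random

def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) with a 20% test share."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(0.2 * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def fit_stump(rows):
    """Fit a one-split tree on (x, y) rows by minimising weighted Gini impurity.

    Returns (threshold, left_rate, right_rate), where each rate is the
    observed share of y == 1 in that leaf of the training data.
    """
    best = None
    for thr in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= thr]
        right = [y for x, y in rows if x > thr]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, thr, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative), ties counting 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def run_demo(n=1000, seed=42):
    """Simulate a hypothetical cohort, fit on 80%, evaluate on the held-out 20%."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(20, 90)
        # Invented risks for illustration only, not taken from the study.
        died = int(rng.random() < (0.20 if age >= 65 else 0.02))
        rows.append((age, died))
    train, test = split_80_20(rows, seed=seed)
    thr, left_rate, right_rate = fit_stump(train)
    base_rate = sum(y for _, y in train) / len(train)
    scores = [left_rate if x <= thr else right_rate for x, _ in test]
    y_true = [y for _, y in test]
    # Assumption: flag a patient as high risk when the leaf rate exceeds the
    # overall training mortality rate.
    y_pred = [int(s > base_rate) for s in scores]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return {"n_train": len(train), "n_test": len(test), "threshold": thr,
            "sensitivity": sens, "specificity": spec, "auc": auc(y_true, scores)}

if __name__ == "__main__":
    print(run_demo())
```

The key design point is that the threshold and leaf rates are estimated on the training rows only, and every reported metric is computed on rows the model never saw. With a rare outcome such as the 7.7% mortality reported here, the decision cut-off strongly affects the balance between sensitivity and specificity, which is why the sketch compares leaf rates against the training base rate instead of 0.5.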

Results: The mortality rate among COVID-19 patients admitted to hospital was 7.7%. In cross-validation, the CHAID algorithm outperformed the other decision tree algorithms and logistic regression in specificity and precision, but not in sensitivity, when predicting the risk of COVID-19 mortality. CHAID achieved a specificity of 0.98, a precision of 0.70, an accuracy of 0.95, and an F-score of 0.52. In all models, ICU hospitalization, intubation, age, kidney disease, blood urea nitrogen (BUN), C-reactive protein (CRP), white blood cell count (WBC), neutrophil-to-lymphocyte ratio (NLR), oxygen saturation (O2 sat), and hemoglobin were among the factors that influenced mortality in COVID-19 patients.
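The Results give CHAID's precision and F-score but not its sensitivity. If the F-score is the usual F1 (the harmonic mean of precision and recall), which the abstract does not state, the sensitivity can be backed out by solving F1 = 2PR / (P + R) for R. The sketch below does that arithmetic; the function names are hypothetical, and because the inputs are rounded to two decimals the result is only approximate.

```python
def recall_from_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for recall R, giving R = F1 * P / (2P - F1)."""
    denom = 2.0 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Reported CHAID values from the abstract: precision 0.70, F-score 0.52.
implied = recall_from_f1(precision=0.70, f1=0.52)
print(round(implied, 2))  # 0.41
```

Under that assumption, the implied sensitivity of roughly 0.41 fits the statement that CHAID led on specificity and precision but not on sensitivity.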

Conclusions: CART and C5.0 performed better in sensitivity, but CHAID outperformed the other decision tree algorithms in specificity, precision, and accuracy, and showed a slight improvement over logistic regression in predicting the risk of COVID-19 mortality in the study population.
