Thomas Wetere Tulu, Tsz Kin Wan, Ching Long Chan, Chun Hei Wu, Peter Yat Ming Woo, Cee Zhung Steven Tseng, Asmir Vodencarevic, Cristina Menni, Kei Hang Katie Chan
Title: Machine learning-based prediction of COVID-19 mortality using immunological and metabolic biomarkers
Journal: BMC Digital Health, 2023, p. 6
Publication type: Journal Article
Published: 2023-01-01 (Epub 2023-02-03)
DOI: https://doi.org/10.1186/s44247-022-00001-0
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9896457/pdf/
Citations: 1
Abstract
Background: COVID-19 has become a major global public health problem despite prevention efforts. The daily number of COVID-19 cases is increasing rapidly, and the time and financial costs associated with testing procedures are burdensome.

Method: To overcome this, we aimed to identify immunological and metabolic biomarkers that predict COVID-19 mortality using machine learning. We included inpatients from Hong Kong's public hospitals who were diagnosed with COVID-19 by RT-PCR between January 1 and September 30, 2020. We developed three machine learning models, a Deep Neural Network (DNN), a Random Forest classifier (RF) and a Support Vector Machine (SVM), to predict mortality from data in the patients' electronic medical records, and compared the trained models statistically. The data came from a cohort of 5,059 patients (median age = 46 years; 49.3% male) who had tested positive for COVID-19 according to electronic health records, with data from 532,427 patients as controls.

Result: We identified the top 20 immunological and metabolic biomarkers that can accurately predict the risk of mortality from COVID-19, with a ROC-AUC of 0.98 (95% CI 0.96-0.98). Of the three models, the random forest (RF) model predicted mortality most accurately, with age, glomerular filtration, albumin, urea, procalcitonin, C-reactive protein, oxygen, bicarbonate, carbon dioxide, ferritin, glucose, erythrocytes, creatinine, lymphocytes, blood pH and leukocytes among the most important biomarkers identified. A cohort of 131 patients from Kwong Wah Hospital was used for model validation, giving a ROC-AUC of 0.90 (95% CI 0.84-0.92).

Conclusion: We recommend that physicians closely monitor hematological, coagulation, cardiac, hepatic, renal and inflammatory factors for potential progression to severe conditions among COVID-19 patients. To the best of our knowledge, no previous research has identified important immunological and metabolic biomarkers to the extent demonstrated in our study.
Supplementary information: The online version contains supplementary material available at 10.1186/s44247-022-00001-0.
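
The abstract reports model performance as a ROC-AUC with a 95% confidence interval but does not say how the interval was obtained. The sketch below shows one common way such a figure could be computed: ROC-AUC through the Mann-Whitney formulation, with a percentile bootstrap for the interval. The bootstrap is an assumption, not the authors' stated method, and the function names (`roc_auc`, `bootstrap_auc_ci`) and the toy labels and scores are hypothetical illustrations rather than study data.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    is scored above a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for ROC-AUC.
    Assumption: the source does not state which CI method was used."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample contains a single class; skip it
            continue
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data only (not from the study): 1 = died, 0 = survived.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.80, 0.15, 0.90]

auc = roc_auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"ROC-AUC = {auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; for a cohort of the size described above, a rank-based implementation or a library routine would be the more practical choice.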