Comparing different machine learning techniques for predicting COVID-19 severity.

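IF 4.8 | Zone 1 (Medicine) | JCR Q1 (Infectious Diseases) | Infectious Diseases of Poverty, 11(1):19 | Pub Date: 2022-02-17 | DOI: 10.1186/s40249-022-00946-4
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8851750/pdf/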
Yibai Xiong, Yan Ma, Lianguo Ruan, Dan Li, Cheng Lu, Luqi Huang
Citations: 24

Abstract

Background: Coronavirus disease 2019 (COVID-19) continues to spread globally. Machine learning techniques have been used in disease diagnosis and in predicting treatment outcomes, with favorable performance. The present study aims to predict COVID-19 severity at admission using different machine learning techniques, including random forest (RF), support vector machine (SVM), and logistic regression (LR). The importance of individual features to COVID-19 severity was also identified.

Methods: A retrospective study was conducted at JinYinTan Hospital from January 26 to March 28, 2020. Eighty-six demographic, clinical, and laboratory features were screened using the LassoCV method, Spearman's rank correlation, experts' opinions, and literature evaluation. RF, SVM, and LR models were built to predict severe COVID-19, and their performance was compared by the area under the curve (AUC). Additionally, feature importance for COVID-19 severity was analyzed with the best-performing model.
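
As an illustration of how models can be compared by AUC, the sketch below computes the AUC as the fraction of (severe, non-severe) patient pairs in which the severe patient receives the higher score, with ties counted as one half. This is a minimal pure-Python sketch, not the authors' code: the function name `auc`, the labels, and the two score lists are hypothetical and are not data from the study.

```python
from itertools import product


def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative cases.

    labels: 1 for the positive class (here: severe), 0 otherwise.
    scores: model outputs, where higher means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A pair counts 1 if the positive case outranks the negative one,
    # 0.5 if the scores tie, and 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = severe) and scores from two illustrative models.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_a = [0.9, 0.8, 0.7, 0.2, 0.3, 0.1, 0.4, 0.6]
model_b = [0.9, 0.4, 0.7, 0.5, 0.3, 0.1, 0.6, 0.2]

print("model A AUC:", auc(labels, model_a))
print("model B AUC:", auc(labels, model_b))
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formula would be the usual choice for larger data.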

Results: A total of 287 patients were enrolled, of whom 36.6% were severe cases and 63.4% were non-severe cases. The median age was 60.0 years (interquartile range: 49.0-68.0 years). Three models were established using 23 features: 1 clinical, 1 chest computed tomography (CT), and 21 laboratory features. Among the three models, RF yielded the best overall performance, with the highest AUC (0.970, versus 0.948 for SVM and 0.928 for LR). RF achieved a sensitivity of 96.7%, specificity of 69.5%, and accuracy of 84.5%. SVM had a sensitivity of 93.9%, specificity of 79.0%, and accuracy of 88.5%. LR achieved a sensitivity of 92.3%, specificity of 72.3%, and accuracy of 85.2%. Chest CT had the highest importance for illness severity, followed by neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and D-dimer, in that order.
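
For readers less familiar with the three threshold-based metrics reported here, the following sketch shows how sensitivity, specificity, and accuracy follow from the four cells of a confusion matrix. It assumes the severe class is coded as 1; the helper names and the ten example predictions are hypothetical and do not reproduce the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) with 1 as the positive (severe) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def summarize(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of severe cases detected
        "specificity": tn / (tn + fp),  # share of non-severe cases cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical ground truth and predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = summarize(*confusion_counts(y_true, y_pred))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike AUC, all three values depend on the decision threshold used to turn a model score into a severe or non-severe prediction, so they describe one operating point rather than the model as a whole.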

Conclusions: Our results indicate that RF could be a useful predictive tool for identifying patients with severe COVID-19, which may facilitate effective care and help optimize resource use.

Source journal: Infectious Diseases of Poverty
Subject area: Medicine - Public Health, Environmental and Occupational Health
CiteScore: 16.70
Self-citation rate: 1.20%
Articles published: 368
Review time: 13 weeks
About the journal: Infectious Diseases of Poverty is a peer-reviewed, open access journal that focuses on essential public health questions related to infectious diseases of poverty. It covers a wide range of topics and methods, including the biology of pathogens and vectors, diagnosis and detection, treatment and case management, epidemiology and modeling, zoonotic hosts and animal reservoirs, control strategies and implementation, new technologies, and their application. The journal also explores the impact of transdisciplinary or multisectoral approaches on health systems, ecohealth, environmental management, and innovative technologies. It aims to provide a platform for the exchange of research and ideas that can contribute to the improvement of public health in resource-limited settings. In summary, Infectious Diseases of Poverty aims to address the urgent challenges posed by infectious diseases in impoverished populations. By publishing high-quality research in various areas, the journal seeks to advance our understanding of these diseases and contribute to the development of effective strategies for prevention, diagnosis, and treatment.