Revealing Behavior Patterns of SARS-CoV-2 using Clustering Analysis and XGBoost Error Forecasting Models
Nasrin Talkhi, Narges Akhavan Fatemi, M. Jabbari Nooghabi
Iranian Journal of Medical Microbiology. Published 2022-05-01. DOI: 10.30699/ijmm.16.3.221 (https://doi.org/10.30699/ijmm.16.3.221)
Abstract
Background and Aim: COVID-19 is a highly contagious infectious disease that has disrupted people's daily lives and raised serious concern among governments and public health officials. Forecasting its future behavior can help in allocating medical resources and defining effective strategies for disease control.
Materials and Methods: The data comprised the cumulative and absolute numbers of confirmed cases, deaths, and recoveries from COVID-19 from February 20 to July 03, 2021. We applied hierarchical cluster analysis. To forecast the future behavior of COVID-19, we used the Auto-Regressive Integrated Moving Average (ARIMA), Exponential Smoothing (ETS), Automatic Forecasting Procedure (Prophet), Naive, Seasonal Naive (s-Naive), boosted ARIMA, and boosted Prophet models.
Results: Clustering showed that the coronavirus behaved similarly in Iran and in other countries such as France, Russia, Turkey, the United Kingdom (UK), Argentina, Colombia, Italy, Spain, Germany, Poland, Mexico, and Indonesia. It also revealed similar SARS-CoV-2 patterns for these countries across six groups. The XGBoost family of models was more accurate than the other models.
Conclusion: In Iran, COVID-19 showed behavior patterns similar to those in the developed countries studied. The XGBoost family of models gave practical results and high precision in forecasting the behavior patterns of the virus. Given the rapid worldwide spread of the virus, these models can be used to forecast the behavior patterns of SARS-CoV-2. Preventing the spread of the coronavirus, controlling the disease, and breaking its chain of transmission require community assistance, and in this mission the role of statisticians cannot be neglected.
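The hierarchical cluster analysis mentioned under Materials and Methods can be illustrated with a minimal sketch. The abstract does not state the distance measure, the linkage rule, or any preprocessing, so the z-score standardization, Euclidean distance, and average linkage used here are assumptions, and the region names and counts are made up for illustration rather than taken from the study.

```python
from itertools import combinations
from math import sqrt


def zscore(series):
    """Standardize a series so clustering compares curve shape, not scale."""
    n = len(series)
    mean = sum(series) / n
    std = sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(curves, n_clusters):
    """Average-linkage agglomerative clustering (assumed settings).

    curves maps a region name to its case series; returns a list of sets
    of region names.
    """
    points = {name: zscore(values) for name, values in curves.items()}
    clusters = [{name} for name in points]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        dists = [euclidean(points[a], points[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up weekly case counts for hypothetical regions (not data from the study).
toy_curves = {
    "A": [10, 20, 40, 80, 160, 320],
    "B": [1, 2, 4, 8, 16, 32],
    "C": [300, 250, 200, 150, 100, 50],
    "D": [60, 50, 40, 30, 20, 10],
}

groups = agglomerative(toy_curves, n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

With these toy series, the two rising curves fall into one group and the two declining curves into the other, regardless of their absolute scale. For real data, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would be a more practical choice than this hand-rolled loop.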