Estimation of the incubation period of COVID-19 using boosted random forest algorithm
P. Rathnayake, J. M. D. Senanayake, D. Wickramaarachchi
2021 International Research Conference on Smart Computing and Systems Engineering (SCSE), published 2021-09-16
DOI: https://doi.org/10.1109/scse53661.2021.9568282
Abstract
Coronavirus disease 2019 (COVID-19) was first identified in December 2019. As of July 2021, within nineteen months of the start of this infectious disease, more than one hundred and eighty million cases had been reported. The incubation period of the virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), can be defined as the period between exposure to the virus and symptom onset. Most affected cases are asymptomatic during this period, but they can still transmit the virus to others. The incubation period is an important factor in deciding quarantine or isolation periods. According to current studies, the incubation period of SARS-CoV-2 ranges from 2 to 14 days. Because of this range, it is difficult to identify a specific incubation period for suspected cases. Therefore, all suspected cases should undergo an isolation period of 14 days, which may lead to unnecessary allocation of resources. The main objective of this research is to identify the factors affecting the incubation period and then develop a machine learning model that classifies the incubation period. Patient records within the age group 5–80 years were used in this study. The dataset consists of 500 patient records from various countries such as China, Japan, South Korea and the USA. This study identified that the patient's age, immunocompetent state, gender, direct or indirect contact with affected patients, and residing location affect the incubation period. Several supervised learning classification algorithms were compared to find the best-performing algorithm for classifying the incubation classes. The weighted average of the per-class metrics was used to evaluate overall model performance. The random forest algorithm outperformed the other algorithms, achieving 0.78 precision, 0.84 recall, and 0.80 F1-score in classifying the incubation classes. The AdaBoost algorithm was then used to fine-tune the model.
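As an illustration of the weighted-average evaluation mentioned in the abstract, the following minimal sketch computes support-weighted precision, recall, and F1 from per-class scores. It is not the authors' code: the class names ("short", "medium", "long"), the toy labels, and the function name `weighted_scores` are all assumptions made for this example, and the abstract does not say how the incubation classes were defined.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (precision, recall, F1), each averaged over classes
    with weights proportional to the class support in y_true."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        # True positives: samples of this class that were predicted correctly.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical incubation classes and toy labels, for illustration only.
y_true = ["short", "short", "medium", "medium", "medium", "long"]
y_pred = ["short", "medium", "medium", "medium", "long", "long"]

p, r, f = weighted_scores(y_true, y_pred)
print(f"weighted precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Weighting by support means that classes with more patient records contribute more to the overall score, which matters when incubation classes are imbalanced. Note that the support-weighted recall is equal to plain accuracy.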