COVID-19: Symptoms Clustering and Severity Classification Using Machine Learning Approach
Nurul Fathia Mohamand Noor, Herold Sylvestro Sipail, N. Ahmad, Bayram Annanurov, N. Mohd Noor
International Journal of Integrated Engineering, published 2023-07-31. DOI: 10.30880/ijie.2023.15.03.001 (https://doi.org/10.30880/ijie.2023.15.03.001)
Impact factor: 0.4. JCR: Q4 (Engineering, Multidisciplinary).
Citations: 0
Abstract
COVID-19 is an extremely contagious disease that causes illness ranging from the common cold to more chronic conditions or even death. Because the virus constantly mutates into new variants, identifying the symptoms of COVID-19 is important for containing the infection. Clustering and classification techniques from machine learning are in mainstream use across many areas of research and, in recent years especially, have been used to generate useful knowledge about the COVID-19 outbreak. Many researchers have shared their COVID-19 data in public databases, and many studies have been carried out on them. However, the merit of such a dataset is unknown, and researchers need to analyse it to check its reliability. The dataset used in this work was sourced from the Kaggle website. The data were obtained through a survey of participants of various genders and ages who had been to at least ten countries. There are four levels of severity based on the COVID-19 symptoms, which were developed in accordance with World Health Organization (WHO) and Indian Ministry of Health and Family Welfare recommendations. This paper presents an inquiry into the dataset using supervised and unsupervised machine learning approaches in order to better understand it. For supervised learning, the analysis of the severity group based on the COVID-19 symptoms employed a total of seven classifiers, namely K-NN, Linear SVM, Naive Bayes, Decision Tree (J48), AdaBoost, Bagging, and Stacking. For unsupervised learning, the clustering algorithms used in this work are Simple K-Means and Expectation-Maximization.
Both the supervised and the unsupervised learning techniques yielded relatively poor classification and clustering results. For the dataset analysed in this study, the symptoms do not appear to map correctly onto the assigned severity levels, which raises concerns about the validity and reliability of the dataset.
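The reasoning in the abstract, that poor classifier accuracy can point to a problem with the labels rather than the classifier, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (not the Kaggle dataset), the five binary symptom columns and the severity rule are hypothetical, and a hand-written k-NN with Hamming distance and leave-one-out evaluation stands in for the seven classifiers named above. It is not the authors' setup.

```python
import random
from collections import Counter
from itertools import product


def hamming(a, b):
    """Count the positions where two binary symptom vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    order = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy: predict each row from all the other rows."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def majority_baseline(y):
    """Accuracy obtained by always guessing the most frequent label."""
    return Counter(y).most_common(1)[0][1] / len(y)


# Synthetic survey: every combination of 5 hypothetical binary symptoms,
# each repeated 4 times (128 rows in total).
X = list(product((0, 1), repeat=5)) * 4

# Case 1 (assumed rule): severity level 0..3 follows from the symptom count.
y_informative = [min(3, sum(row)) for row in X]

# Case 2: severity levels assigned at random, unrelated to the symptoms.
rng = random.Random(0)
y_random = [rng.randrange(4) for _ in X]

acc_informative = loo_accuracy(X, y_informative)
acc_random = loo_accuracy(X, y_random)
baseline = majority_baseline(y_informative)

print(f"labels follow symptoms: accuracy={acc_informative:.2f}, baseline={baseline:.2f}")
print(f"labels are random:      accuracy={acc_random:.2f}")
```

When the labels follow the symptoms, each row's three nearest neighbours are its exact duplicates, so the classifier recovers the rule perfectly and clears the majority-class baseline. When the labels are unrelated to the symptoms, no classifier can do much better than guessing, which is the pattern that would cast doubt on a dataset's labelling.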
About the journal:
The International Journal of Integrated Engineering (IJIE) is a single-blind peer-reviewed journal that has been published three times a year since 2009. The journal focuses on three fields. Civil and Environmental Engineering: original contributions on civil and environmental engineering practice are published under this category and form the nucleus of the journal's contents; the journal publishes a wide range of research and application papers that describe laboratory and numerical investigations or report on full-scale projects. Electrical and Electronic Engineering: the journal serves as an international medium for publishing original papers on electrical and electronic engineering, and aims to present important results of work in this field to the international community, whether in the form of research, development, application, or design. Mechanical, Materials and Manufacturing Engineering: the journal is a platform for publishing and disseminating original work that contributes to the understanding of the main disciplines underpinning mechanical, materials, and manufacturing engineering; original contributions giving insight into related engineering practice form the core of the journal's contents.