{"title":"Predicting COVID-19 Cases on a Large Chest X-Ray Dataset Using Modified Pre-trained CNN Architectures","authors":"Abdulkadir Karac","doi":"10.2478/acss-2023-0005","DOIUrl":null,"url":null,"abstract":"Abstract The Coronavirus is a virus that spreads very quickly. Therefore, it has had very destructive effects in many areas worldwide. Because X-ray images are an easily accessible, fast, and inexpensive method, they are widely used worldwide to diagnose COVID-19. This study tried detecting COVID-19 from X-ray images using pre-trained VGG16, VGG19, InceptionV3, and Resnet50 CNN architectures and modified versions of these architectures. The fully connected layers of the pre-trained architectures have been reorganized in the modified CNN architectures. These architectures were trained on binary and three-class datasets, revealing their classification performance. The data set was collected from four different sources and consisted of 594 COVID-19, 1345 viral pneumonia, and 1341 normal X-ray images. Models are built using Tensorflow and Keras Libraries with Python programming language. Preprocessing was performed on the dataset by applying resizing, normalization, and one hot encoding operation. Model performances were evaluated according to many performance metrics such as recall, specificity, accuracy, precision, F1-score, confusion matrix, ROC analysis, etc., using 5-fold cross-validation. The highest classification performance was obtained in the modified VGG19 model with 99.84 % accuracy for binary classification (COVID-19 vs. Normal) and in the modified VGG16 model with 98.26 % accuracy for triple classification (COVID-19 vs. Pneumonia vs. Normal). These models have a higher accuracy rate than other studies in the literature. In addition, the number of COVID-19 X-ray images in the dataset used in this study is approximately two times higher than in other studies. Since it is obtained from different sources, it is irregular and does not have a standard. Despite this, it is noteworthy that higher classification performance was achieved than in previous studies. Modified VGG16 and VGG19 models (available at github.com/akaraci/LargeDatasetCovid19) can be used as an auxiliary tool in slight healthcare organizations’ shortage of specialists to detect COVID-19.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"10 1","pages":"44 - 57"},"PeriodicalIF":0.5000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/acss-2023-0005","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Abstract The Coronavirus is a virus that spreads very quickly. Therefore, it has had very destructive effects in many areas worldwide. Because X-ray images are an easily accessible, fast, and inexpensive method, they are widely used worldwide to diagnose COVID-19. This study tried detecting COVID-19 from X-ray images using pre-trained VGG16, VGG19, InceptionV3, and Resnet50 CNN architectures and modified versions of these architectures. The fully connected layers of the pre-trained architectures have been reorganized in the modified CNN architectures. These architectures were trained on binary and three-class datasets, revealing their classification performance. The data set was collected from four different sources and consisted of 594 COVID-19, 1345 viral pneumonia, and 1341 normal X-ray images. Models are built using Tensorflow and Keras Libraries with Python programming language. Preprocessing was performed on the dataset by applying resizing, normalization, and one hot encoding operation. Model performances were evaluated according to many performance metrics such as recall, specificity, accuracy, precision, F1-score, confusion matrix, ROC analysis, etc., using 5-fold cross-validation. The highest classification performance was obtained in the modified VGG19 model with 99.84 % accuracy for binary classification (COVID-19 vs. Normal) and in the modified VGG16 model with 98.26 % accuracy for triple classification (COVID-19 vs. Pneumonia vs. Normal). These models have a higher accuracy rate than other studies in the literature. In addition, the number of COVID-19 X-ray images in the dataset used in this study is approximately two times higher than in other studies. Since it is obtained from different sources, it is irregular and does not have a standard. Despite this, it is noteworthy that higher classification performance was achieved than in previous studies. Modified VGG16 and VGG19 models (available at github.com/akaraci/LargeDatasetCovid19) can be used as an auxiliary tool in slight healthcare organizations’ shortage of specialists to detect COVID-19.