Investigation of ConViT on COVID-19 Lung Image Classification and the Effects of Image Resolution and Number of Attention Heads

International Journal of Integrated Engineering (IF 0.4, Q4, Engineering, Multidisciplinary) | Pub Date: 2023-07-31 | DOI: 10.30880/ijie.2023.15.03.005

P. L. Thon, J. Than, Norliza M. Noor, Jun Han, Patrick Then





Investigation of ConViT on COVID-19 Lung Image Classification and the Effects of Image Resolution and Number of Attention Heads

COVID-19 has been one of the popular foci in the research community since its first outbreak in China in 2019. Radiological patterns such as ground glass opacity (GGO) and consolidations are often found in CT scan images of moderate to severe COVID-19 patients. Therefore, a deep learning model can be trained to distinguish COVID-19 patients using their CT scan images. Convolutional Neural Networks (CNNs) have been a popular choice for this type of classification task. Another potential method is the use of a vision transformer with convolution, resulting in the Convolutional Vision Transformer (ConViT), to possibly produce on-par performance using fewer computational resources. In this study, ConViT is applied to diagnose COVID-19 cases from lung CT scan images. In particular, we investigated the relationship between the input image resolution and the number of attention heads used in ConViT, and their effects on the model's performance. Specifically, we trained the model at resolutions of 512x512, 224x224 and 128x128 pixels with 4 (tiny), 9 (small) and 16 (base) attention heads. An open-access dataset from Iran consisting of 2282 COVID-19 CT images and 9776 Normal CT images is used in this study. Using an image resolution of 128x128 pixels and 16 attention heads, the ConViT model achieved an accuracy of 98.01%, sensitivity of 90.83%, specificity of 99.69%, positive predictive value (PPV) of 95.58%, negative predictive value (NPV) of 97.89% and F1-score of 94.55%. The model also achieved improved performance over other recent studies that used the same dataset. In conclusion, this study has shown that the ConViT model can play a meaningful role in complementing RT-PCR tests for COVID-19 close contacts and patients.
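The six metrics reported in the abstract are all standard functions of a binary confusion matrix. The sketch below shows how they could be computed, assuming COVID-19 is treated as the positive class. The `ConfusionCounts` type, the `classification_metrics` function and the example counts are hypothetical and for illustration only; they are not taken from the paper, and the authors' own evaluation procedure is not described in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (assumption: COVID-19 is the positive class)."""
    tp: int  # COVID-19 images predicted as COVID-19
    fp: int  # Normal images predicted as COVID-19
    tn: int  # Normal images predicted as Normal
    fn: int  # COVID-19 images predicted as Normal


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the six metrics named in the abstract as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn)  # recall on the positive class
    specificity = c.tn / (c.tn + c.fp)  # recall on the negative class
    ppv = c.tp / (c.tp + c.fp)          # precision
    npv = c.tn / (c.tn + c.fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts chosen for illustration; NOT results from the paper.
example = ConfusionCounts(tp=90, fp=5, tn=195, fn=10)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With an imbalanced dataset such as the one described (far more Normal than COVID-19 images), accuracy and specificity are dominated by the majority class, which is one reason to read sensitivity, PPV and F1 alongside them. The sketch does not guard against empty classes; a zero denominator would raise `ZeroDivisionError`.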


Source journal

International Journal of Integrated Engineering (Engineering, Multidisciplinary)

CiteScore: 1.40

Self-citation rate: 0.00%

About the journal: The International Journal of Integrated Engineering (IJIE) is a single-blind peer-reviewed journal that has been published three times a year since 2009. The journal covers three fields. Civil and Environmental Engineering: original contributions on civil and environmental engineering practice are published under this category and form the nucleus of the journal's contents. The journal publishes a wide range of research and application papers that describe laboratory and numerical investigations or report on full-scale projects. Electrical and Electronic Engineering: the journal serves as an international medium for the publication of original papers concerned with electrical and electronic engineering. It aims to present to the international community important results of work in this field, whether in the form of research, development, application or design. Mechanical, Materials and Manufacturing Engineering: the journal is a platform for the publication and dissemination of original work that contributes to the understanding of the main disciplines underpinning mechanical, materials and manufacturing engineering. Original contributions giving insight into engineering practices related to these disciplines form the core of the journal's contents.