A CNN-Transformer-based Approach for Medical Image Segmentation
Thi-Thao Tran, Dinh-Thien Vu, Thi-Hue Nguyen, Van-Truong Pham
2023 International Conference on System Science and Engineering (ICSSE), published 2023-07-27
DOI: 10.1109/ICSSE58758.2023.10227162 (https://doi.org/10.1109/ICSSE58758.2023.10227162)
Abstract
Deep convolutional neural networks (CNNs) have shown excellent performance in image processing applications, including medical image segmentation. Nevertheless, CNN-based segmentation approaches such as Fully Convolutional Networks (FCNs), U-Net, and their variants often struggle to capture long-range dependencies because convolutional operations are inherently local. Alternatively, transformer-based models have access to the global context of the image and its features and therefore capture long-range dependencies better. Despite this advantage, transformer-based approaches often lack local context, which limits their use in applications such as medical imaging. In this study, we propose a new model that inherits the advantages of both the global and local contexts of the two approaches by using CNN and Transformer branches, and we introduce ConvMixer and Progressive Atrous Spatial Pyramidal Pooling modules in the bottleneck of each branch. The proposed model has been validated on several medical image datasets, including the Data Science Bowl 2018 and GlaS datasets. High evaluation scores, including the Dice score and the Intersection over Union (IoU) metric, demonstrate the performance of the proposed segmentation model compared with recent neural network models.
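The abstract reports Dice and Intersection over Union scores without stating how they are computed. As a minimal sketch, assuming the standard definitions for binary masks (Dice = 2|A∩B| / (|A| + |B|), IoU = |A∩B| / |A∪B|), the two metrics could be evaluated as follows. The function name `dice_and_iou`, the smoothing term `eps`, and the toy masks are illustrative assumptions, not taken from the paper.

```python
def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    eps is a hypothetical smoothing term that avoids division by zero
    when both masks are empty; in that case both scores evaluate to 1.0.
    """
    # Flatten the 2D masks into 1D lists of ints.
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    inter = sum(a & b for a, b in zip(p, t))   # |A ∩ B|
    p_sum, t_sum = sum(p), sum(t)              # |A|, |B|
    union = p_sum + t_sum - inter              # |A ∪ B|

    dice = (2 * inter + eps) / (p_sum + t_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 of them shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two scores are related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically. Averages over a dataset can still differ, which may be one reason both are commonly reported.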