ACU-TransNet: Attention and convolution-augmented UNet-transformer network for polyp segmentation.

IF 1.7; Zone 3, Medicine; Q3, INSTRUMENTS & INSTRUMENTATION; Journal of X-Ray Science and Technology; Pub Date: 2024-10-12; DOI: 10.3233/XST-240076

Lei Huang, Yun Wu

Citation count: 0

Abstract

Background: UNet has achieved great success in medical image segmentation. However, due to the inherent locality of convolution operations, UNet is deficient in capturing global features and long-range dependencies of polyps, resulting in less accurate polyp recognition for complex morphologies and backgrounds. Transformers, with their sequential operations, are better at perceiving global features but lack low-level details, leading to limited localization ability. If the advantages of both architectures can be effectively combined, the accuracy of polyp segmentation can be further improved.

Methods: In this paper, we propose an attention and convolution-augmented UNet-Transformer Network (ACU-TransNet) for polyp segmentation. This network is composed of the comprehensive attention UNet and the Transformer head, sequentially connected by the bridge layer. On the one hand, the comprehensive attention UNet enhances specific feature extraction through deformable convolution and channel attention in the first layer of the encoder and achieves more accurate shape extraction through spatial attention and channel attention in the decoder. On the other hand, the Transformer head supplements fine-grained information through convolutional attention and acquires hierarchical global characteristics from the feature maps.
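
The decoder-side combination of channel attention and spatial attention mentioned in the Methods can be illustrated with a minimal NumPy sketch. The abstract does not specify the exact form of either module, so everything below is an assumption: the channel gate follows a squeeze-and-excitation style design, the spatial gate follows a CBAM-style design, the channel-then-spatial ordering is arbitrary, and the names `channel_attention`, `spatial_attention`, and `decoder_attention_block` are hypothetical. Weights are random rather than learned, and deformable convolution and the Transformer head are not covered.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel gate (assumed form).

    feat: feature map of shape (C, H, W)
    w1:   reduction weights of shape (C // r, C)
    w2:   expansion weights of shape (C, C // r)
    """
    squeezed = feat.mean(axis=(1, 2))         # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # ReLU bottleneck -> (C // r,)
    gate = sigmoid(w2 @ hidden)               # per-channel weight in (0, 1)
    return feat * gate[:, None, None]


def spatial_attention(feat, kernel):
    """CBAM-style spatial gate (assumed form).

    feat:   feature map of shape (C, H, W)
    kernel: convolution kernel of shape (2, k, k) with odd k
    """
    # Pool across channels to get two (H, W) descriptor maps.
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    height, width = feat.shape[1:]
    logits = np.zeros((height, width))
    # Naive same-padding 2D convolution over the two pooled maps.
    for i in range(height):
        for j in range(width):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    gate = sigmoid(logits)                    # per-pixel weight in (0, 1)
    return feat * gate[None, :, :]


def decoder_attention_block(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention (ordering is assumed)."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)


# Toy usage with random weights standing in for learned parameters.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
out = decoder_attention_block(feat, w1, w2, kernel)
print(out.shape)  # (8, 16, 16)
```

Both gates only rescale the input, so the output keeps the shape of the feature map and each value is attenuated rather than amplified; in a trained network the gates would suppress background regions and uninformative channels.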

Results: ACU-TransNet could comprehensively learn dataset features and enhance colonoscopy interpretability for polyp detection.

Conclusion: Experimental results on the CVC-ClinicDB and Kvasir-SEG datasets demonstrate that ACU-TransNet outperforms existing state-of-the-art methods, showcasing its robustness.

Source journal

Journal of X-Ray Science and Technology (Engineering and Technology: Optics)

CiteScore: 4.90

Self-citation rate: 23.30%

Articles published: 150

Review time: 3 months

Journal description: Research areas within the scope of the journal include:

Interaction of x-rays with matter: x-ray phenomena, biological effects of radiation, radiation safety and optical constants

X-ray sources: x-rays from synchrotrons, x-ray lasers, plasmas, and other sources, conventional or unconventional

Optical elements: grazing incidence optics, multilayer mirrors, zone plates, gratings, other diffraction optics

Optical instruments: interferometers, spectrometers, microscopes, telescopes, microprobes