MCPA: multi-scale cross perceptron attention network for 2D medical image segmentation

IF 5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Complex & Intelligent Systems | Pub Date: 2024-12-19 | DOI: 10.1007/s40747-024-01671-1
Liang Xu, Mingxiao Chen, Yi Cheng, Pengwu Song, Pengfei Shao, Shuwei Shen, Peng Yao, Ronald X. Xu
Citations: 0

Abstract

The UNet architecture, based on convolutional neural networks (CNNs), has demonstrated remarkable performance in medical image analysis. However, it struggles to capture long-range dependencies because of the limited receptive fields and inherent bias of convolutional operations. Recently, numerous Transformer-based techniques have been incorporated into the UNet architecture to overcome this limitation by effectively capturing global feature correlations. However, integrating Transformer modules may cause local contextual information to be lost during global feature fusion. In this work, we propose a 2D medical image segmentation model called the multi-scale cross perceptron attention network (MCPA). MCPA consists of three main components: an encoder, a decoder, and a Cross Perceptron. The Cross Perceptron first captures local correlations using multiple Multi-scale Cross Perceptron modules, facilitating the fusion of features across scales. The resulting multi-scale feature vectors are then spatially unfolded, concatenated, and fed through a Global Perceptron module to model global dependencies. Considering the high computational cost of 3D neural network models, and the fact that much important clinical data can only be obtained in two dimensions, MCPA focuses on 2D medical image segmentation. Furthermore, we introduce a progressive dual-branch structure (PDBS) to address the semantic segmentation of images involving finer tissue structures. This structure gradually shifts the segmentation focus of MCPA network training from large-scale structural features to more sophisticated pixel-level features. We evaluate the proposed MCPA model on several publicly available medical image datasets from different tasks and devices, including the open large-scale CT (Synapse) and MRI (ACDC) datasets, and widely used 2D medical imaging datasets captured by fundus camera (DRIVE, CHASE_DB1, HRF) and OCTA (ROSE). The experimental results show that our MCPA model achieves state-of-the-art performance.
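
The "spatially unfolded, concatenated, and fed through a Global Perceptron" step can be pictured with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the internals of the Global Perceptron, so a single-head self-attention layer stands in for it here, and the helper names (`unfold`, `fold`, `global_mixing`), the three feature-map sizes, and the shared channel width of 8 are all hypothetical.

```python
import numpy as np

def unfold(feature_map):
    """Flatten a (C, H, W) feature map into a (H*W, C) token sequence."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T

def fold(tokens, h, w):
    """Inverse of unfold: (H*W, C) tokens back to a (C, H, W) map."""
    return tokens.T.reshape(tokens.shape[1], h, w)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_mixing(tokens, w_q, w_k, w_v):
    """Stand-in for global dependency modeling (an assumption):
    single-head self-attention over all tokens, with a residual connection."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return tokens + attn @ v

rng = np.random.default_rng(0)
channels = 8              # hypothetical shared channel width across scales
sizes = (16, 8, 4)        # hypothetical spatial sizes of three encoder scales
scales = [rng.standard_normal((channels, s, s)) for s in sizes]

# Unfold every scale into tokens and concatenate along the token axis.
tokens = np.concatenate([unfold(f) for f in scales], axis=0)

# Random projection weights, used only to make the sketch executable.
w_q, w_k, w_v = (rng.standard_normal((channels, channels)) * 0.1 for _ in range(3))
fused = global_mixing(tokens, w_q, w_k, w_v)

# Split the fused sequence back into per-scale maps for a decoder.
bounds = np.cumsum([s * s for s in sizes])[:-1]
fused_maps = [fold(part, s, s) for part, s in zip(np.split(fused, bounds), sizes)]
print(tokens.shape, [m.shape for m in fused_maps])
```

Because every token attends to tokens from all scales, each output position can draw on both coarse and fine features, which is the general idea behind modeling global dependencies after the per-scale local fusion.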

Source journal
Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
Journal description: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.