Multi-Scale Dynamic Sparse Attention UNet for Medical Image Segmentation

IF 6.8 | Zone 2, Medicine | JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS | IEEE Journal of Biomedical and Health Informatics | Pub Date: 2025-03-28 | DOI: 10.1109/JBHI.2025.3555805
Xiang Li;Chong Fu;Qun Wang;Wenchao Zhang;Chen Ye;Junxin Chen;Chiu-Wing Sham
IEEE Journal of Biomedical and Health Informatics, vol. 29, no. 9, pp. 6754-6766. Journal Article. https://ieeexplore.ieee.org/document/10944714/
Citations: 0

Abstract

Transformers have recently gained significant attention in medical image segmentation due to their ability to capture long-range dependencies. However, large regions of excessive background noise in medical images introduce distractions and increase the computational burden on the fine-grained self-attention (SA) mechanism, a key component of the transformer model. Meanwhile, preserving fine-grained details is essential for accurately segmenting complex, blurred medical images whose targets vary in shape and size. Thus, we propose a novel Multi-scale Dynamic Sparse Attention (MDSA) module, which flexibly reduces computational costs while maintaining content-aware, multi-scale fine-grained interactions. Specifically, multi-scale aggregation is first applied to the feature maps to enrich the diversity of interaction information. Then, for each query, irrelevant key-value pairs are filtered out at a coarse-grained level. Finally, fine-grained SA is performed on the remaining key-value pairs. In addition, we design an enhanced downsampling merging (EDM) module and an enhanced upsampling fusion (EUF) module for building pyramid architectures. Using MDSA to construct the basic blocks, combined with EDMs and EUFs, we develop a UNet-like model named MDSA-UNet. Since MDSA-UNet dynamically processes only a small subset of relevant fine-grained features, it achieves strong segmentation performance with high computational efficiency. Extensive experiments on four datasets spanning three different types demonstrate that our MDSA-UNet, without using pre-training, significantly outperforms other non-pretrained methods and even competes with pre-trained models, achieving Dice scores of 82.10% on DDTI, 80.20% on TN3K, 90.75% on ISIC2018, and 91.05% on ACDC. Meanwhile, our model maintains lower complexity, with only 6.65 M parameters and 4.54 G FLOPs at a resolution of 224 × 224, ensuring both effectiveness and efficiency. Code is available at https://github.com/NEU-LX/MDSA-UNet.
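The coarse-to-fine filtering described in the abstract (drop irrelevant key-value pairs at a coarse level, then run fine-grained SA on what remains) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sparse_attention`, the parameters `region_size` and `top_k`, the use of region-mean descriptors, and the top-k selection rule are all assumptions made for illustration, the tokens are treated as a 1-D sequence, and the multi-scale aggregation step is omitted.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_attention(q, k, v, region_size, top_k):
    """Coarse-to-fine attention over a 1-D token sequence (illustrative only).

    q, k, v: lists of float vectors, one per token.
    region_size: tokens per region (hypothetical; must divide len(q)).
    top_k: key regions kept per query region (hypothetical).
    """
    n, d, dv = len(q), len(q[0]), len(v[0])
    assert n % region_size == 0, "region_size must divide the token count"
    regions = [range(s, s + region_size) for s in range(0, n, region_size)]

    # Coarse level: one mean descriptor per region for queries and keys.
    q_coarse = [mean([q[i] for i in r]) for r in regions]
    k_coarse = [mean([k[i] for i in r]) for r in regions]

    out = [None] * n
    for ri, r in enumerate(regions):
        # Score every key region against this query region; keep the best top_k.
        affinity = [dot(q_coarse[ri], kc) for kc in k_coarse]
        kept = sorted(range(len(regions)), key=lambda j: affinity[j],
                      reverse=True)[:top_k]
        kept_tokens = [i for j in sorted(kept) for i in regions[j]]

        # Fine level: ordinary scaled dot-product attention, restricted to
        # the key-value pairs that survived the coarse filter.
        for i in r:
            weights = softmax([dot(q[i], k[t]) / math.sqrt(d)
                               for t in kept_tokens])
            out[i] = [sum(w * v[t][c] for w, t in zip(weights, kept_tokens))
                      for c in range(dv)]
    return out
```

In this sketch each query attends to `top_k * region_size` keys instead of all `n`, which is where the savings would come from. Setting `top_k` to the number of regions recovers dense attention.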
Code: https://github.com/NEU-LX/MDSA-UNet
Source journal
IEEE Journal of Biomedical and Health Informatics
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 13.60
Self-citation rate: 6.50%
Articles published: 1151
About the journal: IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.