Swin-UMamba†: Adapting Mamba-Based Vision Foundation Models for Medical Image Segmentation

Jiarun Liu, Hao Yang, Hong-Yu Zhou, Lequan Yu, Yong Liang, Yizhou Yu, Shaoting Zhang, Hairong Zheng, Shanshan Wang
IEEE Transactions on Medical Imaging, vol. 44, no. 10, pp. 3898-3908
DOI: 10.1109/TMI.2024.3508698
Published online: 2024-11-28
Available at: https://ieeexplore.ieee.org/document/10771659/

Abstract

Vision foundation models have shown great potential for improving generalizability and data efficiency, especially in medical image segmentation, where datasets are relatively small due to high annotation costs and privacy concerns. However, current research on foundation models predominantly relies on transformers, whose quadratic complexity and large parameter counts make them computationally expensive and limit their potential for clinical applications. In this work, we introduce Swin-UMamba†, a novel Mamba-based model for medical image segmentation that leverages the power of vision foundation models while remaining computationally efficient thanks to Mamba's linear complexity. Moreover, we investigated and verified the impact of the vision foundation model on medical image segmentation, designing a self-supervised model adaptation scheme to bridge the gap between natural and medical data. Notably, Swin-UMamba† outperforms seven state-of-the-art methods, including CNN-based, transformer-based, and Mamba-based approaches, across the AbdomenMRI, Endoscopy, and Microscopy datasets. The code and models are publicly available at: https://github.com/JiarunLiu/Swin-UMamba.
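The efficiency claim above rests on the core Mamba recurrence: a state-space scan whose cost grows linearly with sequence length, unlike the quadratic cost of self-attention. The following is a minimal illustrative sketch of such a selective-scan recurrence (h_t = A_t ⊙ h_{t−1} + B_t·x_t, y_t = C_t·h_t), not the paper's implementation; all names and shapes here are assumptions chosen for clarity.

```python
import numpy as np

def selective_scan(x, A, B, C):
    """Illustrative 1-D selective scan (state-space recurrence).

    h_t = A[t] * h_{t-1} + B[t] * x[t]   (input-dependent state update)
    y_t = C[t] . h_t                     (readout)

    Runs in O(L) for sequence length L, versus O(L^2) for
    pairwise self-attention over the same sequence.
    """
    L = x.shape[0]          # sequence length
    d = A.shape[1]          # hidden state dimension
    h = np.zeros(d)
    y = np.empty(L)
    for t in range(L):
        h = A[t] * h + B[t] * x[t]  # elementwise gated state transition
        y[t] = C[t] @ h             # project state to a scalar output
    return y

# Toy usage: a sequence of 6 scalar tokens, state dimension 4.
rng = np.random.default_rng(0)
L, d = 6, 4
y = selective_scan(
    rng.standard_normal(L),          # inputs x_1..x_L
    rng.uniform(0.5, 0.9, (L, d)),   # per-step decay A_t (stable: |A|<1)
    rng.standard_normal((L, d)),     # input projections B_t
    rng.standard_normal((L, d)),     # output projections C_t
)
print(y.shape)
```

Because each step touches only the fixed-size state h, doubling the sequence length doubles the work, which is the property that makes Mamba-based backbones attractive for high-resolution medical images.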