Swin-UMamba†: Adapting Mamba-Based Vision Foundation Models for Medical Image Segmentation

Jiarun Liu, Hao Yang, Hong-Yu Zhou, Lequan Yu, Yong Liang, Yizhou Yu, Shaoting Zhang, Hairong Zheng, Shanshan Wang
IEEE Transactions on Medical Imaging, vol. 44, no. 10, pp. 3898-3908
DOI: 10.1109/TMI.2024.3508698
Published online: 2024-11-28
Available at: https://ieeexplore.ieee.org/document/10771659/

Abstract

Vision foundation models have shown great potential for improving generalizability and data efficiency, especially in medical image segmentation, where datasets are relatively small due to high annotation costs and privacy concerns. However, current research on foundation models predominantly relies on transformers, whose quadratic complexity and large parameter counts make them computationally expensive and limit their potential for clinical applications. In this work, we introduce Swin-UMamba†, a novel Mamba-based model for medical image segmentation that leverages the power of vision foundation models while remaining computationally efficient thanks to Mamba's linear complexity. Moreover, we investigated and verified the impact of the vision foundation model on medical image segmentation, designing a self-supervised model adaptation scheme to bridge the gap between natural and medical data. Notably, Swin-UMamba† outperforms seven state-of-the-art methods, including CNN-based, transformer-based, and Mamba-based approaches, across the AbdomenMRI, Endoscopy, and Microscopy datasets. The code and models are publicly available at: https://github.com/JiarunLiu/Swin-UMamba.
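The efficiency claim above rests on the core Mamba recurrence: a state-space scan whose cost grows linearly with sequence length, unlike the quadratic cost of self-attention. The following is a minimal illustrative sketch of such a selective-scan recurrence (h_t = A_t ⊙ h_{t−1} + B_t·x_t, y_t = C_t·h_t), not the paper's implementation; all names and shapes here are assumptions chosen for clarity.

```python
import numpy as np

def selective_scan(x, A, B, C):
    """Illustrative 1-D selective scan (state-space recurrence).

    h_t = A[t] * h_{t-1} + B[t] * x[t]   (input-dependent state update)
    y_t = C[t] . h_t                     (readout)

    Runs in O(L) for sequence length L, versus O(L^2) for
    pairwise self-attention over the same sequence.
    """
    L = x.shape[0]          # sequence length
    d = A.shape[1]          # hidden state dimension
    h = np.zeros(d)
    y = np.empty(L)
    for t in range(L):
        h = A[t] * h + B[t] * x[t]  # elementwise gated state transition
        y[t] = C[t] @ h             # project state to a scalar output
    return y

# Toy usage: a sequence of 6 scalar tokens, state dimension 4.
rng = np.random.default_rng(0)
L, d = 6, 4
y = selective_scan(
    rng.standard_normal(L),          # inputs x_1..x_L
    rng.uniform(0.5, 0.9, (L, d)),   # per-step decay A_t (stable: |A|<1)
    rng.standard_normal((L, d)),     # input projections B_t
    rng.standard_normal((L, d)),     # output projections C_t
)
print(y.shape)
```

Because each step touches only the fixed-size state h, doubling the sequence length doubles the work, which is the property that makes Mamba-based backbones attractive for high-resolution medical images.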