SwinSAM: Fine-grained polyp segmentation in colonoscopy images via segment anything model integrated with a Swin Transformer decoder

Biomedical Signal Processing and Control, Volume 100, Article 107055 (IF 4.9; CAS Zone 2, Medicine; JCR Q1, Engineering, Biomedical). Pub Date: 2024-11-15. DOI: 10.1016/j.bspc.2024.107055
Zhoushan Feng, Yuliang Zhang, Yanhong Chen, Yiyu Shi, Yu Liu, Wen Sun, Lili Du, Dunjin Chen

Abstract

Polyp segmentation in colonoscopy images is a critical step in the early detection and preventive management of colorectal cancer. To support diagnosis, segmentation must be highly precise and preserve fine-grained details, which may carry crucial information about the disease state. To meet the demand for more refined segmentation, this study introduces SwinSAM, a framework that pairs a SAM (Segment Anything Model) encoder with a Swin Transformer decoder. SAM was trained on over a billion masks across roughly 11 million images and has a strong capability for image comprehension. However, its training data consists primarily of natural images rather than medical ones. We therefore designed an adapter module to inject medical-domain information into SAM. Furthermore, because polyps vary in size and shape and blend strongly with the background, the simple convolutional decoder in the original SAM model struggles to segment their intricate details accurately. This prompted us to use the Swin Transformer as the decoder. Additionally, given the significant shape variation among polyps, we employed a multi-scale perception fusion module to process the deep features extracted by SAM: convolutions with different receptive fields extract information about polyps of various shapes. Finally, we optimized the network parameters through multi-level supervision. Comprehensive experiments on five commonly used polyp segmentation datasets show that the proposed method performs well across datasets with different polyp backgrounds.
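
The abstract does not say how the multi-scale perception fusion module builds its "convolutions with different receptive fields". One common way to do this, assumed here purely for illustration, is to run parallel 3x3 convolutions with different dilation rates and fuse the branches. The sketch below is not the authors' implementation: the function names, the dilation rates (1, 2, 3), the uniform 1/9 kernel weights (standing in for learned weights), and the averaging fusion are all hypothetical choices.

```python
from typing import List, Sequence

Grid = List[List[float]]


def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Side length of the input region spanned by one dilated kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


def dilated_mean_conv(feature: Grid, dilation: int) -> Grid:
    """3x3 convolution with uniform weights (1/9), zero padding, and the
    given dilation. Uniform weights are a stand-in for learned ones."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy * dilation, x + dx * dilation
                    # Taps that fall outside the map contribute zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        total += feature[yy][xx]
            out[y][x] = total / 9.0
    return out


def multi_scale_fuse(feature: Grid, dilations: Sequence[int] = (1, 2, 3)) -> Grid:
    """Run one branch per dilation rate and fuse by element-wise averaging
    (an assumed fusion rule; the abstract does not specify one)."""
    branches = [dilated_mean_conv(feature, d) for d in dilations]
    h, w = len(feature), len(feature[0])
    return [
        [sum(b[y][x] for b in branches) / len(branches) for x in range(w)]
        for y in range(h)
    ]


# Usage: a 7x7 map with a single activation at the centre.
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 1.0
fused = multi_scale_fuse(impulse)
sizes = [effective_kernel_size(3, d) for d in (1, 2, 3)]  # [3, 5, 7]
```

With a 3x3 kernel, dilations 1, 2 and 3 span 3x3, 5x5 and 7x7 regions, so the fused map responds to the centre activation even at the corner of the 7x7 grid, which the undilated branch alone could not reach.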
Source journal
Biomedical Signal Processing and Control (Engineering & Technology: Biomedical Engineering)
CiteScore: 9.80
Self-citation rate: 13.70%
Articles published: 822
Review time: 4 months
Journal description: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.