{"title":"SAM-SP:自我提示让 SAM 再次伟大","authors":"Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang","doi":"arxiv-2408.12364","DOIUrl":null,"url":null,"abstract":"The recently introduced Segment Anything Model (SAM), a Visual Foundation\nModel (VFM), has demonstrated impressive capabilities in zero-shot segmentation\ntasks across diverse natural image datasets. Despite its success, SAM\nencounters noticeably performance degradation when applied to specific domains,\nsuch as medical images. Current efforts to address this issue have involved\nfine-tuning strategies, intended to bolster the generalizability of the vanilla\nSAM. However, these approaches still predominantly necessitate the utilization\nof domain specific expert-level prompts during the evaluation phase, which\nseverely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting based\nfine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM\nmodel. Specifically, SAM-SP leverages the output from the previous iteration of\nthe model itself as prompts to guide subsequent iteration of the model. This\nself-prompting module endeavors to learn how to generate useful prompts\nautonomously and alleviates the dependence on expert prompts during the\nevaluation phase, significantly broadening SAM's applicability. Additionally,\nwe integrate a self-distillation module to enhance the self-prompting process\nfurther. Extensive experiments across various domain specific datasets validate\nthe effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the\nreliance on expert prompts but also exhibits superior segmentation performance\ncomparing to the state-of-the-art task-specific segmentation approaches, the\nvanilla SAM, and SAM-based approaches.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"107 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SAM-SP: Self-Prompting Makes SAM Great Again\",\"authors\":\"Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang\",\"doi\":\"arxiv-2408.12364\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The recently introduced Segment Anything Model (SAM), a Visual Foundation\\nModel (VFM), has demonstrated impressive capabilities in zero-shot segmentation\\ntasks across diverse natural image datasets. Despite its success, SAM\\nencounters noticeably performance degradation when applied to specific domains,\\nsuch as medical images. Current efforts to address this issue have involved\\nfine-tuning strategies, intended to bolster the generalizability of the vanilla\\nSAM. However, these approaches still predominantly necessitate the utilization\\nof domain specific expert-level prompts during the evaluation phase, which\\nseverely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting based\\nfine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM\\nmodel. Specifically, SAM-SP leverages the output from the previous iteration of\\nthe model itself as prompts to guide subsequent iteration of the model. This\\nself-prompting module endeavors to learn how to generate useful prompts\\nautonomously and alleviates the dependence on expert prompts during the\\nevaluation phase, significantly broadening SAM's applicability. 
Additionally,\\nwe integrate a self-distillation module to enhance the self-prompting process\\nfurther. Extensive experiments across various domain specific datasets validate\\nthe effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the\\nreliance on expert prompts but also exhibits superior segmentation performance\\ncomparing to the state-of-the-art task-specific segmentation approaches, the\\nvanilla SAM, and SAM-based approaches.\",\"PeriodicalId\":501168,\"journal\":{\"name\":\"arXiv - CS - Emerging Technologies\",\"volume\":\"107 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Emerging Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.12364\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Emerging Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.12364","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
The recently introduced Segment Anything Model (SAM), a Visual Foundation Model (VFM), has demonstrated impressive zero-shot segmentation capabilities across diverse natural image datasets. Despite this success, SAM suffers noticeable performance degradation when applied to specific domains such as medical images. Current efforts to address this issue rely on fine-tuning strategies intended to bolster the generalizability of the vanilla SAM. However, these approaches still largely require domain-specific, expert-level prompts during the evaluation phase, which severely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting-based fine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM model. Specifically, SAM-SP uses the output of the model's previous iteration as prompts to guide its subsequent iteration. This self-prompting module learns to generate useful prompts autonomously, alleviating the dependence on expert prompts during evaluation and significantly broadening SAM's applicability. Additionally, we integrate a self-distillation module to further enhance the self-prompting process. Extensive experiments across various domain-specific datasets validate the effectiveness of the proposed SAM-SP. SAM-SP not only alleviates the reliance on expert prompts but also achieves superior segmentation performance compared to state-of-the-art task-specific segmentation approaches, the vanilla SAM, and SAM-based approaches.
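The abstract describes two components: a self-prompting module that turns the model's own previous prediction into a prompt for a second forward pass, and a self-distillation module layered on top. The toy PyTorch sketch below illustrates one plausible way such a training loop could be wired together. It is a minimal sketch under stated assumptions: ToySegmenter, SelfPromptModule, training_step, the dense prompt map, the loss weighting, and the direction of the distillation term are all illustrative choices, not the authors' actual architecture or code.

```python
# Hypothetical sketch of the self-prompting idea: a first pass runs without any
# expert prompt, a small learned module converts that coarse prediction into a
# prompt, and a second pass refines the mask. Shapes and module names are toy
# stand-ins, not SAM or the SAM-SP implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySegmenter(nn.Module):
    """Stand-in for a promptable segmenter (e.g., a SAM-like mask decoder)."""
    def __init__(self, channels=16):
        super().__init__()
        self.backbone = nn.Conv2d(3, channels, 3, padding=1)
        self.head = nn.Conv2d(channels + 1, 1, 3, padding=1)  # +1 channel for the prompt map

    def forward(self, image, prompt_map):
        feats = F.relu(self.backbone(image))
        return self.head(torch.cat([feats, prompt_map], dim=1))  # mask logits

class SelfPromptModule(nn.Module):
    """Learns to turn the previous prediction into a dense prompt map."""
    def __init__(self):
        super().__init__()
        self.refine = nn.Conv2d(1, 1, 3, padding=1)

    def forward(self, prev_logits):
        return self.refine(torch.sigmoid(prev_logits))

def training_step(segmenter, self_prompter, image, gt_mask, distill_weight=0.5):
    # Pass 1: no expert prompt -- use an empty (all-zero) prompt map.
    empty_prompt = torch.zeros_like(gt_mask)
    coarse_logits = segmenter(image, empty_prompt)

    # Pass 2: prompt generated from the model's own previous output.
    prompt_map = self_prompter(coarse_logits.detach())
    refined_logits = segmenter(image, prompt_map)

    # Supervise both passes against the ground-truth mask.
    seg_loss = (F.binary_cross_entropy_with_logits(coarse_logits, gt_mask)
                + F.binary_cross_entropy_with_logits(refined_logits, gt_mask))

    # Self-distillation (assumed direction): push the un-prompted pass toward
    # the refined, self-prompted prediction used as a frozen soft target.
    distill_loss = F.mse_loss(torch.sigmoid(coarse_logits),
                              torch.sigmoid(refined_logits).detach())
    return seg_loss + distill_weight * distill_loss

if __name__ == "__main__":
    seg, sp = ToySegmenter(), SelfPromptModule()
    img = torch.randn(2, 3, 64, 64)
    gt = (torch.rand(2, 1, 64, 64) > 0.5).float()
    loss = training_step(seg, sp, img, gt)
    loss.backward()
    print(f"toy loss: {loss.item():.4f}")
```

At inference time the same two-pass loop would run with no user-supplied prompt: the first pass sees only the empty prompt map and the second pass consumes the self-generated one, which is the property the abstract emphasizes when it says SAM-SP alleviates the reliance on expert prompts during evaluation.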