SAM-SP: Self-Prompting Makes SAM Great Again

Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang
{"title":"SAM-SP:自我提示让 SAM 再次伟大","authors":"Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang","doi":"arxiv-2408.12364","DOIUrl":null,"url":null,"abstract":"The recently introduced Segment Anything Model (SAM), a Visual Foundation\nModel (VFM), has demonstrated impressive capabilities in zero-shot segmentation\ntasks across diverse natural image datasets. Despite its success, SAM\nencounters noticeably performance degradation when applied to specific domains,\nsuch as medical images. Current efforts to address this issue have involved\nfine-tuning strategies, intended to bolster the generalizability of the vanilla\nSAM. However, these approaches still predominantly necessitate the utilization\nof domain specific expert-level prompts during the evaluation phase, which\nseverely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting based\nfine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM\nmodel. Specifically, SAM-SP leverages the output from the previous iteration of\nthe model itself as prompts to guide subsequent iteration of the model. This\nself-prompting module endeavors to learn how to generate useful prompts\nautonomously and alleviates the dependence on expert prompts during the\nevaluation phase, significantly broadening SAM's applicability. Additionally,\nwe integrate a self-distillation module to enhance the self-prompting process\nfurther. Extensive experiments across various domain specific datasets validate\nthe effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the\nreliance on expert prompts but also exhibits superior segmentation performance\ncomparing to the state-of-the-art task-specific segmentation approaches, the\nvanilla SAM, and SAM-based approaches.","PeriodicalId":501168,"journal":{"name":"arXiv - CS - Emerging Technologies","volume":"107 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SAM-SP: Self-Prompting Makes SAM Great Again\",\"authors\":\"Chunpeng Zhou, Kangjie Ning, Qianqian Shen, Sheng Zhou, Zhi Yu, Haishuai Wang\",\"doi\":\"arxiv-2408.12364\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The recently introduced Segment Anything Model (SAM), a Visual Foundation\\nModel (VFM), has demonstrated impressive capabilities in zero-shot segmentation\\ntasks across diverse natural image datasets. Despite its success, SAM\\nencounters noticeably performance degradation when applied to specific domains,\\nsuch as medical images. Current efforts to address this issue have involved\\nfine-tuning strategies, intended to bolster the generalizability of the vanilla\\nSAM. However, these approaches still predominantly necessitate the utilization\\nof domain specific expert-level prompts during the evaluation phase, which\\nseverely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting based\\nfine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM\\nmodel. Specifically, SAM-SP leverages the output from the previous iteration of\\nthe model itself as prompts to guide subsequent iteration of the model. This\\nself-prompting module endeavors to learn how to generate useful prompts\\nautonomously and alleviates the dependence on expert prompts during the\\nevaluation phase, significantly broadening SAM's applicability. 
Additionally,\\nwe integrate a self-distillation module to enhance the self-prompting process\\nfurther. Extensive experiments across various domain specific datasets validate\\nthe effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the\\nreliance on expert prompts but also exhibits superior segmentation performance\\ncomparing to the state-of-the-art task-specific segmentation approaches, the\\nvanilla SAM, and SAM-based approaches.\",\"PeriodicalId\":501168,\"journal\":{\"name\":\"arXiv - CS - Emerging Technologies\",\"volume\":\"107 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Emerging Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.12364\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Emerging Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.12364","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The recently introduced Segment Anything Model (SAM), a Visual Foundation Model (VFM), has demonstrated impressive capabilities in zero-shot segmentation tasks across diverse natural image datasets. Despite its success, SAM suffers noticeable performance degradation when applied to specific domains, such as medical images. Current efforts to address this issue involve fine-tuning strategies intended to bolster the generalizability of the vanilla SAM. However, these approaches still largely require domain-specific, expert-level prompts during the evaluation phase, which severely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting-based fine-tuning approach, called SAM-SP, tailored for extending the vanilla SAM model. Specifically, SAM-SP leverages the output of the previous iteration of the model itself as prompts to guide the subsequent iteration. This self-prompting module learns to generate useful prompts autonomously and alleviates the dependence on expert prompts during the evaluation phase, significantly broadening SAM's applicability. Additionally, we integrate a self-distillation module to further enhance the self-prompting process. Extensive experiments across various domain-specific datasets validate the effectiveness of the proposed SAM-SP. Our SAM-SP not only alleviates the reliance on expert prompts but also exhibits superior segmentation performance compared with state-of-the-art task-specific segmentation approaches, the vanilla SAM, and SAM-based approaches.
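
The abstract describes a two-pass training loop: a first pass produces a mask without any expert prompt, that mask is fed back as the prompt for a second pass, and a self-distillation term ties the two passes together. The sketch below is a minimal, hedged illustration of that idea; the module names (TinySegmenter), the prompt encoding as an extra mask channel, and the loss weighting are assumptions made for illustration and are not the authors' implementation.

```python
# Minimal sketch of the self-prompting idea described in the abstract.
# TinySegmenter, the prompt-as-mask-channel encoding, and the loss terms
# are illustrative assumptions, not the SAM-SP code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySegmenter(nn.Module):
    """Stand-in for a SAM-like promptable segmenter (hypothetical)."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.encoder = nn.Conv2d(3, channels, 3, padding=1)       # image encoder stand-in
        self.decoder = nn.Conv2d(channels + 1, 1, 3, padding=1)   # mask decoder with one prompt channel

    def forward(self, image: torch.Tensor, prompt_mask: torch.Tensor) -> torch.Tensor:
        feats = F.relu(self.encoder(image))
        return self.decoder(torch.cat([feats, prompt_mask], dim=1))  # mask logits


def self_prompting_step(model: TinySegmenter, image: torch.Tensor, target: torch.Tensor):
    """One training step: pass 1 runs without an expert prompt, pass 2 is
    prompted by pass 1's own prediction, and a distillation-style consistency
    term (assumed form) links the two passes."""
    empty_prompt = torch.zeros_like(image[:, :1])          # no external prompt available

    logits_1 = model(image, empty_prompt)                   # iteration 1: unprompted prediction
    self_prompt = torch.sigmoid(logits_1).detach()          # reuse the model's own output as the prompt
    logits_2 = model(image, self_prompt)                    # iteration 2: self-prompted prediction

    seg_loss = (F.binary_cross_entropy_with_logits(logits_1, target)
                + F.binary_cross_entropy_with_logits(logits_2, target))
    distill_loss = F.mse_loss(logits_1, logits_2.detach())  # pull pass 1 toward the refined pass 2
    return seg_loss + 0.1 * distill_loss                    # 0.1 is an arbitrary illustrative weight


if __name__ == "__main__":
    model = TinySegmenter()
    image = torch.randn(2, 3, 64, 64)
    target = (torch.rand(2, 1, 64, 64) > 0.5).float()
    loss = self_prompting_step(model, image, target)
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```

At evaluation time the same loop runs without any user input: the unprompted first pass supplies the prompt for the second pass, which is what removes the need for expert prompts that the abstract emphasizes.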