Recent advances in the Segment Anything Model (SAM) have demonstrated remarkable zero-shot segmentation and interactive editing capabilities in general computer vision. However, adapting SAM to medical imaging remains challenging due to substantial gaps in imaging physics, contrast distributions, and structural priors between natural and medical images. Achieving optimal performance typically requires extensive fine-tuning on large-scale medical datasets and high-quality manual prompts. To address these limitations, we propose CCF-SAM, a parameter-efficient adaptation framework for medical segmentation. With the SAM image encoder frozen, CCF-SAM constructs a coarse-to-fine, two-stage pipeline: a standard decoder first generates a coarse prior mask, which is then refined by a Contrastive Prototype Refiner that softly disentangles foreground/background tokens and enhances their discriminability via token-level contrastive learning. An EMA-based prototype memory accumulates stable semantic anchors across images, and a cross-attention re-embedding module injects the enhanced prototypes into spatial features to drive fine-grained decoding. The framework trains only the prompt encoder, the mask decoder, and the newly added lightweight modules, substantially reducing training costs while ensuring reproducibility. Comprehensive evaluations and ablations on multiple representative 2D medical segmentation benchmarks show that CCF-SAM consistently outperforms classical CNN/Transformer baselines and recent SAM-based approaches. Qualitative results further indicate superior recall and boundary consistency on small or low-contrast lesions, validating a prototype-guided, progressive refinement paradigm for adapting SAM to medical imaging. Code is publicly available at https://github.com/KKKKAAAAIIII/CCF-SAM.
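The three lightweight components named above (the EMA-based prototype memory, the token-level contrastive objective, and the cross-attention re-embedding) can be illustrated concretely. Below is a minimal PyTorch sketch of how such pieces might look, assuming token embeddings of shape (N, C) and a (K, C) prototype bank; the function and class names, the momentum and temperature values, and the residual design are illustrative assumptions, not the released CCF-SAM implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def update_prototype_memory(memory, batch_protos, momentum=0.9):
    # EMA update of the prototype memory bank; both tensors are (K, C).
    # Accumulates stable semantic anchors across images/iterations.
    return momentum * memory + (1.0 - momentum) * batch_protos


def token_contrastive_loss(tokens, labels, temperature=0.07):
    # Token-level InfoNCE-style loss: tokens (N, C), labels (N,) with
    # 0 = background and 1 = foreground; same-label tokens are positives.
    tokens = F.normalize(tokens, dim=-1)
    sim = tokens @ tokens.t() / temperature                     # (N, N) similarities
    eye = torch.eye(len(tokens), dtype=torch.bool, device=tokens.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye   # positive pairs
    logits = sim.masked_fill(eye, float("-inf"))                # drop self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(eye, 0.0)                   # avoid -inf * 0 = nan
    pos_f = pos.float()
    has_pos = pos.any(dim=1)                                    # anchors with >=1 positive
    loss = -(log_prob * pos_f).sum(dim=1)[has_pos] / pos_f.sum(dim=1)[has_pos]
    return loss.mean()


class CrossAttentionReEmbed(nn.Module):
    # Injects prototype tokens back into spatial features via cross-attention.
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, feats, prototypes):
        # feats: (B, HW, C) flattened spatial tokens; prototypes: (B, K, C).
        out, _ = self.attn(query=feats, key=prototypes, value=prototypes)
        return feats + out  # residual re-embedding of the enhanced prototypes
```

In the paradigm the abstract describes, the contrastive signal would be computed on the refiner's soft foreground/background token assignments derived from the coarse prior mask, and the EMA bank would persist across training iterations; the exact pooling, masking, and loss weighting are in the repository linked above.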