To address the poor generalization caused by the scarcity of real samples in image tampering localization, this paper proposes a Local Pixel-level Contrastive Learning Network (LPCLNet). The main contributions are: (1) a contour patch-oriented contrastive learning mechanism that categorizes patches into tampered, authentic, and contour classes, applying pixel-level and patch-level contrastive losses alongside binary cross-entropy loss to exploit boundary information and reduce dependence on synthetic data; (2) an LPCLNet architecture that integrates a multi-scale feature fusion module and an Atrous Spatial Pyramid Pooling module to aggregate fine-grained features and embed contextual information for multi-scale representation of tampered regions; (3) a joint optimization strategy combining InfoNCE contrastive loss with binary cross-entropy loss to enhance feature discriminability and localization accuracy. Experiments on the Columbia, NIST16, CASIA v1, and Coverage datasets demonstrate that LPCLNet achieves performance comparable or superior to mainstream methods without requiring synthetic data pre-training. Specifically, it attains leading F1 scores of 0.529 and 0.369 on CASIA v1 and NIST16, respectively, as well as the highest average IoU of 0.500 and AUC of 0.830 across the benchmarks, confirming stable and highly generalizable performance under limited real samples.
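To make the joint optimization strategy in contribution (3) concrete, the sketch below shows one plausible way to combine a pixel-level InfoNCE loss with binary cross-entropy, assuming a PyTorch setting. It is not the authors' implementation: the function names (`pixel_infonce`, `joint_loss`), the temperature of 0.1, and the weighting factor `lambda_con` are all illustrative assumptions; only the loss structure (InfoNCE over same-class pixel pairs plus BCE on the predicted mask) follows the abstract.

```python
# Minimal sketch (not the paper's code) of the joint objective described
# in the abstract: InfoNCE over sampled pixel embeddings + BCE on the
# localization map. Hyperparameters and names here are assumptions.
import torch
import torch.nn.functional as F

def pixel_infonce(embeddings, labels, temperature=0.1):
    """InfoNCE over pixel embeddings: pixels sharing a class label
    (tampered vs. authentic) act as positives, all others as negatives.

    embeddings: (N, D) pixel features sampled from the feature map
    labels:     (N,) binary class per sampled pixel (1 = tampered)
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                       # (N, N) similarities
    # Exclude self-similarity so a pixel is never its own positive.
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # Row-wise log-softmax; average the log-probability of the positives.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = (log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -pos_log_prob.mean()

def joint_loss(logits, target_mask, embeddings, labels, lambda_con=0.5):
    """BCE on the predicted tampering mask plus weighted InfoNCE, as the
    abstract's joint optimization suggests (the weighting is an assumption)."""
    bce = F.binary_cross_entropy_with_logits(logits, target_mask)
    con = pixel_infonce(embeddings, labels)
    return bce + lambda_con * con
```

In the mechanism the abstract describes, the sampled pixel set would presumably be drawn from the tampered, authentic, and contour patch classes, so that contour pixels supply hard boundary-adjacent pairs; how those patches are sampled and weighted is left to the paper itself.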