Title: Asymmetric patch sampling for contrastive learning
Journal: Pattern Recognition (Q1, Computer Science, Artificial Intelligence; Impact Factor 7.5)
DOI: 10.1016/j.patcog.2024.111012
Publication date: 2024-09-13
Article link: https://www.sciencedirect.com/science/article/pii/S0031320324007635
Citations: 0
Abstract
Asymmetric appearance between the two views of a positive pair effectively reduces the risk of representation degradation in contrastive learning. However, the positive pairs constructed by existing methods still share substantial appearance similarities, which inhibits further representation improvement. To address this issue, we propose a novel asymmetric patch sampling strategy, which significantly reduces appearance similarity while retaining image semantics. Specifically, two patch sampling strategies are applied to the given image. First, sparse patch sampling is conducted to obtain the first view, which reduces the spatial redundancy of the image and allows a more asymmetric view. Second, selective patch sampling is proposed to construct another view with a large appearance discrepancy relative to the first one. Because the appearance similarity between the views of a positive pair is negligible, the trained model is encouraged to capture semantic similarities instead of low-level ones.
Experimental results demonstrate that our method significantly outperforms existing self-supervised learning methods on the ImageNet-1K and CIFAR datasets, e.g., a 2.5% fine-tuning accuracy improvement on CIFAR100. Furthermore, our method achieves state-of-the-art performance on the downstream tasks of object detection and instance segmentation on COCO. Additionally, compared to other self-supervised methods, our method is more efficient in both memory and computation during pretraining. The source code and trained weights are available at https://github.com/visresearch/aps.
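The core idea of the abstract can be sketched as follows: split an image into a patch grid, draw a sparse random subset of patch indices for the first view, and draw the second view's patches only from the complement, so the two views share no patches and thus little low-level appearance. This is an illustrative sketch of the sampling logic only, not the paper's exact algorithm; the function name, the sampling ratios, and the disjointness assumption for the second view are our own simplifications.

```python
import numpy as np

def asymmetric_patch_sample(num_patches, sparse_ratio=0.25, second_ratio=0.25, rng=None):
    """Sketch of asymmetric patch sampling for two contrastive views.

    View 1: a sparse random subset of patch indices (reduces spatial
    redundancy). View 2: patches drawn only from the complement of
    view 1, giving the pair a large appearance discrepancy. The 0.25
    ratios are assumed defaults, not the paper's tuned values.
    """
    rng = np.random.default_rng(rng)
    all_idx = np.arange(num_patches)
    n1 = max(1, int(num_patches * sparse_ratio))
    view1 = rng.choice(all_idx, size=n1, replace=False)
    # Restrict the second view to patches unused by the first view.
    remaining = np.setdiff1d(all_idx, view1)
    n2 = max(1, min(len(remaining), int(num_patches * second_ratio)))
    view2 = rng.choice(remaining, size=n2, replace=False)
    return view1, view2

# Example: a 14x14 patch grid (196 patches), as in ViT-style encoders.
v1, v2 = asymmetric_patch_sample(196, rng=0)
assert np.intersect1d(v1, v2).size == 0  # the two views share no patches
```

Because the views are disjoint at the patch level, any similarity the encoder learns between them must come from shared semantics rather than shared pixels, which is the motivation stated in the abstract.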
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.