Supervised contrastive learning with multi-scale interaction and integrity learning for salient object detection
Authors: Yu Bi, Zhenxue Chen, Chengyun Liu, Tian Liang, Fei Zheng
Journal: Machine Vision and Applications (impact factor 2.4; JCR Q3, Computer Science, Artificial Intelligence)
Publication type: Journal Article
Publication date: 2024-05-29
DOI: 10.1007/s00138-024-01552-0 (https://doi.org/10.1007/s00138-024-01552-0)
Citations: 0
Abstract
Salient object detection (SOD) aims to mimic human visual mechanisms by identifying and segmenting the most salient part of an image. Although prior work has made great progress in SOD, existing methods remain limited when faced with interference from non-salient objects, finely shaped objects, and co-salient objects. To improve the effectiveness and capability of SOD, we propose SCLNet, a supervised contrastive learning network with multi-scale interaction and integrity learning. It adopts contrastive learning (CL), multi-reception field confusion (MRFC), and context enhancement (CE) mechanisms. In this method, the input image first undergoes two different data augmentations, producing two branches. Unlike existing models, which focus more on boundary guidance, we add a random position mask to one branch to break the continuity of objects. Through the CL module, the network learns invariance across the different data augmentations and thereby captures more semantic information than appearance information. The MRFC module is then designed to learn, layer by layer, the internal connections and common influences of features from various reception fields. Next, the CE module processes the resulting features to learn the integrity and continuity of salient objects. Finally, comprehensive evaluations on five challenging benchmark datasets show that SCLNet achieves superior results. Code is available at https://github.com/YuPangpangpang/SCLNet.
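The two-branch idea in the abstract (one branch carrying a random position mask, then learning invariance between the branches) can be illustrated with a minimal sketch. This is not the SCLNet implementation. The helper names `random_position_mask` and `info_nce_loss`, the rectangular mask shape, the temperature value, and the use of a generic InfoNCE-style loss in place of the paper's supervised contrastive loss are all assumptions made for illustration. Plain Python lists stand in for image tensors and feature embeddings.

```python
import math
import random


def random_position_mask(image, mask_h, mask_w, rng):
    """Return a copy of `image` with one randomly placed rectangle zeroed out.

    Hypothetical helper: `image` is a list of rows of floats, and the mask
    shape (a single mask_h x mask_w rectangle) is an assumption.
    """
    height, width = len(image), len(image[0])
    top = rng.randrange(height - mask_h + 1)
    left = rng.randrange(width - mask_w + 1)
    masked = [row[:] for row in image]  # copy so the input stays untouched
    for r in range(top, top + mask_h):
        for c in range(left, left + mask_w):
            masked[r][c] = 0.0
    return masked


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(view_a, view_b, temperature=0.5):
    """Generic InfoNCE-style loss between two lists of embeddings.

    Embedding i in view_a is pulled toward embedding i in view_b and pushed
    away from every other embedding in view_b. This is a stand-in for the
    paper's supervised contrastive loss, not a reproduction of it.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine_similarity(anchor, other) / temperature
                  for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Usage: mask one branch, then compare matched and mismatched embeddings.
rng = random.Random(0)
image = [[1.0] * 8 for _ in range(8)]
masked = random_position_mask(image, 3, 3, rng)

aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce_loss(aligned, aligned)
loss_mismatched = info_nce_loss(aligned, swapped)
```

In this toy setup the loss is lower when the two views agree sample by sample than when they are swapped, which is the invariance signal a contrastive objective rewards. A real pipeline would compute the embeddings with an encoder applied to each augmented branch.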
About the journal:
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.