CoNet: A Consistency-Oriented Network for Camouflaged Object Segmentation
Fei Wu; Jun Yin; Xiaochuan Li; Jianfeng Wu; Da Jin; Jiamin Yang
IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 1, pp. 287-299
DOI: 10.1109/TCSVT.2024.3462465
Published: 2024-09-17
URL: https://ieeexplore.ieee.org/document/10681598/
Citations: 0
Abstract
Camouflaged object segmentation (COS) is a recently emerging task with broad application prospects. The coloration and texture similarities between objects and their surroundings make it a challenging task. Motivated by this, we propose a consistency-oriented network (CoNet) to address these challenges by examining the visual consistency between object and background. Specifically, we design a primary detection module (PDM) that first locates the object by fusing the backbone features. A filter is then introduced to better focus on the object’s foreground feature based on its primary location. To capture the visual consistency between the object and background, the foreground feature is fed into the consistency evaluation module (CEM) to interact with the global feature. Both features are processed simultaneously by a shared discriminator and then fused to obtain the consistency attention map. The final feature refinement is conducted in the detail refinement module (DRM), which merges the consistency attention map with the global features via hierarchical feature fusion. Extensive experiments on benchmark COS datasets show that the proposed CoNet outperforms state-of-the-art (SOTA) models in most cases. Ablation experiments verify the effectiveness of different backbones, the designed modules, and the upsampling methods. Furthermore, additional studies on labelling techniques and interdisciplinary applications demonstrate the great potential of the proposed CoNet.
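The abstract outlines a three-stage flow: the PDM fuses backbone features into a coarse object location, a filter emphasises the foreground, the CEM passes foreground and global features through a shared discriminator to produce a consistency attention map, and the DRM merges that map back into the global features. The paper does not specify the operators, so the NumPy sketch below is purely illustrative: the feature shapes, the averaging-based fusion, the single shared weight matrix, and the sigmoid-based attention are all assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C = 16, 16, 8  # toy feature-map size (assumption, not from the paper)
# Multi-level backbone features, here three levels at the same resolution.
backbone = [rng.normal(size=(H, W, C)) for _ in range(3)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# PDM (sketch): fuse backbone levels into a global feature and a coarse map.
global_feat = np.mean(backbone, axis=0)            # (H, W, C)
primary_map = sigmoid(global_feat.mean(axis=-1))   # coarse object location (H, W)

# Filter (sketch): emphasise the foreground feature via the primary location.
fore_feat = global_feat * primary_map[..., None]

# CEM (sketch): a "shared discriminator" modelled as one weight matrix applied
# to both streams; the projections are fused into a consistency attention map.
W_shared = rng.normal(size=(C, C))
fore_proj = fore_feat @ W_shared                   # (H, W, C)
glob_proj = global_feat @ W_shared                 # (H, W, C)
consistency_attn = sigmoid((fore_proj * glob_proj).sum(axis=-1))  # (H, W)

# DRM (sketch): refine by merging the attention map with the global feature
# (a residual merge stands in for the paper's hierarchical feature fusion).
refined = global_feat * consistency_attn[..., None] + global_feat
prediction = sigmoid(refined.mean(axis=-1))        # segmentation map in (0, 1)
```

The point of the sketch is only the data flow: one shared transform sees both the foreground and the global stream, and their agreement (the element-wise product) becomes the attention that gates the final refinement.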
Journal overview:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.