Qian Ye, Yaqin Zhou, Guanying Huo, Yan Liu, Yan Zhou, Qingwu Li
Reverse cross-refinement network for camouflaged object detection
Image and Vision Computing, volume 150, Article 105218. Journal article, published 2024-08-17.
DOI: 10.1016/j.imavis.2024.105218
https://www.sciencedirect.com/science/article/pii/S0262885624003238
Citations: 0
Abstract
Because camouflaged objects are intrinsically similar to their background, they often exhibit blurred boundaries that are difficult to delineate. Existing methods focus on overall regional accuracy rather than boundary quality, and they struggle to separate camouflaged objects from the background in complex scenes. We therefore propose a reverse cross-refinement network called RCR-Net. Specifically, we design a diverse feature enhancement module that simulates the correspondingly expanded receptive fields of the human visual system by applying convolutional kernels with different dilation rates in parallel. A boundary attention module is used to reduce noise in the bottom features. In addition, a multi-scale feature aggregation module transmits the diverse features from pixel-level camouflaged edges to the entire camouflaged object region in a coarse-to-fine manner; it consists of reverse guidance, group guidance, and position guidance. Reverse guidance mines complementary regions and details by erasing object regions that have already been estimated. Group guidance and position guidance integrate different features through simple and effective splitting and connecting operations. Extensive experiments show that RCR-Net outperforms 18 existing state-of-the-art methods on four widely used camouflaged object detection (COD) datasets. In particular, compared with the existing top-1 model HitNet, RCR-Net improves Mean Absolute Error by approximately 16.4% on the CAMO dataset, indicating that RCR-Net can detect camouflaged objects accurately.
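Three ideas named in the abstract can be illustrated with a minimal, standard-library Python sketch: parallel dilated convolutions, reverse guidance, and the Mean Absolute Error metric. This is not the authors' implementation. The function names, the 1-D toy convolution, and the `1 - sigmoid(logit)` weighting (a common reverse-attention formulation) are all assumptions, since the abstract does not give the exact equations. Reading the reported improvement as a relative reduction in MAE is also an assumption.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 1-D convolution whose kernel taps are spaced `dilation` apart."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # taps outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def parallel_dilated_branches(signal, kernel, rates):
    """Apply the same kernel at several dilation rates; one output per branch."""
    return [dilated_conv1d(signal, kernel, d) for d in rates]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def reverse_guidance(features, coarse_logits):
    """Assumed form: scale features by 1 - sigmoid(logit), which suppresses
    regions the coarse prediction already marks as object."""
    return [
        [f * (1.0 - sigmoid(z)) for f, z in zip(f_row, z_row)]
        for f_row, z_row in zip(features, coarse_logits)
    ]


def mean_absolute_error(pred, gt):
    """Average per-pixel absolute difference between prediction and ground truth."""
    diffs = [abs(p - g) for p_row, g_row in zip(pred, gt) for p, g in zip(p_row, g_row)]
    return sum(diffs) / len(diffs)


def relative_mae_reduction(baseline_mae, new_mae):
    """Fractional drop in MAE relative to a baseline (lower MAE is better)."""
    return (baseline_mae - new_mae) / baseline_mae
```

With a unit impulse as input, a three-tap kernel at dilation 1 responds at three adjacent positions, while at dilation 2 the same kernel responds at positions spaced two apart. This is how a larger dilation rate widens the receptive field without adding weights. In `reverse_guidance`, a large positive coarse logit drives the weight toward zero, so later stages attend to what the coarse map missed.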
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.