Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection

Yongri Piao, Wei Ji, Jingjing Li, Miao Zhang, Huchuan Lu

2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 7253-7262. DOI: 10.1109/ICCV.2019.00735
In this work, we propose a novel depth-induced multi-scale recurrent attention network for saliency detection that performs strongly, especially in complex scenarios. Our network makes three main contributions, each experimentally shown to have significant practical merit. First, we design an effective depth refinement block that uses residual connections to fully extract and fuse multi-level paired complementary cues from the RGB and depth streams. Second, depth cues, which carry abundant spatial information, are combined with multi-scale contextual features to accurately locate salient objects. Third, we boost our model's performance with a novel recurrent attention module inspired by the Internal Generative Mechanism of the human brain. This module produces more accurate saliency results by comprehensively learning the internal semantic relations of the fused features and progressively optimizing local details through memory-oriented scene understanding. In addition, we create a large-scale RGB-D dataset containing more complex scenarios, which supports more comprehensive evaluation of saliency models. Extensive experiments on six public datasets and our new dataset demonstrate that our method accurately identifies salient objects and consistently outperforms 16 state-of-the-art RGB and RGB-D approaches.
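The abstract describes two concrete mechanisms: a depth refinement block with residual connections that fuses paired RGB and depth features, and a recurrent attention module that progressively refines the fused feature using a memory of earlier steps. The PyTorch sketch below is a minimal illustration of how such components could look; all module names, channel sizes, fusion operators, and the number of recurrent steps are assumptions made for illustration, not the authors' published implementation.

```python
# Minimal sketches of the two mechanisms the abstract describes.
# Everything here is an illustrative assumption: module names, channel
# sizes, the fusion operator, and the number of recurrent steps are NOT
# taken from the authors' released code.
import torch
import torch.nn as nn


class DepthRefinementBlock(nn.Module):
    """Fuses paired RGB and depth features at one backbone level.

    A residual connection keeps the RGB stream intact, so depth acts as
    a refinement signal rather than replacing the RGB cues.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.refine_depth = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)
        )
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(rgb + self.refine_depth(depth))
        return rgb + fused  # residual connection around the fusion


class RecurrentAttention(nn.Module):
    """Iteratively re-attends to a fused feature map.

    A memory tensor is updated over a few steps so later steps can
    correct earlier attention, loosely mirroring the abstract's
    "memory-oriented scene understanding".
    """

    def __init__(self, channels: int, steps: int = 3):
        super().__init__()
        self.steps = steps
        self.attend = nn.Conv2d(channels, 1, kernel_size=1)
        self.update = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True)
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        memory = feat
        for _ in range(self.steps):
            attn = torch.sigmoid(self.attend(memory))  # spatial attention map
            memory = self.update(torch.cat([feat * attn, memory], dim=1))
        return memory


# Usage with hypothetical 256-channel features at 1/8 input resolution:
rgb = torch.randn(1, 256, 28, 28)
depth = torch.randn(1, 256, 28, 28)
refined = RecurrentAttention(256)(DepthRefinementBlock(256)(rgb, depth))
```

In a full model along these lines, one refinement block would typically sit at each backbone level, with the multi-scale outputs merged before the recurrent attention stage; the abstract does not specify these wiring details, so the sketch shows a single level only.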