Exploit Visual Dependency Relations for Semantic Segmentation

Mingyuan Liu, D. Schonfeld, Wei Tang
{"title":"Exploit Visual Dependency Relations for Semantic Segmentation","authors":"Mingyuan Liu, D. Schonfeld, Wei Tang","doi":"10.1109/CVPR46437.2021.00960","DOIUrl":null,"url":null,"abstract":"Dependency relations among visual entities are ubiquity because both objects and scenes are highly structured. They provide prior knowledge about the real world that can help improve the generalization ability of deep learning approaches. Different from contextual reasoning which focuses on feature aggregation in the spatial domain, visual dependency reasoning explicitly models the dependency relations among visual entities. In this paper, we introduce a novel network architecture, termed the dependency network or DependencyNet, for semantic segmentation. It unifies dependency reasoning at three semantic levels. Intra-class reasoning decouples the representations of different object categories and updates them separately based on the internal object structures. Inter-class reasoning then performs spatial and semantic reasoning based on the dependency relations among different object categories. We will have an in-depth investigation on how to discover the dependency graph from the training annotations. Global dependency reasoning further refines the representations of each object category based on the global scene information. Extensive ablative studies with a controlled model size and the same network depth show that each individual dependency reasoning component benefits semantic segmentation and they together significantly improve the base network. Experimental results on two benchmark datasets show the DependencyNet achieves comparable performance to the recent states of the art.","PeriodicalId":339646,"journal":{"name":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR46437.2021.00960","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17

Abstract

Dependency relations among visual entities are ubiquitous because both objects and scenes are highly structured. They provide prior knowledge about the real world that can help improve the generalization ability of deep learning approaches. Unlike contextual reasoning, which focuses on feature aggregation in the spatial domain, visual dependency reasoning explicitly models the dependency relations among visual entities. In this paper, we introduce a novel network architecture, termed the dependency network or DependencyNet, for semantic segmentation. It unifies dependency reasoning at three semantic levels. Intra-class reasoning decouples the representations of different object categories and updates them separately based on internal object structures. Inter-class reasoning then performs spatial and semantic reasoning based on the dependency relations among different object categories; we investigate in depth how to discover this dependency graph from the training annotations. Global dependency reasoning further refines the representation of each object category based on global scene information. Extensive ablation studies with a controlled model size and the same network depth show that each individual dependency reasoning component benefits semantic segmentation and that together they significantly improve the base network. Experimental results on two benchmark datasets show that DependencyNet achieves performance comparable to the recent state of the art.
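The abstract describes three levels of dependency reasoning (intra-class, inter-class, and global) without implementation detail. Below is a minimal, hypothetical PyTorch sketch of how such a module could be organized: per-class features are pooled with a learned class attention map (intra-class), propagated along a class dependency graph supplied as an adjacency matrix (inter-class), and refined with a pooled scene vector (global). All names, shapes, and design choices here are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of three-level dependency reasoning; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyReasoningSketch(nn.Module):
    def __init__(self, in_channels, num_classes, adjacency):
        super().__init__()
        self.num_classes = num_classes
        # Coarse per-class attention used to decouple class-specific features.
        self.class_attn = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        # Intra-class: update each class representation separately.
        self.intra = nn.Linear(in_channels, in_channels)
        # Inter-class: propagate along a class dependency graph; `adjacency` is
        # assumed to be a [num_classes, num_classes] matrix mined from labels.
        self.register_buffer("adjacency", adjacency.float())
        self.inter = nn.Linear(in_channels, in_channels)
        # Global: refine each class representation with a pooled scene vector.
        self.global_fc = nn.Linear(2 * in_channels, in_channels)

    def forward(self, feats):                                   # feats: [B, C, H, W]
        b, c, h, w = feats.shape
        attn = self.class_attn(feats).flatten(2).softmax(-1)    # [B, K, H*W]
        x = feats.flatten(2).transpose(1, 2)                    # [B, H*W, C]
        class_feats = torch.bmm(attn, x)                        # [B, K, C] class nodes
        class_feats = F.relu(self.intra(class_feats))           # intra-class update
        adj = self.adjacency / self.adjacency.sum(-1, keepdim=True).clamp(min=1e-6)
        class_feats = F.relu(self.inter(adj @ class_feats))     # inter-class message passing
        scene = x.mean(1, keepdim=True).expand(-1, self.num_classes, -1)
        class_feats = F.relu(self.global_fc(torch.cat([class_feats, scene], -1)))
        # Project refined class nodes back onto the spatial grid and fuse residually.
        refined = torch.bmm(attn.transpose(1, 2), class_feats)  # [B, H*W, C]
        return feats + refined.transpose(1, 2).reshape(b, c, h, w)


# Example usage with random data and a uniform dependency graph (assumptions only).
if __name__ == "__main__":
    k, c = 19, 256
    module = DependencyReasoningSketch(c, k, torch.ones(k, k))
    out = module(torch.randn(2, c, 8, 8))
    print(out.shape)  # torch.Size([2, 256, 8, 8])
```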