
Latest Publications: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Local Non-Rigid Structure-From-Motion From Diffeomorphic Mappings
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00213
Shaifali Parashar, M. Salzmann, P. Fua
We propose a new formulation of non-rigid structure-from-motion that only requires the deforming surface to preserve its differential structure. This is a much weaker assumption than the traditional ones of isometry or conformality. We show that it is nevertheless sufficient to establish local correspondences between the surface's projections in two different images and therefore to perform point-wise reconstruction using only first-order derivatives. To this end, we formulate differential constraints and solve them algebraically using the theory of resultants. We demonstrate that our approach is more widely applicable, more stable under noisy and sparse imaging conditions, and much faster than earlier ones, while delivering similar accuracy. The code is available at https://github.com/cvlab-epfl/diff-nrsfm/.
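As a rough illustration of working with first-order (differential) information only, the sketch below estimates the local Jacobian of the warp between two images from nearby point correspondences by least squares. This is a generic construction on synthetic data, not the paper's resultant-based solver; all names are hypothetical.

```python
# Illustrative sketch only: estimate the first-order (Jacobian) part of the
# image-to-image warp at one correspondence from its neighbours, via least squares.
import numpy as np

def local_warp_jacobian(pts_src, pts_dst, center_idx):
    """Fit p' ~ p0' + J (p - p0) around one correspondence by least squares."""
    u0, v0 = pts_src[center_idx], pts_dst[center_idx]
    du = pts_src - u0              # offsets in the first image, shape (N, 2)
    dv = pts_dst - v0              # offsets in the second image, shape (N, 2)
    # Solve du @ J^T ~ dv for the 2x2 Jacobian J.
    J_T, *_ = np.linalg.lstsq(du, dv, rcond=None)
    return J_T.T

# Tiny synthetic example: an affine warp, whose Jacobian should be recovered.
rng = np.random.default_rng(0)
A = np.array([[1.1, 0.2], [-0.1, 0.9]])
pts_src = rng.uniform(-1, 1, size=(20, 2))
pts_dst = pts_src @ A.T + 0.3
print(local_warp_jacobian(pts_src, pts_dst, center_idx=0))  # close to A
```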
{"title":"Local Non-Rigid Structure-From-Motion From Diffeomorphic Mappings","authors":"Shaifali Parashar, M. Salzmann, P. Fua","doi":"10.1109/cvpr42600.2020.00213","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00213","url":null,"abstract":"We propose a new formulation to non-rigid structure-from-motion that only requires the deforming surface to preserve its differential structure. This is a much weaker assumption than the traditional ones of isometry or conformality. We show that it is nevertheless sufficient to establish local correspondences between the surface in two different images and therefore to perform point-wise reconstruction using only first-order derivatives. To this end, we formulate differential constraints and solve them algebraically using the theory of resultants. We will demonstrate that our approach is more widely applicable, more stable in noisy and sparse imaging conditions and much faster than earlier ones, while delivering similar accuracy. The code is available at https://github.com/cvlab-epfl/diff-nrsfm/.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"48 1","pages":"2056-2064"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90579185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Robust Reference-Based Super-Resolution With Similarity-Aware Deformable Convolution
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00845
Gyumin Shim, Jinsun Park, I. Kweon
In this paper, we propose a novel and efficient reference feature extraction module referred to as the Similarity Search and Extraction Network (SSEN) for reference-based super-resolution (RefSR) tasks. The proposed module extracts aligned relevant features from a reference image to increase the performance over single image super-resolution (SISR) methods. In contrast to conventional algorithms which utilize brute-force searches or optical flow estimations, the proposed algorithm is end-to-end trainable without any additional supervision or heavy computation, predicting the best match with a single network forward operation. Moreover, the proposed module is aware of not only the best matching position but also the relevancy of the best match. This makes our algorithm substantially robust when irrelevant reference images are given, overcoming the major cause of the performance degradation when using existing RefSR methods. Furthermore, our module can be utilized for self-similarity SR if no reference image is available. Experimental results demonstrate the superior performance of the proposed algorithm compared to previous works both quantitatively and qualitatively.
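A minimal sketch of the similarity-search idea: for each query location, find the most similar reference feature and gather it, weighted by that similarity. The similarity-aware deformable convolution used by SSEN is replaced here by a plain best-match gather, and all tensor names are assumptions.

```python
# Minimal sketch (not the SSEN module): best-match reference feature gathering.
import torch
import torch.nn.functional as F

def gather_reference_features(q_feat, ref_feat):
    """q_feat: (B, C, H, W) query features; ref_feat: (B, C, H, W) reference features."""
    B, C, H, W = q_feat.shape
    q = F.normalize(q_feat.flatten(2), dim=1)          # (B, C, H*W), unit-norm channels
    r = F.normalize(ref_feat.flatten(2), dim=1)         # (B, C, H*W)
    sim = torch.bmm(q.transpose(1, 2), r)                # (B, H*W, H*W) cosine similarities
    score, idx = sim.max(dim=-1)                          # best reference match per query location
    gathered = torch.gather(
        ref_feat.flatten(2), 2, idx.unsqueeze(1).expand(-1, C, -1)
    )                                                     # (B, C, H*W)
    # Down-weight matches that are not actually relevant (low similarity).
    out = gathered * score.clamp(min=0).unsqueeze(1)
    return out.view(B, C, H, W)

q = torch.randn(1, 8, 16, 16)
ref = torch.randn(1, 8, 16, 16)
print(gather_reference_features(q, ref).shape)  # torch.Size([1, 8, 16, 16])
```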
{"title":"Robust Reference-Based Super-Resolution With Similarity-Aware Deformable Convolution","authors":"Gyumin Shim, Jinsun Park, I. Kweon","doi":"10.1109/CVPR42600.2020.00845","DOIUrl":"https://doi.org/10.1109/CVPR42600.2020.00845","url":null,"abstract":"In this paper, we propose a novel and efficient reference feature extraction module referred to as the Similarity Search and Extraction Network (SSEN) for reference-based super-resolution (RefSR) tasks. The proposed module extracts aligned relevant features from a reference image to increase the performance over single image super-resolution (SISR) methods. In contrast to conventional algorithms which utilize brute-force searches or optical flow estimations, the proposed algorithm is end-to-end trainable without any additional supervision or heavy computation, predicting the best match with a single network forward operation. Moreover, the proposed module is aware of not only the best matching position but also the relevancy of the best match. This makes our algorithm substantially robust when irrelevant reference images are given, overcoming the major cause of the performance degradation when using existing RefSR methods. Furthermore, our module can be utilized for self-similarity SR if no reference image is available. Experimental results demonstrate the superior performance of the proposed algorithm compared to previous works both quantitatively and qualitatively.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"85 3 1","pages":"8422-8431"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90633524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 63
Weakly-Supervised Semantic Segmentation via Sub-Category Exploration
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00901
Yu-Ting Chang, Qiaosong Wang, Wei-Chih Hung, Robinson Piramuthu, Yi-Hsuan Tsai, Ming-Hsuan Yang
Existing weakly-supervised semantic segmentation methods using image-level annotations typically rely on initial responses to locate object regions. However, such response maps generated by the classification network usually focus on discriminative object parts, because the network does not need the entire object to optimize the objective function. To force the network to pay attention to other parts of an object, we propose a simple yet effective approach that introduces a self-supervised task by exploiting sub-category information. Specifically, we perform clustering on image features to generate pseudo sub-category labels within each annotated parent class, and construct a sub-category objective that assigns the network a more challenging task. By iteratively clustering image features, the training process does not limit itself to the most discriminative object parts, hence improving the quality of the response maps. We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.
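The pseudo sub-category labeling step can be illustrated with a few lines of scikit-learn: cluster the features of each annotated parent class and reuse the cluster indices as extra labels. This is only a sketch of that step on assumed inputs; the self-supervised sub-category classification loss is not shown.

```python
# Sketch of pseudo sub-category label generation by per-class clustering.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_subcategory_labels(features, parent_labels, k=3, seed=0):
    """features: (N, D) image features; parent_labels: (N,) image-level class ids."""
    sub_labels = np.zeros_like(parent_labels)
    for c in np.unique(parent_labels):
        mask = parent_labels == c
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(features[mask])
        # Sub-category id = parent id * k + cluster index, so ids stay unique.
        sub_labels[mask] = c * k + km.labels_
    return sub_labels

feats = np.random.default_rng(0).normal(size=(100, 16))
parents = np.repeat(np.arange(5), 20)
print(np.unique(pseudo_subcategory_labels(feats, parents, k=3)))  # 15 sub-category ids
```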
{"title":"Weakly-Supervised Semantic Segmentation via Sub-Category Exploration","authors":"Yu-Ting Chang, Qiaosong Wang, Wei-Chih Hung, Robinson Piramuthu, Yi-Hsuan Tsai, Ming-Hsuan Yang","doi":"10.1109/cvpr42600.2020.00901","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00901","url":null,"abstract":"Existing weakly-supervised semantic segmentation methods using image-level annotations typically rely on initial responses to locate object regions. However, such response maps generated by the classification network usually focus on discriminative object parts, due to the fact that the network does not need the entire object for optimizing the objective function. To enforce the network to pay attention to other parts of an object, we propose a simple yet effective approach that introduces a self-supervised task by exploiting the sub-category information. Specifically, we perform clustering on image features to generate pseudo sub-categories labels within each annotated parent class, and construct a sub-category objective to assign the network to a more challenging task. By iteratively clustering image features, the training process does not limit itself to the most discriminative object parts, hence improving the quality of the response maps. We conduct extensive analysis to validate the proposed method and show that our approach performs favorably against the state-of-the-art approaches.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"14 1","pages":"8988-8997"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90997304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 193
Learning Unseen Concepts via Hierarchical Decomposition and Composition
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.01026
Muli Yang, Cheng Deng, Junchi Yan, Xianglong Liu, D. Tao
Composing and recognizing new concepts from known sub-concepts has been a fundamental and challenging vision task, mainly due to 1) the diversity of sub-concepts and 2) the intricate contextuality between sub-concepts and their corresponding visual features. However, most current methods simply treat the contextuality as rigid semantic relationships and fail to capture fine-grained contextual correlations. We propose to learn unseen concepts in a hierarchical decomposition-and-composition manner. Considering the diversity of sub-concepts, our method decomposes each seen image into visual elements according to its labels, and learns the corresponding sub-concepts in their individual subspaces. To model intricate contextuality between sub-concepts and their visual features, compositions are generated from these subspaces in three hierarchical forms, and the composed concepts are learned in a unified composition space. To further refine the captured contextual relationships, adaptively semi-positive concepts are defined and then learned with pseudo supervision derived from the generated compositions. We validate the proposed approach on two challenging benchmarks, and demonstrate its superiority over state-of-the-art approaches.
{"title":"Learning Unseen Concepts via Hierarchical Decomposition and Composition","authors":"Muli Yang, Cheng Deng, Junchi Yan, Xianglong Liu, D. Tao","doi":"10.1109/CVPR42600.2020.01026","DOIUrl":"https://doi.org/10.1109/CVPR42600.2020.01026","url":null,"abstract":"Composing and recognizing new concepts from known sub-concepts has been a fundamental and challenging vision task, mainly due to 1) the diversity of sub-concepts and 2) the intricate contextuality between sub-concepts and their corresponding visual features. However, most of the current methods simply treat the contextuality as rigid semantic relationships and fail to capture fine-grained contextual correlations. We propose to learn unseen concepts in a hierarchical decomposition-and-composition manner. Considering the diversity of sub-concepts, our method decomposes each seen image into visual elements according to its labels, and learns corresponding sub-concepts in their individual subspaces. To model intricate contextuality between sub-concepts and their visual features, compositions are generated from these subspaces in three hierarchical forms, and the composed concepts are learned in a unified composition space. To further refine the captured contextual relationships, adaptively semi-positive concepts are defined and then learned with pseudo supervision exploited from the generated compositions. We validate the proposed approach on two challenging benchmarks, and demonstrate its superiority over state-of-the-art approaches.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"144 1","pages":"10245-10253"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89766060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 45
Reliable Weighted Optimal Transport for Unsupervised Domain Adaptation
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00445
Renjun Xu, Pelen Liu, Liyan Wang, Chao Chen, Jindong Wang, Kaiming He, X. Zhang, Shaoqing Ren, Mingsheng Long, Zhangjie Cao, Jianmin Wang
Recently, extensive research has been devoted to the UDA problem, which aims to learn transferable models for the unlabeled target domain. Among these methods, optimal transport is a promising metric for aligning the representations of the source and target domains. However, most existing works based on optimal transport ignore the intra-domain structure and only achieve coarse pair-wise matching. Target samples distributed near the edges of the clusters, or far from their corresponding class centers, are easily misclassified by the decision boundary learned from the source domain. In this paper, we present Reliable Weighted Optimal Transport (RWOT) for unsupervised domain adaptation, including a novel Shrinking Subspace Reliability (SSR) and a weighted optimal transport strategy. Specifically, SSR exploits spatial prototypical information and intra-domain structure to dynamically measure the sample-level domain discrepancy across domains. Besides, the weighted optimal transport strategy based on SSR is exploited to achieve a precise pair-wise optimal transport procedure, which reduces the negative transfer brought by samples near decision boundaries in the target domain. RWOT is also equipped with a discriminative centroid clustering strategy to learn transfer features. A thorough evaluation shows that RWOT outperforms existing state-of-the-art methods on standard domain adaptation benchmarks.
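A minimal sketch of the optimal-transport ingredient, assuming plain entropic OT (Sinkhorn iterations) with a squared-Euclidean cost between source and target feature batches; the paper's SSR-based reliability weighting is not reproduced here.

```python
# Entropy-regularized optimal transport (Sinkhorn) between two feature batches.
import numpy as np

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Transport plan between uniform marginals for a given cost matrix."""
    ns, nt = cost.shape
    a, b = np.full(ns, 1.0 / ns), np.full(nt, 1.0 / nt)
    K = np.exp(-cost / reg)
    u = np.ones(ns)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]      # rows sum to a, columns sum to b

src = np.random.default_rng(0).normal(size=(32, 64))   # source features
tgt = np.random.default_rng(1).normal(size=(48, 64))   # target features
cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
plan = sinkhorn(cost / cost.max(), reg=0.05)
print(plan.shape, plan.sum())               # (32, 48), total mass close to 1.0
```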
{"title":"Reliable Weighted Optimal Transport for Unsupervised Domain Adaptation","authors":"Renjun Xu, Pelen Liu, Liyan Wang, Chao Chen, Jindong Wang, Kaiming He, X. Zhang, Shaoqing Ren, Mingsheng Long, Zhangjie Cao, Jianmin Wang","doi":"10.1109/cvpr42600.2020.00445","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00445","url":null,"abstract":"Recently, extensive researches have been proposed to address the UDA problem, which aims to learn transferrable models for the unlabeled target domain. Among them, the optimal transport is a promising metric to align the representations of the source and target domains. However, most existing works based on optimal transport ignore the intra-domain structure, only achieving coarse pair-wise matching. The target samples distributed near the edge of the clusters, or far from their corresponding class centers are easily to be misclassified by the decision boundary learned from the source domain. In this paper, we present Reliable Weighted Optimal Transport (RWOT) for unsupervised domain adaptation, including novel Shrinking Subspace Reliability (SSR) and weighted optimal transport strategy. Specifically, SSR exploits spatial prototypical information and intra-domain structure to dynamically measure the sample-level domain discrepancy across domains. Besides, the weighted optimal transport strategy based on SSR is exploited to achieve the precise-pair-wise optimal transport procedure, which reduces negative transfer brought by the samples near decision boundaries in the target domain. RWOT also equips with the discriminative centroid clustering exploitation strategy to learn transfer features. A thorough evaluation shows that RWOT outperforms existing state-of-the-art method on standard domain adaptation benchmarks.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"31 1","pages":"4393-4402"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87621400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 96
Weakly Supervised Fine-Grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00977
Zhihui Wang, Shijie Wang, Shuhui Yang, Haojie Li, Jianjun Li, Zezhou Li
Existing weakly supervised fine-grained image recognition (WFGIR) methods usually pick out discriminative regions from the high-level feature maps directly. We discover that, due to the stacking of local receptive fields, convolutional neural networks cause discriminative region diffusion in high-level feature maps, which leads to inaccurate discriminative region localization. In this paper, we propose an end-to-end Discriminative Feature-oriented Gaussian Mixture Model (DF-GMM) to address the problem of discriminative region diffusion and find better fine-grained details. Specifically, DF-GMM consists of 1) a low-rank representation mechanism (LRM), which learns a set of low-rank discriminative bases with a Gaussian Mixture Model (GMM) in high-level semantic feature maps to improve the discriminative ability of the feature representation, and 2) a low-rank representation reorganization mechanism (LR$^2$M), which restores the spatial information corresponding to the low-rank discriminative bases to reconstruct the low-rank feature maps. This alleviates the discriminative region diffusion problem and locates discriminative regions more precisely. Extensive experiments verify that DF-GMM yields the best performance under the same settings as the most competitive approaches on the CUB-Bird, Stanford-Cars, and FGVC-Aircraft datasets.
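A rough illustration of the low-rank idea: treat each spatial position of a feature map as a sample, fit a small Gaussian Mixture Model, and rebuild the map from responsibilities times component means, so the result has rank at most K. This is only a sketch of the spirit of LRM with assumed shapes, not the DF-GMM module itself.

```python
# Low-rank reconstruction of a (C, H, W) feature map via a small GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

def low_rank_feature_map(feat, k=4, seed=0):
    """feat: (C, H, W) feature map -> low-rank reconstruction of the same shape."""
    C, H, W = feat.shape
    X = feat.reshape(C, -1).T                      # (H*W, C) spatial samples
    gmm = GaussianMixture(n_components=k, covariance_type="diag",
                          random_state=seed).fit(X)
    resp = gmm.predict_proba(X)                    # (H*W, K) responsibilities
    recon = resp @ gmm.means_                      # (H*W, C), lies in the span of K means
    return recon.T.reshape(C, H, W)

feat = np.random.default_rng(0).normal(size=(32, 14, 14))
low_rank = low_rank_feature_map(feat, k=4)
print(low_rank.shape, np.linalg.matrix_rank(low_rank.reshape(32, -1)))  # rank <= 4
```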
{"title":"Weakly Supervised Fine-Grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning","authors":"Zhihui Wang, Shijie Wang, Shuhui Yang, Haojie Li, Jianjun Li, Zezhou Li","doi":"10.1109/cvpr42600.2020.00977","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00977","url":null,"abstract":"Existing weakly supervised fine-grained image recognition (WFGIR) methods usually pick out the discriminative regions from the high-level feature maps directly. We discover that due to the operation of stacking local receptive filed, Convolutional Neural Network causes the discriminative region diffusion in high-level feature maps, which leads to inaccurate discriminative region localization. In this paper, we propose an end-to-end Discriminative Feature-oriented Gaussian Mixture Model (DF-GMM), to address the problem of discriminative region diffusion and find better fine-grained details. Specifically, DF-GMM consists of 1) a low-rank representation mechanism (LRM), which learns a set of low-rank discriminative bases by Gaussian Mixture Model (GMM) in high-level semantic feature maps to improve discriminative ability of feature representation, 2) a low-rank representation reorganization mechanism (LR$ ^2 $M) which resumes the space information corresponding to low-rank discriminative bases to reconstruct the low-rank feature maps. It alleviates the discriminative region diffusion problem and locate discriminative regions more precisely. Extensive experiments verify that DF-GMM yields the best performance under the same settings with the most competitive approaches, in CUB-Bird, Stanford-Cars datasets, and FGVC Aircraft.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"27 1","pages":"9746-9755"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87903683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
Shoestring: Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00423
Wanyu Lin, Zhaolin Gao, Baochun Li
Graph-based semi-supervised learning has been shown to be one of the most effective classification approaches, as it can exploit connectivity patterns between labeled and unlabeled samples to improve learning performance. However, we show that existing techniques perform poorly when labeled data are severely limited. To address the problem of semi-supervised learning in the presence of severely limited labeled samples, we propose a new framework, called Shoestring, that incorporates metric learning into the paradigm of graph-based semi-supervised learning. In particular, our base model consists of a graph embedding network, followed by a metric learning network that learns a semantic metric space to represent the semantic similarity between the sparsely labeled and large numbers of unlabeled samples. Then the classification can be performed by clustering the unlabeled samples according to the learned semantic space. We empirically demonstrate Shoestring's superiority over many baselines, including graph convolutional networks, label propagation and their recent label-efficient variations (IGCN and GLP). We show that our framework achieves state-of-the-art performance for node classification in the low-data regime. In addition, we demonstrate the effectiveness of our framework on image classification tasks in the few-shot learning regime, with significant gains on miniImageNet (2.57%-3.59%) and tieredImageNet (1.05%-2.70%).
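A toy sketch of the final classification step: with only a couple of labeled samples per class, form class prototypes in the learned semantic space and assign every unlabeled sample to the nearest prototype. The graph-embedding and metric-learning networks that would produce the embeddings are assumed, not shown.

```python
# Nearest-prototype classification in an assumed embedding space.
import numpy as np

def prototype_classify(emb, labels, labeled_idx):
    """emb: (N, D) embeddings; labels: (N,), trusted only at labeled_idx."""
    classes = np.unique(labels[labeled_idx])
    protos = np.stack([emb[labeled_idx][labels[labeled_idx] == c].mean(0) for c in classes])
    d = ((emb[:, None, :] - protos[None, :, :]) ** 2).sum(-1)   # squared distance to prototypes
    return classes[np.argmin(d, axis=1)]

rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8))
emb = np.concatenate([c + rng.normal(0, 0.3, size=(50, 8)) for c in centers])
labels = np.repeat(np.arange(3), 50)
labeled_idx = np.array([0, 1, 50, 51, 100, 101])                 # two labels per class
pred = prototype_classify(emb, labels, labeled_idx)
print((pred == labels).mean())                                    # near-perfect on this toy data
```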
{"title":"Shoestring: Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data","authors":"Wanyu Lin, Zhaolin Gao, Baochun Li","doi":"10.1109/CVPR42600.2020.00423","DOIUrl":"https://doi.org/10.1109/CVPR42600.2020.00423","url":null,"abstract":"Graph-based semi-supervised learning has been shown to be one of the most effective classification approaches, as it can exploit connectivity patterns between labeled and unlabeled samples to improve learning performance. However, we show that existing techniques perform poorly when labeled data are severely limited. To address the problem of semi-supervised learning in the presence of severely limited labeled samples, we propose a new framework, called {em Shoestring}, that incorporates metric learning into the paradigm of graph-based semi-supervised learning. In particular, our base model consists of a graph embedding network, followed by a metric learning network that learns a semantic metric space to represent the semantic similarity between the sparsely labeled and large numbers of unlabeled samples. Then the classification can be performed by clustering the unlabeled samples according to the learned semantic space. We empirically demonstrate Shoestring's superiority over many baselines, including graph convolutional networks, label propagation and their recent label-efficient variations (IGCN and GLP). We show that our framework achieves state-of-the-art performance for node classification in the low-data regime. In addition, we demonstrate the effectiveness of our framework on image classification tasks in the few-shot learning regime, with significant gains on miniImageNet ($2.57%sim3.59%$) and tieredImageNet ($1.05%sim2.70%$).","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"70 1","pages":"4173-4181"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86273122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00205
Fumihiko Sakaue, J. Sato
In this paper, we propose a method of visualizing 3D motion with zero latency. This method achieves motion visualization by projecting special high-frequency light patterns onto moving objects without using any feedback mechanisms. To this end, we focus on the time integration of light rays in the sensing system of observers. It is known that the visual system of human observers integrates light rays over a certain period. Similarly, the image sensor in a camera integrates light rays during the exposure time. Thus, our method embeds multiple images into a time-varying light field, such that the observer of the time-varying light field observes completely different images according to the dynamic motion of the scene. Based on this concept, we propose a method of generating special high-frequency patterns of projector light. After projection onto target objects, the image observed on the target changes automatically depending on the motion of the objects, without any scene sensing or data analysis. In other words, we achieve motion visualization without the time delay incurred by sensing and computing.
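The time-integration principle can be illustrated numerically: a sensor that averages frames over its exposure window sees only the base image when an alternating +/- pattern is projected onto a static surface, while a surface that shifts between projector states leaves a motion-dependent residual. The toy example below assumes simple frame averaging and a random +/- pattern; it is not the paper's actual pattern design.

```python
# Toy numpy illustration of exposure-time integration of a flickering pattern.
import numpy as np

rng = np.random.default_rng(0)
base = rng.uniform(0.3, 0.7, size=(64, 64))          # scene under plain illumination
pattern = 0.2 * np.sign(rng.normal(size=(64, 64)))   # high-frequency +/- pattern

frame_a = base + pattern                              # projector state A
frame_b = base - pattern                              # projector state B

static_view = 0.5 * (frame_a + frame_b)               # exposure integrates both states
moving_view = 0.5 * (frame_a + np.roll(frame_b, 3, axis=1))  # surface shifted between states

print(np.abs(static_view - base).max())                                       # ~0: pattern cancels
print(np.abs(moving_view - 0.5 * (base + np.roll(base, 3, axis=1))).max())    # >0: motion residual
```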
{"title":"Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration","authors":"Fumihiko Sakaue, J. Sato","doi":"10.1109/CVPR42600.2020.00205","DOIUrl":"https://doi.org/10.1109/CVPR42600.2020.00205","url":null,"abstract":"In this paper, we propose a method of visualizing 3D motion with zero latency. This method achieves motion visualization by projecting special high-frequency light patterns on moving objects without using any feedback mechanisms. For this objective, we focus on the time integration of light rays in the sensing system of observers. It is known that the visual system of human observers integrates light rays in a certain period. Similarly, the image sensor in a camera integrates light rays during the exposure time. Thus, our method embeds multiple images into a time-varying light field, such that the observer of the time-varying light field observes completely different images according to the dynamic motion of the scene. Based on this concept, we propose a method of generating special high-frequency patterns of projector lights. After projection onto target objects with projectors, the image observed on the target changes automatically depending on the motion of the objects and without any scene sensing and data analysis. In other words, we achieve motion visualization without the time delay incurred during sensing and computing.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"66 1","pages":"1977-1985"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86273408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Adaptive Dilated Network With Self-Correction Supervision for Counting
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00465
Shuai Bai, Zhiqun He, Y. Qiao, Hanzhe Hu, Wei Wu, Junjie Yan
The counting problem aims to estimate the number of objects in images. Due to large scale variation and labeling deviations, it remains a challenging task. The static density-map supervised learning framework is widely used in existing methods: it uses a Gaussian kernel to generate a density map as the learning target and the Euclidean distance to optimize the model. However, this framework is intolerant of labeling deviations and cannot reflect scale variation. In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision. At the supervision level, SC supervision utilizes the outputs of the model to iteratively correct the annotations and employs the SC loss to simultaneously optimize the model from both the whole and the individuals. At the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts to scale variation better than a discrete and static dilation rate. Extensive experiments illustrate that our approach achieves a consistent improvement on four challenging benchmarks. In particular, our approach achieves better performance than the state-of-the-art methods on all benchmark datasets.
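For reference, the static learning target mentioned above is straightforward to construct: place a Gaussian kernel at every annotated point so the resulting density map sums to the object count. The sketch below uses a fixed kernel width; the paper's adaptive dilation and self-correction supervision are not reproduced.

```python
# Standard Gaussian-kernel density map built from point annotations.
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, height, width, sigma=4.0):
    """points: iterable of (row, col) annotations -> (H, W) map summing to len(points)."""
    impulses = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        impulses[int(r), int(c)] += 1.0
    return gaussian_filter(impulses, sigma=sigma, mode="constant")

pts = [(20, 12), (30, 40), (31, 42), (50, 20)]
dmap = density_map(pts, 64, 64)
print(dmap.sum())   # close to 4.0 (up to truncation at the image border)
```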
{"title":"Adaptive Dilated Network With Self-Correction Supervision for Counting","authors":"Shuai Bai, Zhiqun He, Y. Qiao, Hanzhe Hu, Wei Wu, Junjie Yan","doi":"10.1109/cvpr42600.2020.00465","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00465","url":null,"abstract":"The counting problem aims to estimate the number of objects in images. Due to large scale variation and labeling deviations, it remains a challenging task. The static density map supervised learning framework is widely used in existing methods, which uses the Gaussian kernel to generate a density map as the learning target and utilizes the Euclidean distance to optimize the model. However, the framework is intolerable to the labeling deviations and can not reflect the scale variation. In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision. In the supervision level, the SC supervision utilizes the outputs of the model to iteratively correct the annotations and employs the SC loss to simultaneously optimize the model from both the whole and the individuals. In the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts the scale variation better than a discrete and static dilation rate. Extensive experiments illustrate that our approach has achieved a consistent improvement on four challenging benchmarks. Especially, our approach achieves better performance than the state-of-the-art methods on all benchmark datasets.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"32 1","pages":"4593-4602"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86330858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 125
Perspective Plane Program Induction From a Single Image
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00449
Yikai Li, Jiayuan Mao, Xiuming Zhang, W. Freeman, J. Tenenbaum, Jiajun Wu
We study the inverse graphics problem of inferring a holistic representation for natural images. Given an input image, our goal is to induce a neuro-symbolic, program-like representation that jointly models camera poses, object locations, and global scene structures. Such high-level, holistic scene representations further facilitate low-level image manipulation tasks such as inpainting. We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image. The benefits of such joint inference are two-fold: scene regularity serves as a new cue for perspective correction, and in turn, correct perspective correction leads to a simplified scene structure, similar to how the correct shape leads to the most regular texture in shape from texture. Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to efficiently solve the problem. P3I outperforms a set of baselines on a collection of Internet images, across tasks including camera pose estimation, global structure inference, and down-stream image manipulation tasks.
{"title":"Perspective Plane Program Induction From a Single Image","authors":"Yikai Li, Jiayuan Mao, Xiuming Zhang, W. Freeman, J. Tenenbaum, Jiajun Wu","doi":"10.1109/cvpr42600.2020.00449","DOIUrl":"https://doi.org/10.1109/cvpr42600.2020.00449","url":null,"abstract":"We study the inverse graphics problem of inferring a holistic representation for natural images. Given an input image, our goal is to induce a neuro-symbolic, program-like representation that jointly models camera poses, object locations, and global scene structures. Such high-level, holistic scene representations further facilitate low-level image manipulation tasks such as inpainting. We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image. The benefits of such joint inference are two-fold: scene regularity serves as a new cue for perspective correction, and in turn, correct perspective correction leads to a simplified scene structure, similar to how the correct shape leads to the most regular texture in shape from texture. Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to efficiently solve the problem. P3I outperforms a set of baselines on a collection of Internet images, across tasks including camera pose estimation, global structure inference, and down-stream image manipulation tasks.","PeriodicalId":6715,"journal":{"name":"2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"11 1","pages":"4433-4442"},"PeriodicalIF":0.0,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86346330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9