
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): Latest Publications

Select, Supplement and Focus for RGB-D Saliency Detection
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00353
Miao Zhang, Weisong Ren, Yongri Piao, Zhengkun Rong, Huchuan Lu
Depth data, which carry a preponderance of discriminative power about location, have been proven beneficial for accurate saliency prediction. However, RGB-D saliency detection methods are also negatively influenced by randomly distributed erroneous or missing regions on the depth map or along the object boundaries. This offers the possibility of achieving more effective inference with well-designed models. In this paper, we propose a new framework for accurate RGB-D saliency detection that takes account of the local and global complementarities of the two modalities. This is achieved by designing a complementary interaction model discriminative enough to simultaneously select useful representations from RGB and depth data while refining the object boundaries. Moreover, we propose a compensation-aware loss to further process the information not considered by the complementary interaction model, improving the generalization ability for challenging scenes. Experiments on six public datasets show that our method outperforms 18 state-of-the-art methods.
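The selection idea in the complementary interaction model can be pictured as a gating step in which each modality keeps only the features the joint evidence supports. Below is a minimal PyTorch sketch of such a cross-modal selection gate; the module name, shapes, and gating form are assumptions for illustration, not the authors' SSF architecture.

```python
import torch
import torch.nn as nn

class CrossModalGate(nn.Module):
    """Toy gated fusion: each modality predicts a spatial gate that selects
    which of its features to keep, so RGB and depth can complement each other.
    (Illustrative sketch only; not the authors' SSF architecture.)"""

    def __init__(self, channels: int):
        super().__init__()
        self.rgb_gate = nn.Sequential(nn.Conv2d(2 * channels, 1, kernel_size=1), nn.Sigmoid())
        self.depth_gate = nn.Sequential(nn.Conv2d(2 * channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        joint = torch.cat([rgb_feat, depth_feat], dim=1)   # B x 2C x H x W
        g_rgb = self.rgb_gate(joint)                        # B x 1 x H x W, in (0, 1)
        g_depth = self.depth_gate(joint)
        # Keep the useful parts of each modality and sum them.
        return g_rgb * rgb_feat + g_depth * depth_feat

if __name__ == "__main__":
    fuse = CrossModalGate(channels=64)
    fused = fuse(torch.randn(2, 64, 40, 40), torch.randn(2, 64, 40, 40))
    print(fused.shape)  # torch.Size([2, 64, 40, 40])
```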
Citations: 147
Which Is Plagiarism: Fashion Image Retrieval Based on Regional Representation for Design Protection
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00267
Yining Lang, Yuan He, Fan Yang, Jianfeng Dong, Hui Xue
With the rapid growth of e-commerce and the popularity of online shopping, fashion retrieval has received considerable attention in the computer vision community. Different from existing works that mainly focus on identical or similar fashion item retrieval, in this paper we study plagiarized clothes retrieval, which has been somewhat ignored by the academic community despite its great application value. One of the key challenges is that plagiarized clothes are usually modified in only a certain region of the original design to escape detection by traditional retrieval methods. To address this, we propose a novel network named Plagiarized-Search-Net (PS-Net) based on regional representation, where we utilize landmarks to guide the learning of regional representations and compare fashion items region by region. Besides, we propose a new dataset named Plagiarized Fashion for plagiarized clothes retrieval, which provides a meaningful complement to the existing fashion retrieval field. Experiments on the Plagiarized Fashion dataset verify that our approach is superior to other instance-level counterparts for plagiarized clothes retrieval, showing promising results for original design protection. Moreover, our PS-Net can also be adapted to traditional fashion retrieval and landmark estimation tasks, and it achieves state-of-the-art performance on the DeepFashion and DeepFashion2 datasets.
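To make the region-by-region comparison concrete, here is a small NumPy sketch that scores a candidate garment against a query one landmark-defined region at a time, so that a single copied or modified region is not averaged away in a global descriptor. The region set, embedding size, and function name are hypothetical; this is not the PS-Net implementation.

```python
import numpy as np

def regional_distance(query_regions: np.ndarray, candidate_regions: np.ndarray) -> np.ndarray:
    """Compare two garments region by region.

    query_regions, candidate_regions: (R, D) arrays of L2-normalized embeddings,
    one row per landmark-defined region (collar, sleeve, hem, ...).
    Returns per-region cosine distances, so one plagiarized region shows up as
    a single large entry. (Illustrative sketch, not the PS-Net implementation.)
    """
    sims = np.sum(query_regions * candidate_regions, axis=1)  # cosine similarity per region
    return 1.0 - sims

# Hypothetical usage with random embeddings for 6 regions of dimension 128.
q = np.random.randn(6, 128); q /= np.linalg.norm(q, axis=1, keepdims=True)
c = np.random.randn(6, 128); c /= np.linalg.norm(c, axis=1, keepdims=True)
print(regional_distance(q, c))
```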
Citations: 21
HAMBox: Delving Into Mining High-Quality Anchors on Face Detection
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.01306
Yang Liu, Xu Tang, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Wu
Current face detectors utilize anchors to frame a multi-task learning problem that combines classification and bounding box regression. Effective anchor design and anchor matching strategies enable face detectors to localize faces under large pose and scale variations. However, we observe that more than 80% of correctly predicted bounding boxes are regressed from unmatched anchors (anchors whose IoUs with target faces are lower than a threshold) in the inference phase. This indicates that these unmatched anchors have excellent regression ability, yet existing methods neglect to learn from them. In this paper, we propose an Online High-quality Anchor Mining Strategy (HAMBox), which explicitly helps outer faces compensate with high-quality anchors. Our proposed HAMBox method could serve as a general strategy for anchor-based single-stage face detection. Experiments on various datasets, including WIDER FACE, FDDB, AFW and PASCAL Face, demonstrate the superiority of the proposed method.
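The mining rule described above (anchors that fail the usual IoU matching yet regress to a face accurately) can be written down in a few lines. The following NumPy sketch is an illustrative reading of that rule with assumed threshold values, not the HAMBox training code.

```python
import numpy as np

def iou(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """Pairwise IoU between (N, 4) and (M, 4) boxes in (x1, y1, x2, y2) format."""
    x1 = np.maximum(boxes_a[:, None, 0], boxes_b[None, :, 0])
    y1 = np.maximum(boxes_a[:, None, 1], boxes_b[None, :, 1])
    x2 = np.minimum(boxes_a[:, None, 2], boxes_b[None, :, 2])
    y2 = np.minimum(boxes_a[:, None, 3], boxes_b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-9)

def mine_high_quality_anchors(anchors, regressed, targets, match_thr=0.35, mine_thr=0.5):
    """Return indices of anchors that are 'unmatched' by the usual rule
    (anchor-vs-face IoU < match_thr) yet whose regressed boxes overlap a face
    well (IoU >= mine_thr). Thresholds are assumed values for illustration."""
    unmatched = iou(anchors, targets).max(axis=1) < match_thr
    good_regression = iou(regressed, targets).max(axis=1) >= mine_thr
    return np.where(unmatched & good_regression)[0]
```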
Citations: 35
Towards Global Explanations of Convolutional Neural Networks With Concept Attribution
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00868
Weibin Wu, Yuxin Su, Xixian Chen, Shenglin Zhao, Irwin King, M. Lyu, Yu-Wing Tai
With the growing prevalence of convolutional neural networks (CNNs), there is an urgent demand to explain their behaviors. Global explanations contribute to understanding model predictions on a whole category of samples, and thus have attracted increasing interest recently. However, existing methods overwhelmingly conduct separate input attribution or rely on local approximations of models, making them fail to offer faithful global explanations of CNNs. To overcome such drawbacks, we propose a novel two-stage framework, Attacking for Interpretability (AfI), which explains model decisions in terms of the importance of user-defined concepts. AfI first conducts a feature occlusion analysis, which resembles a process of attacking models to derive the category-wide importance of different features. We then map the feature importance to concept importance through ad-hoc semantic tasks. Experimental results confirm the effectiveness of AfI and its superiority in providing more accurate estimations of concept importance than existing proposals.
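As a concrete picture of the feature occlusion analysis, the sketch below zeroes one feature channel at a time and records the average drop in the class score over a batch of samples from a category. Channel-wise occlusion and the two-part model split are assumptions for illustration, not the exact AfI procedure.

```python
import torch

@torch.no_grad()
def channel_occlusion_importance(feature_extractor, classifier, images, class_idx):
    """Estimate category-wide importance of each feature channel by occluding it
    (zeroing it) and measuring the average drop in the class score over a batch.
    feature_extractor/classifier are assumed to split a CNN into two halves;
    this is a rough stand-in for an occlusion-style importance analysis."""
    feats = feature_extractor(images)                 # B x C x H x W
    base = classifier(feats)[:, class_idx]            # baseline class scores
    drops = []
    for c in range(feats.shape[1]):
        occluded = feats.clone()
        occluded[:, c] = 0.0                          # occlude one channel everywhere
        drop = base - classifier(occluded)[:, class_idx]
        drops.append(drop.mean().item())              # average over the category's samples
    return torch.tensor(drops)                        # one importance score per channel
```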
Citations: 38
Minimal Solvers for 3D Scan Alignment With Pairs of Intersecting Lines
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00726
André Mateus, S. Ramalingam, Pedro Miraldo
We explore the possibility of using line intersection constraints for 3D scan registration. Typical 3D registration algorithms exploit point and plane correspondences, while line intersection constraints have not previously been used in the context of 3D scan registration. Constraints from a match of pairs of intersecting lines in two 3D scans can be seen as two 3D line intersections, a plane correspondence, and a point correspondence. In this paper, we present minimal solvers that combine these different types of constraints: 1) three line intersections and one point match; 2) one line intersection and two point matches; 3) three line intersections and one plane match; 4) one line intersection and two plane matches; and 5) one line intersection, one point match, and one plane match. To use all the available solvers, we present a hybrid RANSAC loop. We propose a non-linear refinement technique using all the inliers obtained from RANSAC. Extensive experiments with simulated data and two real-world datasets show that the use of these features and the combined solvers improves accuracy. The code is available.
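The hybrid RANSAC loop can be sketched generically: at each iteration one of the minimal solvers is drawn, fed exactly the correspondences it needs, and the hypothesis with the most inliers is kept for the later non-linear refinement. The sketch below is a schematic Python version with assumed callback signatures; the paper's actual solvers, sampling probabilities, and residual definitions are not reproduced.

```python
import random

def hybrid_ransac(data, solvers, score_fn, iters=1000, inlier_thr=0.01):
    """Generic hybrid RANSAC skeleton.

    solvers: list of (sample_fn, solve_fn) pairs; sample_fn draws the minimal set
    of correspondences a solver needs, solve_fn returns candidate poses (minimal
    solvers may have several roots). score_fn(pose, datum) returns a residual.
    Sketch only, under assumed interfaces."""
    best_pose, best_inliers = None, []
    for _ in range(iters):
        sample_fn, solve_fn = random.choice(solvers)   # pick a solver this iteration
        sample = sample_fn(data)
        for pose in solve_fn(sample):
            inliers = [d for d in data if score_fn(pose, d) < inlier_thr]
            if len(inliers) > len(best_inliers):
                best_pose, best_inliers = pose, inliers
    return best_pose, best_inliers  # best_inliers can feed a non-linear refinement step
```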
Citations: 8
Private-kNN: Practical Differential Privacy for Computer Vision
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.01187
Yuqing Zhu, Xiang Yu, Manmohan Chandraker, Yu-Xiang Wang
With increasing ethical and legal concerns on privacy for deep models in visual recognition, differential privacy has emerged as a mechanism to disguise membership of sensitive data in training datasets. Recent methods like Private Aggregation of Teacher Ensembles (PATE) leverage a large ensemble of teacher models trained on disjoint subsets of private data, to transfer knowledge to a student model with privacy guarantees. However, labeled vision data is often expensive and datasets, when split into many disjoint training sets, lead to significantly sub-optimal accuracy and thus hardly sustain good privacy bounds. We propose a practically data-efficient scheme based on private release of k-nearest neighbor (kNN) queries, which altogether avoids splitting the training dataset. Our approach allows the use of privacy-amplification by subsampling and iterative refinement of the kNN feature embedding. We rigorously analyze the theoretical properties of our method and demonstrate strong experimental performance on practical computer vision datasets for face attribute recognition and person reidentification. In particular, we achieve comparable or better accuracy than PATE while reducing more than 90% of the privacy loss, thereby providing the “most practical method to-date” for private deep learning in computer vision.
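A stripped-down version of answering one query with a private kNN release looks like the sketch below: subsample the private set (the source of privacy amplification), take the k nearest neighbors of the query in feature space, and add noise to the label vote histogram before releasing the argmax. Parameter values, the Gaussian noise scale, and the function name are assumptions; the paper's calibrated privacy accounting is not shown.

```python
import numpy as np

def private_knn_label(query_feat, private_feats, private_labels, num_classes,
                      k=50, sample_rate=0.1, sigma=20.0, rng=np.random.default_rng(0)):
    """Answer one query with a noisy kNN vote over a random subsample of the
    private set. Toy sketch of the mechanism, not a calibrated implementation."""
    idx = rng.random(len(private_feats)) < sample_rate   # Poisson-style subsample
    feats, labels = private_feats[idx], private_labels[idx]
    dists = np.linalg.norm(feats - query_feat, axis=1)   # distances in feature space
    nearest = labels[np.argsort(dists)[:k]]              # labels of the k nearest neighbors
    votes = np.bincount(nearest, minlength=num_classes).astype(float)
    votes += rng.normal(scale=sigma, size=num_classes)   # noisy label aggregation
    return int(np.argmax(votes))                         # released pseudo-label for the student
```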
Citations: 55
DLWL: Improving Detection for Lowshot Classes With Weakly Labelled Data
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00936
Vignesh Ramanathan, Rui Wang, D. Mahajan
Large detection datasets have a long tail of lowshot classes with very few bounding box annotations. We wish to improve detection for lowshot classes with weakly labelled web-scale datasets that only have image-level labels. This requires a detection framework that can be jointly trained with a limited number of bounding-box-annotated images and a large number of weakly labelled images. Towards this end, we propose a modification to the FRCNN model to automatically infer label assignment for object proposals from weakly labelled images during training. We pose this label assignment as a Linear Program with constraints on the number and overlap of object instances in an image. We show that this can be solved efficiently during training for weakly labelled images. Compared to training with only a few annotated examples, augmenting with weakly labelled examples in our framework provides significant gains. We demonstrate this on the LVIS dataset (a 3.5 gain in AP) as well as on different lowshot variants of the COCO dataset. We provide a thorough analysis of the effect of the amount of weakly labelled and fully labelled data required to train the detection model. Our DLWL framework can also outperform self-supervised baselines such as omni-supervision for lowshot classes.
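The label assignment described above can be illustrated as a small linear program: select roughly the expected number of proposals for an image-level class while forbidding highly overlapping proposals from being chosen together. Below is a toy SciPy sketch of that relaxation; the variable names, thresholds, and exact constraint set are assumptions, not the DLWL formulation.

```python
import numpy as np
from scipy.optimize import linprog

def assign_labels_lp(scores, overlaps, expected_count, overlap_thr=0.5):
    """LP relaxation of label assignment for one weakly labelled class.

    scores: (N,) proposal scores for the class; overlaps: (N, N) IoU matrix.
    Maximizes total score of selected proposals subject to a count constraint
    and pairwise non-overlap constraints. Toy stand-in, not the paper's LP."""
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    a_ub, b_ub = [], []
    for i in range(n):
        for j in range(i + 1, n):
            if overlaps[i, j] > overlap_thr:
                row = np.zeros(n); row[i] = 1.0; row[j] = 1.0
                a_ub.append(row); b_ub.append(1.0)      # y_i + y_j <= 1 for overlapping pairs
    a_eq = np.ones((1, n)); b_eq = [expected_count]      # sum_i y_i == expected instance count
    res = linprog(-scores,                                # linprog minimizes, so negate scores
                  A_ub=np.array(a_ub) if a_ub else None,
                  b_ub=np.array(b_ub) if b_ub else None,
                  A_eq=a_eq, b_eq=b_eq, bounds=[(0.0, 1.0)] * n)
    return res.x  # fractional assignment; threshold/round to obtain pseudo-boxes
```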
Citations: 20
Spherical Space Domain Adaptation With Robust Pseudo-Label Loss
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00912
Xiang Gu, Jian Sun, Zongben Xu
Adversarial domain adaptation (DA) has been an effective approach for learning domain-invariant features via adversarial training. In this paper, we propose a novel adversarial DA approach defined entirely in spherical feature space, in which we define a spherical classifier for label prediction and a spherical domain discriminator for discriminating domain labels. To utilize pseudo-labels robustly, we develop a robust pseudo-label loss in the spherical feature space, which weights the importance of the estimated labels of target data by the posterior probability of correct labeling, modeled by a Gaussian-uniform mixture model in spherical feature space. Extensive experiments show that our method achieves state-of-the-art results, and they also confirm the effectiveness of the spherical classifier, the spherical discriminator, and the spherical robust pseudo-label loss.
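The core of the robust pseudo-label loss is a per-sample weighting: pseudo-labelled target examples contribute in proportion to the estimated posterior probability that their label is correct. The PyTorch sketch below shows only that weighting step and takes the posterior as a given input; fitting the Gaussian-uniform mixture on the sphere is not shown, and the names used are illustrative.

```python
import torch
import torch.nn.functional as F

def robust_pseudo_label_loss(logits, pseudo_labels, correct_prob):
    """Per-sample cross-entropy on target-domain pseudo-labels, down-weighted by
    the estimated posterior probability that each pseudo-label is correct.
    correct_prob would come from a Gaussian-uniform mixture fitted to an error
    statistic (e.g. angular distance to the class center on the sphere); here it
    is simply an input. Illustrative sketch only."""
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction="none")  # (B,)
    return (correct_prob * per_sample).mean()

# Hypothetical usage with assumed tensors:
logits = torch.randn(8, 10)
pseudo = torch.randint(0, 10, (8,))
w = torch.rand(8)                 # posterior P(label correct | statistic)
loss = robust_pseudo_label_loss(logits, pseudo, w)
```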
Citations: 97
Instance Guided Proposal Network for Person Search
Pub Date : 2020-06-01 DOI: 10.1109/CVPR42600.2020.00266
Wenkai Dong, Zhaoxiang Zhang, Chunfeng Song, T. Tan
Person detection networks have been widely used in person search. These detectors discriminate persons from the background and, for each query, generate proposals for all persons in a gallery of scene images. However, such a large number of proposals has a negative influence on the subsequent identity matching process because many distractors are involved. In this paper, we propose a new detection network for person search, named the Instance Guided Proposal Network (IGPN), which can learn the similarity between query persons and proposals. Thus, we can reduce the number of proposals according to the similarity scores. To incorporate information from the query into the detection network, we introduce the Siamese region proposal network into Faster-RCNN, and we propose improved cross-correlation layers to alleviate the imbalance of parameter distributions. Furthermore, we design a local relation block and a global relation branch to leverage proposal-proposal relations and query-scene relations, respectively. Extensive experiments show that our method improves person search performance by decreasing proposals and achieves competitive performance on two large person search benchmark datasets, CUHK-SYSU and PRW.
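The effect of learning query-proposal similarity can be pictured as a filtering step before identity matching: each proposal is scored against the query embedding and low-scoring distractors are dropped. The following PyTorch sketch is a toy stand-in with an assumed cosine-similarity score and threshold, not the IGPN architecture.

```python
import torch
import torch.nn.functional as F

def filter_proposals_by_query(query_feat, proposal_feats, boxes, keep_thr=0.3):
    """Score each detection proposal by cosine similarity to the query person's
    embedding and drop low-scoring ones, so the later re-identification stage
    sees far fewer distractors. Illustrative sketch with assumed inputs:
    query_feat (D,), proposal_feats (N, D), boxes (N, 4)."""
    q = F.normalize(query_feat, dim=0)        # unit-norm query embedding
    p = F.normalize(proposal_feats, dim=1)    # unit-norm proposal embeddings
    sims = p @ q                              # (N,) cosine similarities
    keep = sims >= keep_thr
    return boxes[keep], sims[keep]
```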
Citations: 66
Advancing High Fidelity Identity Swapping for Forgery Detection
Pub Date : 2020-06-01 DOI: 10.1109/cvpr42600.2020.00512
Lingzhi Li, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen
In this work, we study various existing benchmarks for deepfake detection research. In particular, we examine a novel two-stage face swapping algorithm, called FaceShifter, for high-fidelity and occlusion-aware face swapping. Unlike many existing face swapping works that leverage only limited information from the target image when synthesizing the swapped face, FaceShifter generates the swapped face with high fidelity by exploiting and integrating the target attributes thoroughly and adaptively. FaceShifter can handle facial occlusions with a second synthesis stage consisting of a Heuristic Error Acknowledging Refinement Network (HEAR-Net), which is trained to recover anomaly regions in a self-supervised way without any manual annotations. Experiments show that existing deepfake detection algorithms perform poorly on FaceShifter, since it achieves advantageous quality over all existing benchmarks. However, our newly developed Face X-Ray method can reliably detect forged images created by FaceShifter.
Citations: 133