
Latest publications from the 2013 IEEE Conference on Computer Vision and Pattern Recognition

Poselet Conditioned Pictorial Structures
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.82
L. Pishchulin, Mykhaylo Andriluka, Peter Gehler, B. Schiele
In this paper we consider the challenging problem of articulated human pose estimation in still images. We observe that despite the high variability of body articulations, human motions and activities often simultaneously constrain the positions of multiple body parts. Modelling such higher-order part dependencies seemingly comes at the cost of more expensive inference, which has resulted in their limited use in state-of-the-art methods. In this paper we propose a model that incorporates higher-order part dependencies while remaining efficient. We achieve this by defining a conditional model in which all body parts are connected a priori, but which becomes a tractable tree-structured pictorial structures model once the image observations are available. In order to derive a set of conditioning variables we rely on poselet-based features, which have been shown to be effective for people detection but have so far found limited application in articulated human pose estimation. We demonstrate the effectiveness of our approach on three publicly available pose estimation benchmarks, improving on or matching the state of the art in each case.
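The tractability claim rests on the tree structure: once the model reduces to a tree, MAP inference is exact via min-sum dynamic programming. A minimal sketch of that inference step follows; the part names, costs, and tree are illustrative stand-ins, not the paper's model (which additionally conditions the terms on poselet responses).

```python
def map_inference(parts, parent, unary, pairwise, n_states):
    """Exact MAP (min-cost) inference on a tree-structured part model.

    parts    : parts in topological order, parts[0] is the root
    parent   : dict mapping each non-root part to its parent
    unary    : unary[p][s] -> cost of placing part p in state s
    pairwise : pairwise[p][sc][sp] -> deformation cost of child state sc
               given its parent's state sp
    """
    children = {p: [] for p in parts}
    for c, par in parent.items():
        children[par].append(c)

    msg, arg = {}, {}
    for p in reversed(parts[1:]):          # upward pass: leaves to root
        msg[p], arg[p] = [], []
        for sp in range(n_states):
            cost, best = min(
                (unary[p][sc] + pairwise[p][sc][sp]
                 + sum(msg[c][sc] for c in children[p]), sc)
                for sc in range(n_states))
            msg[p].append(cost)
            arg[p].append(best)

    root = parts[0]
    cost, s_root = min(
        (unary[root][s] + sum(msg[c][s] for c in children[root]), s)
        for s in range(n_states))

    state = {root: s_root}                 # downward pass: read off argmins
    for p in parts[1:]:
        state[p] = arg[p][state[parent[p]]]
    return cost, state
```

Runtime is O(parts × states²), which is what makes conditioning on image observations to obtain a tree worthwhile.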
Citations: 337
Gauging Association Patterns of Chromosome Territories via Chromatic Median
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.171
Hu Ding, B. Stojković, R. Berezney, Jinhui Xu
Computing accurate and robust organizational patterns of chromosome territories inside the cell nucleus is critical for understanding several fundamental genomic processes, such as co-regulation of gene activation, gene silencing, X chromosome inactivation, and abnormal chromosome rearrangement in cancer cells. The use of advanced fluorescence labeling and image processing techniques has enabled researchers to investigate interactions of chromosome territories at large spatial resolution. The resulting high volume of generated data demands high-throughput and automated image analysis methods. In this paper, we introduce a novel algorithmic tool for investigating association patterns of chromosome territories in a population of cells. Our method takes as input a set of graphs, one for each cell, containing information about spatial interactions of chromosome territories, and yields a single graph that contains essential information for the whole population and stands as its structural representative. We formulate this combinatorial problem as a semi-definite program and present novel techniques to solve it efficiently. We validate our approach on both artificial and real biological data; the experimental results suggest that our approach yields a near-optimal solution and can handle large datasets, which are significant improvements over existing techniques.
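To make the "structural representative" idea concrete: the paper solves a semi-definite program over the per-cell interaction graphs; as a far simpler illustrative baseline (not the paper's method), one can build a consensus graph that keeps every territory-interaction edge present in a majority of cells.

```python
from collections import Counter

def consensus_graph(graphs):
    """Naive majority-vote stand-in for a representative graph.

    graphs: list of edge sets, one per cell; each edge is a frozenset of
    two chromosome-territory labels. Returns the set of edges that appear
    in more than half of the input graphs.
    """
    counts = Counter(e for g in graphs for e in g)
    half = len(graphs) / 2
    return {e for e, c in counts.items() if c > half}
```

Unlike this vote, the SDP formulation optimizes a global objective over all cells jointly, which is why it can be near-optimal where simple voting is not.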
Citations: 11
Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.263
Andrew Kae, Kihyuk Sohn, Honglak Lee, E. Learned-Miller
Conditional random fields (CRFs) provide powerful tools for building models to label image segments. They are particularly well-suited to modeling local interactions among adjacent regions (e.g., superpixels). However, CRFs are limited in dealing with complex, global (long-range) interactions between regions. Complementary to this, restricted Boltzmann machines (RBMs) can be used to model global shapes produced by segmentation models. In this work, we present a new model that uses the combined power of these two network types to build a state-of-the-art labeler. Although the CRF is a good baseline labeler, we show how an RBM can be added to the architecture to provide a global shape bias that complements the local modeling provided by the CRF. We demonstrate its labeling performance on the parts of complex face images from the Labeled Faces in the Wild data set. This hybrid model produces results that are both quantitatively and qualitatively better than the CRF alone. In addition to high-quality labeling results, we demonstrate that the hidden units in the RBM portion of our model can be interpreted as face attributes that have been learned without any attribute-level supervision.
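The "hidden units as attributes" observation comes directly from the RBM's conditional: given a binary label mask, each hidden unit's activation probability is a sigmoid of a weighted sum. A minimal sketch of that computation, with illustrative weights (not the paper's trained model):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b):
    """RBM hidden activations given visible units.

    P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W[i][j])
    v : binary visible vector (e.g., a flattened shape mask)
    W : weight matrix, W[i][j] connects visible i to hidden j
    b : hidden biases
    """
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]
```

In the hybrid model these activations act as the global shape bias that the purely local CRF potentials lack.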
Citations: 186
Unnatural L0 Sparse Representation for Natural Image Deblurring
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.147
Li Xu, Shicheng Zheng, Jiaya Jia
We show in this paper that the success of previous maximum a posteriori (MAP) based blur removal methods partly stems from their respective intermediate steps, which implicitly or explicitly create an unnatural representation containing salient image structures. We propose a generalized and mathematically sound L0 sparse expression, together with a new effective method, for motion deblurring. Our system does not require extra filtering during optimization and exhibits a fast decrease in energy, so a small number of iterations suffices for convergence. It also provides a unified framework for both uniform and non-uniform motion deblurring. We extensively validate our method and compare it with other approaches with respect to convergence speed, running time, and result quality.
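The "unnatural representation" can be made concrete with the scalar subproblem that an L0 penalty induces in alternating optimization: minimize (u - g)² + λ·[u ≠ 0] for each gradient value g. Its closed-form solution is hard thresholding, sketched below (a generic illustration of the L0 proximal step, not the paper's full deblurring pipeline):

```python
def l0_threshold(grads, lam):
    """Element-wise closed-form solution of
        min_u (u - g)^2 + lam * [u != 0]
    Setting u = g costs lam; setting u = 0 costs g^2.
    So keep g exactly when g^2 > lam, else zero it out.
    """
    return [g if g * g > lam else 0.0 for g in grads]
```

Thresholding gradients this way keeps only salient edges, the sparse, "unnatural" structure that guides kernel estimation.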
Citations: 989
Illumination Estimation Based on Bilayer Sparse Coding
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.187
Bing Li, Weihua Xiong, Weiming Hu, Houwen Peng
Computational color constancy is a very important topic in computer vision and has attracted many researchers' attention. Recently, much research has shown the benefit of using high-level visual content cues for improving illumination estimation. However, nearly all existing methods are essentially combinational strategies, in which image content analysis is only used to guide the combination or selection among a variety of individual illumination estimation methods. In this paper, we propose a novel bilayer sparse coding model for illumination estimation that considers image similarity in terms of both low-level color distribution and high-level scene content simultaneously. To this end, the image's scene content information is integrated with its color distribution to obtain an optimal illumination estimation model. Experimental results on real-world image sets show that our algorithm is superior to several prevailing illumination estimation methods, and even better than some combinational methods.
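As a heavily simplified stand-in for the bilayer idea (not the sparse coding model itself), one can score training images by a weighted sum of color-distribution similarity and scene-content similarity, then average the illuminants of the top matches. All names and weights here are illustrative.

```python
def estimate_illuminant(color_sim, scene_sim, illums, w=0.5, k=2):
    """Nearest-neighbor illuminant estimate from two similarity cues.

    color_sim[i] : low-level color-distribution similarity to training image i
    scene_sim[i] : high-level scene-content similarity to training image i
    illums[i]    : ground-truth illuminant (RGB) of training image i
    Combines the two cues with weight w, picks the k best matches,
    and returns their mean illuminant.
    """
    sims = [w * c + (1 - w) * s for c, s in zip(color_sim, scene_sim)]
    top = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)[:k]
    dims = len(illums[0])
    return [sum(illums[i][d] for i in top) / k for d in range(dims)]
```

The bilayer sparse coding model replaces this ad-hoc weighting with learned sparse reconstruction coefficients over both layers.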
Citations: 15
Harry Potter's Marauder's Map: Localizing and Tracking Multiple Persons-of-Interest by Nonnegative Discretization
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.476
Shoou-I Yu, Yi Yang, Alexander Hauptmann
A device just like Harry Potter's Marauder's Map, which pinpoints the location of each person-of-interest at all times, provides invaluable information for the analysis of surveillance videos. To make this device real, a system would be required to perform robust person localization and tracking in real-world surveillance scenarios, especially in complex indoor environments with many occluding walls and long corridors with sparse surveillance camera coverage. We propose a tracking-by-detection approach with nonnegative discretization to tackle this problem. Given a set of person detection outputs, our framework takes advantage of all important cues such as color, person detection, face recognition and non-background information to perform tracking. Local learning approaches are used to uncover the manifold structure in the appearance space with spatio-temporal constraints. Nonnegative discretization is used to enforce the mutual exclusion constraint, which guarantees that a person detection output belongs to exactly one individual. Experiments show that our algorithm performs robust localization and tracking of persons-of-interest not only in outdoor scenes, but also in a complex indoor real-world nursing home environment.
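The mutual exclusion constraint is the key structural property: each detection's soft identity scores must be discretized to a one-hot assignment, so every detection belongs to exactly one person. A toy illustration of that constraint (row-wise argmax, not the paper's full nonnegative optimization):

```python
def discretize(scores):
    """One-hot identity assignment under mutual exclusion.

    scores[d][p] : soft affinity of detection d to person-of-interest p.
    Returns a 0/1 matrix where each row has exactly one 1 (the argmax),
    i.e., each detection is assigned to exactly one individual.
    """
    out = []
    for row in scores:
        k = max(range(len(row)), key=row.__getitem__)
        out.append([1 if j == k else 0 for j in range(len(row))])
    return out
```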
Citations: 84
GRASP Recurring Patterns from a Single View
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.261
Jingchen Liu, Yanxi Liu
We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have systematically quantified the performance of our approach under stressed input conditions (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state-of-the-art methods for recurring pattern discovery on a diverse set of 400+ real-world and synthesized test images.
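GRASP itself is a generic metaheuristic: repeat a greedy randomized construction (sampling from a restricted candidate list) followed by local search, keeping the best solution found. The skeleton below shows that loop on a toy objective (max-weight k-subset); the objective and moves are ours, not the paper's assignment-specific moves.

```python
import random

def grasp_max_k_subset(weights, k, iters=20, alpha=0.5, seed=0):
    """GRASP skeleton: greedy randomized construction + local search."""
    rng = random.Random(seed)
    items = list(range(len(weights)))
    best = None
    for _ in range(iters):
        # Construction: repeatedly sample from the restricted candidate
        # list (RCL) of the alpha-fraction best remaining items.
        remaining = sorted(items, key=lambda i: -weights[i])
        sol = []
        while len(sol) < k:
            rcl = remaining[:max(1, int(alpha * len(remaining)))]
            pick = rng.choice(rcl)
            sol.append(pick)
            remaining.remove(pick)
        # Local search: swap any member for a strictly heavier non-member
        # until no improving swap exists.
        improved = True
        while improved:
            improved = False
            for i in list(sol):
                for j in items:
                    if j not in sol and weights[j] > weights[i]:
                        sol[sol.index(i)] = j
                        improved = True
                        break
                if improved:
                    break
        score = sum(weights[i] for i in sol)
        if best is None or score > best[0]:
            best = (score, sorted(sol))
    return best
```

The randomized construction diversifies starting points; the problem-specific local moves (in the paper, joint visual-word/instance reassignments) drive fast convergence.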
Citations: 33
Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.409
Byung-soo Kim, Shili Xu, S. Savarese
In this paper we focus on the problem of detecting objects in 3D from RGB-D images. We propose a novel framework that explores the compatibility between segmentation hypotheses of the object in the image and the corresponding 3D map. Our framework allows us to discover the optimal location of the object using a generalization of the structural latent SVM formulation to 3D, as well as a new loss function defined over the 3D space during training. We evaluate our method on two existing RGB-D datasets. Extensive quantitative and qualitative experimental results show that our proposed approach outperforms state-of-the-art methods as well as a number of baseline approaches on both 3D and 2D object recognition tasks.
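At prediction time, a latent SVM scores a candidate object location by maximizing a linear score over latent variables, here the segmentation hypotheses: score(x) = max_h w·φ(x, h). A toy sketch of that scoring rule (hand-made features; the real φ combines image and 3D-map cues):

```python
def latent_score(w, features_per_hypothesis):
    """Latent SVM scoring: maximize w . phi over latent hypotheses.

    w : learned weight vector
    features_per_hypothesis : phi(x, h) for each latent hypothesis h
    Returns (best score, index of the best hypothesis).
    """
    scores = [sum(wi * fi for wi, fi in zip(w, f))
              for f in features_per_hypothesis]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best
```

The selected hypothesis is what ties the 2D segmentation evidence to the 3D localization of the object.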
Citations: 74
From Local Similarity to Global Coding: An Application to Image Classification
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.360
Amirreza Shaban, H. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad
Bag-of-words models for feature extraction have demonstrated top-notch performance in image classification. These representations are usually accompanied by a coding method. Recently, methods that code a descriptor with regard to its nearby bases have proved efficacious. These methods take into account the nonlinear structure of descriptors, since local similarities are a good approximation of global similarities. However, they confine their use of global similarities to nearby bases. In this paper, we propose a coding scheme that brings into focus the manifold structure of descriptors, and devise a method to compute the global similarities of descriptors to the bases. Given a local similarity measure between bases, a global measure is computed. Exploiting the local similarity of a descriptor and its nearby bases, a global measure of the association of the descriptor to all the bases is computed. Unlike locality-based and sparse coding methods, the proposed coding varies smoothly with respect to the underlying manifold. Experiments on benchmark image classification datasets substantiate the superiority of the proposed method over its locality- and sparsity-based rivals.
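One standard way to turn local similarities into global ones, shown here as a generic illustration rather than the paper's exact scheme, is diffusion: row-normalize the local similarity matrix into a transition matrix and take powers, so similarity propagates along the manifold to bases that share no direct local similarity.

```python
def row_normalize(W):
    out = []
    for row in W:
        s = sum(row) or 1.0        # avoid division by zero on empty rows
        out.append([w / s for w in row])
    return out

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def global_similarity(W, steps=3):
    """Diffuse local similarities: P^steps of the transition matrix P."""
    P = row_normalize(W)
    G = P
    for _ in range(steps - 1):
        G = matmul(G, P)
    return G
```

For a 3-node chain, nodes 0 and 2 have zero local similarity but acquire positive global similarity after two diffusion steps, which is the effect the coding scheme exploits.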
Citations: 34
Discriminative Color Descriptors
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.369
Rahat Khan, Joost van de Weijer, F. Khan, Damien Muselet, C. Ducottet, C. Barat
Color description is a challenging task because of the large variations in RGB values that occur due to scene accidental events such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop in the discriminative power of the color description. In this paper we take an information-theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective of minimizing the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, based on data sets other than the one at hand, can obtain competitive performance. Experiments show that the proposed descriptor outperforms existing photometric invariants.
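The clustering objective is stated in terms of mutual information I(color cluster; class): merges should lose as little of it as possible. The sketch below just evaluates that quantity from a joint count table, a building block for such a criterion, not the paper's full clustering algorithm.

```python
import math

def mutual_information(joint):
    """Mutual information (in bits) of a joint count table.

    joint[c][k] : co-occurrence count of color bin c with class k.
    I = sum_{c,k} p(c,k) * log2( p(c,k) / (p(c) * p(k)) )
    """
    total = sum(sum(row) for row in joint)
    pc = [sum(row) / total for row in joint]
    pk = [sum(col) / total for col in zip(*joint)]
    mi = 0.0
    for c, row in enumerate(joint):
        for k, n in enumerate(row):
            if n:
                p = n / total
                mi += p * math.log2(p / (pc[c] * pk[k]))
    return mi
```

A greedy version of the paper's idea would repeatedly merge the pair of color clusters whose merge decreases this value the least, trading resolution for discriminative compactness.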
Citations: 128
Journal
2013 IEEE Conference on Computer Vision and Pattern Recognition