
2011 International Conference on Computer Vision: Latest Publications

Actively selecting annotations among objects and attributes
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126395
Adriana Kovashka, Sudheendra Vijayanarasimhan, K. Grauman
We present an active learning approach to choose image annotation requests among both object category labels and the objects' attribute labels. The goal is to solicit those labels that will best use human effort when training a multi-class object recognition model. In contrast to previous work in active visual category learning, our approach directly exploits the dependencies between human-nameable visual attributes and the objects they describe, shifting its requests in either label space accordingly. We adopt a discriminative latent model that captures object-attribute and attribute-attribute relationships, and then define a suitable entropy reduction selection criterion to predict the influence a new label might have throughout those connections. On three challenging datasets, we demonstrate that the method can more successfully accelerate object learning relative to both passive learning and traditional active learning approaches.
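The entropy-reduction selection criterion can be illustrated with a minimal sketch. All names and toy numbers below are hypothetical; the paper's criterion operates over a discriminative latent object-attribute model, not the flat posteriors used here. The idea is the same: for each candidate annotation request, compare the model's entropy now with its expected entropy after receiving an answer, and request the label with the largest expected drop.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_entropy_reduction(p_answer, p_posteriors):
    """Expected drop in class entropy from making one annotation request.

    p_answer     : (K,) prior over the K possible answers to the request
    p_posteriors : (K, C) class posterior over C classes, one row per answer
    """
    prior = p_answer @ p_posteriors  # current marginal class distribution
    h_before = entropy(prior)
    h_after = sum(p_answer[k] * entropy(p_posteriors[k])
                  for k in range(len(p_answer)))
    return h_before - h_after

# Toy comparison: an object-label request whose answer would be decisive
# versus an attribute-label request whose answer barely changes the model.
candidates = {
    "object_label":    (np.array([0.5, 0.5]), np.array([[0.9, 0.1], [0.2, 0.8]])),
    "attribute_label": (np.array([0.7, 0.3]), np.array([[0.6, 0.4], [0.5, 0.5]])),
}
best = max(candidates, key=lambda c: expected_entropy_reduction(*candidates[c]))
```

In this toy setup the object-label request wins, because either answer leaves the classifier far more certain than before.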
Citations: 75
Superpixels via pseudo-Boolean optimization
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126393
Yuhang Zhang, R. Hartley, J. Mashford, S. Burn
We propose an algorithm for creating superpixels. The major step in our algorithm is simply minimizing two pseudo-Boolean functions. The processing time of our algorithm on images of moderate size is only half a second. Experiments on a benchmark dataset show that our method produces superpixels of quality comparable to existing algorithms. Last but not least, the speed of our algorithm is independent of the number of superpixels, which is usually the bottleneck for traditional superpixel-creation algorithms.
Citations: 101
Who Blocks Who: Simultaneous clothing segmentation for grouping images
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126412
Nan Wang, H. Ai
Clothing is one of the most informative cues of human appearance. In this paper, we propose a novel multi-person clothing segmentation algorithm for highly occluded images. The key idea is to combine blocking models to address person-wise occlusions. In contrast to the traditional layered model, which tries to solve the full layer-ranking problem, the proposed blocking model partitions the problem into a series of pair-wise ones and then determines the local blocking relationship based on individual and contextual information. Thus, it is capable of dealing with cases involving a large number of people. Additionally, we propose a layout model, formulated as a Markov network, which incorporates the blocking relationship to pursue an approximately optimal clothing layout for groups of people. Experiments on a group-image dataset demonstrate the effectiveness of our algorithm.
Citations: 74
Tabula rasa: Model transfer for object category detection
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126504
Y. Aytar, Andrew Zisserman
Our objective is transfer training of a discriminatively trained object category detector, in order to reduce the number of training images required. To this end we propose three transfer learning formulations where a template learnt previously for other categories is used to regularize the training of a new category. All the formulations result in convex optimization problems. Experiments (on PASCAL VOC) demonstrate significant performance gains by transfer learning from one class to another (e.g. motorbike to bicycle), including one-shot learning, specialization from class to a subordinate class (e.g. from quadruped to horse) and transfer using multiple components. In the case of multiple training samples it is shown that a detection performance approaching that of the state of the art can be achieved with substantially fewer training samples.
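One plausible form such a template-regularized objective can take is a hinge loss plus a quadratic penalty pulling the new weight vector toward a template learnt on a source category. This sketch is an assumption in the spirit of the abstract, not the paper's exact formulation (the paper proposes three formulations):

```python
import numpy as np

def train_transfer_svm(X, y, w_src, lam=1.0, gamma=0.5, lr=0.01, epochs=200):
    """Linear classifier regularized toward a source-category template.

    Minimizes  sum_i hinge(y_i, w.x_i) + lam * ||w - gamma * w_src||^2
    by gradient descent, so with few target samples w stays close to the
    transferred template. Illustrative only; the paper's formulations differ.
    """
    w = gamma * w_src.copy()          # warm-start from the source template
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1          # samples violating the margin
        grad = -(y[active, None] * X[active]).sum(axis=0)
        grad += 2 * lam * (w - gamma * w_src)
        w -= lr * grad
    return w

# Toy example: separable 2-D data, with a source template ("motorbike")
# already pointing along the discriminative axis of the new class.
X = np.array([[1.0, 0.2], [2.0, -0.5], [-1.5, 0.3], [-0.8, -0.7]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = train_transfer_svm(X, y, w_src=np.array([1.0, 0.0]))
```

Because the penalty is quadratic and the hinge loss convex, the regularized objective remains convex, matching the abstract's claim that all formulations yield convex problems.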
Citations: 367
Learning nonlinear distance functions using neural network for regression with application to robust human age estimation
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126249
N. Fan
In this paper, a robust regression method is proposed for human age estimation, in which outlier samples are corrected by their neighbors by asymptotically increasing the correlation coefficients between the desired distances and the distances of sample labels. As a further extension, we adopt a nonlinear distance function and approximate it with a neural network. For fair comparison, we also experiment on the regression problem of age estimation from face images, and the results are very competitive with the state of the art.
Citations: 15
Semi-supervised learning and optimization for hypergraph matching
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126507
Marius Leordeanu, Andrei Zanfir, C. Sminchisescu
Graph and hypergraph matching are important problems in computer vision. They are successfully used in many applications requiring 2D or 3D feature matching, such as 3D reconstruction and object recognition. While graph matching is limited to using pairwise relationships, hypergraph matching permits the use of relationships between sets of features of any order. Consequently, it carries the promise of making matching more robust to changes in scale, deformations and outliers. In this paper we make two contributions. First, we present the first semi-supervised algorithm for learning the parameters that control the hypergraph matching model, and demonstrate experimentally that it significantly improves the performance of current state-of-the-art methods. Second, we propose a novel, efficient hypergraph matching algorithm that outperforms the state of the art and, when used in combination with other higher-order matching algorithms, consistently improves their performance.
Citations: 69
Robust and efficient parametric face alignment
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126452
Georgios Tzimiropoulos, S. Zafeiriou, M. Pantic
We propose a correlation-based approach to parametric object alignment particularly suitable for face analysis applications which require efficiency and robustness against occlusions and illumination changes. Our algorithm registers two images by iteratively maximizing their correlation coefficient using gradient ascent. We compute this correlation coefficient from complex gradients which capture the orientation of image structures rather than pixel intensities. The maximization of this gradient correlation coefficient results in an algorithm which is as computationally efficient as ℓ2 norm-based algorithms, can be extended within the inverse compositional framework (without the need for Hessian re-computation) and is robust to outliers. To the best of our knowledge, no other algorithm has been proposed so far having all three features. We show the robustness of our algorithm for the problem of face alignment in the presence of occlusions and non-uniform illumination changes. The code that reproduces the results of our paper can be found at http://ibug.doc.ic.ac.uk/resources.
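The correlation-coefficient objective at the heart of this approach can be illustrated minimally. The paper maximizes it over parametric warps with gradient ascent and computes it from gradient orientations rather than raw intensities; the sketch below instead uses raw intensities and an exhaustive search over integer translations, purely to show what is being optimized:

```python
import numpy as np

def corrcoef(a, b):
    """Pearson correlation coefficient between two equal-size arrays."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def align_translation(template, image, max_shift=5):
    """Find the integer translation maximizing the correlation coefficient.

    `image` must be the template size plus a 2*max_shift border. Exhaustive
    search stands in for the paper's gradient-ascent optimization.
    """
    h, w = template.shape
    best, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            patch = image[max_shift + dy : max_shift + dy + h,
                          max_shift + dx : max_shift + dx + w]
            r = corrcoef(template, patch)
            if r > best:
                best, best_shift = r, (dy, dx)
    return best_shift, best
```

Because the correlation coefficient normalizes out mean and scale, the objective is inherently insensitive to global brightness and contrast changes, which is part of what makes it attractive for face alignment under illumination variation.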
Citations: 57
Dynamic texture classification using dynamic fractal analysis
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126372
Yong Xu, Yuhui Quan, Haibin Ling, Hui Ji
In this paper, we developed a novel tool called dynamic fractal analysis for dynamic texture (DT) classification, which not only provides a rich description of DT but also has strong robustness to environmental changes. The resulting dynamic fractal spectrum (DFS) for DT sequences consists of two components: One is the volumetric dynamic fractal spectrum component (V-DFS) that captures the stochastic self-similarities of DT sequences as 3D volume datasets; the other is the multi-slice dynamic fractal spectrum component (S-DFS) that encodes fractal structures of DT sequences on 2D slices along different views of the 3D volume. Various types of measures of DT sequences are collected in our approach to analyze DT sequences from different perspectives. The experimental evaluation is conducted on three widely used benchmark datasets. In all the experiments, our method demonstrated excellent performance in comparison with state-of-the-art approaches.
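The volumetric component rests on fractal measurements of the DT sequence stacked as a 3D volume. A generic box-counting dimension estimator, shown below as a simplified stand-in for the paper's spectrum of fractal measures (all details here are assumptions, not the paper's definition of V-DFS), conveys the basic self-similarity measurement:

```python
import numpy as np

def box_counting_dimension(volume, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting fractal dimension of a binary 3D volume.

    Counts occupied boxes at several scales and fits the slope of
    log(count) against log(1/box_size).
    """
    counts = []
    for s in sizes:
        # partition the volume into s x s x s boxes and count occupied ones
        t, h, w = (d // s for d in volume.shape)
        v = volume[: t * s, : h * s, : w * s]
        blocks = v.reshape(t, s, h, s, w, s).any(axis=(1, 3, 5))
        counts.append(blocks.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope
```

A solid cube yields dimension 3 and a single occupied slice yields dimension 2, as expected; a DT volume falls in between, and collecting such measurements over several measures and views is what builds the spectrum.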
Citations: 106
Modeling spatial layout with Fisher vectors for image categorization
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126406
Josip Krapac, J. Verbeek, F. Jurie
We introduce an extension of bag-of-words image representations to encode spatial layout. Using the Fisher kernel framework we derive a representation that encodes the spatial mean and the variance of image regions associated with visual words. We extend this representation by using a Gaussian mixture model to encode spatial layout, and show that this model is related to a soft-assign version of the spatial pyramid representation. We also combine our representation of spatial layout with the use of Fisher kernels to encode the appearance of local features. Through an extensive experimental evaluation, we show that our representation yields state-of-the-art image categorization results, while being more compact than spatial pyramid representations. In particular, using Fisher kernels to encode both appearance and spatial layout results in an image representation that is computationally efficient, compact, and yields excellent performance while using linear classifiers.
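The core encoding step can be sketched as follows: soft-assign each position-augmented local descriptor to a GMM, then accumulate normalized mean residuals. This is a simplified Fisher vector keeping only the gradients with respect to the means; the paper additionally encodes variances and a per-word spatial model, so treat the details below as illustrative assumptions:

```python
import numpy as np

def fisher_vector_means(descriptors, weights, means, sigmas):
    """Fisher-vector gradient w.r.t. GMM means (diagonal covariances).

    descriptors : (N, D) local descriptors; appending each patch's
                  normalized (x, y) position to D encodes spatial layout
    weights, means, sigmas : GMM parameters, shapes (K,), (K, D), (K, D)
    Returns a vector of length K * D.
    """
    # soft assignment (posterior) of each descriptor to each Gaussian
    diff = descriptors[:, None, :] - means[None, :, :]          # (N, K, D)
    log_p = -0.5 * np.sum((diff / sigmas) ** 2
                          + np.log(2 * np.pi * sigmas ** 2), axis=2)
    log_p += np.log(weights)
    gamma = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    gamma /= gamma.sum(axis=1, keepdims=True)                   # (N, K)

    # accumulate normalized residuals per Gaussian
    n = descriptors.shape[0]
    g = (gamma[:, :, None] * diff / sigmas ** 2).sum(axis=0)    # (K, D)
    g /= n * np.sqrt(weights)[:, None]
    return g.ravel()
```

Because the spatial coordinates enter through the same gradient machinery as appearance, the representation stays compact (K * D dimensions) rather than multiplying the vocabulary by a fixed grid as spatial pyramids do.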
Citations: 211
Learning component-level sparse representation using histogram information for image classification
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126410
Chen-Kuo Chiang, Chih-Hsueh Duan, S. Lai, Shih-Fu Chang
A novel component-level dictionary learning framework that exploits image group characteristics within sparse coding is introduced in this work. Unlike previous methods, which select the dictionaries that best reconstruct the data, we present an energy-minimization formulation that jointly optimizes the learning of both the sparse dictionary and component-level importance within one unified framework to give a discriminative representation for image groups. The importance measures how well each feature component represents the image group property with the dictionary, using histogram information. Then, dictionaries are updated iteratively to reduce the influence of unimportant components, thus refining the sparse representation for each image group. In the end, by keeping the top K important components, a compact representation is derived for the sparse coding dictionary. Experimental results on several public datasets demonstrate the superior performance of the proposed algorithm compared to state-of-the-art methods.
Citations: 15