
Latest publications from the 2011 International Conference on Computer Vision

Sparse dictionary-based representation and recognition of action attributes
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126307
Qiang Qiu, Zhuolin Jiang, R. Chellappa
We present an approach for dictionary learning of action attributes via information maximization. We unify the class distribution and appearance information into an objective function for learning a sparse dictionary of action attributes. The objective function maximizes the mutual information between what has been learned and what remains to be learned in terms of appearance information and class distribution for each dictionary item. We propose a Gaussian Process (GP) model for sparse representation to optimize the dictionary objective function. The sparse coding property allows a kernel with compact support in the GP to realize a very efficient dictionary learning process. Hence we can describe an action video by a set of compact and discriminative action attributes. More importantly, we can recognize modeled action categories in a sparse feature space, which can be generalized to unseen and unmodeled action categories. Experimental results demonstrate the effectiveness of our approach in action recognition applications.
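As a rough illustration of information-maximizing dictionary selection, the sketch below greedily keeps the candidate atom with the largest Gaussian Process posterior variance given the atoms already chosen, a common information-gain surrogate. It is not the authors' objective or code; the RBF kernel, toy descriptors, and parameter values are all assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Squared-exponential (RBF) covariance between rows of X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_infomax_dictionary(candidates, k, gamma=1.0, noise=1e-6):
    """Greedily pick k dictionary atoms; at each step keep the candidate whose
    value is most uncertain (largest GP posterior variance) given the atoms
    already selected, a simple information-gain surrogate."""
    n = len(candidates)
    K = rbf_kernel(candidates, candidates, gamma) + noise * np.eye(n)
    selected = []
    for _ in range(k):
        best, best_var = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            if selected:
                Kss = K[np.ix_(selected, selected)]
                Kis = K[i, selected]
                var = K[i, i] - Kis @ np.linalg.solve(Kss, Kis)
            else:
                var = K[i, i]
            if var > best_var:
                best, best_var = i, var
        selected.append(best)
    return candidates[selected]

# Toy usage: 200 candidate "action attribute" descriptors, keep 10 atoms.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
D = greedy_infomax_dictionary(X, k=10)
print(D.shape)  # (10, 16)
```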
Citations: 161
Illumination demultiplexing from a single image
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126220
Christine Chen, D. Vaquero, M. Turk
A class of techniques in computer vision and graphics is based on capturing multiple images of a scene under different illumination conditions. These techniques explore variations in illumination from image to image to extract interesting information about the scene. However, their applicability to dynamic environments is limited due to the need for robust motion compensation algorithms. To overcome this issue, we propose a method to separate multiple illuminants from a single image. Given an image of a scene simultaneously illuminated by multiple light sources, our method generates individual images as if they had been illuminated by each of the light sources separately. To facilitate the illumination separation process, we encode each light source with a distinct sinusoidal pattern, strategically selected given the relative position of each light with respect to the camera, such that the observed sinusoids become independent of the scene geometry. The individual illuminants are then demultiplexed by analyzing local frequencies. We show applications of our approach in image-based relighting, photometric stereo, and multiflash imaging.
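The following toy 1-D sketch conveys only the multiplexing idea: each illuminant is modulated by a distinct sinusoidal carrier, and the components are recovered by demodulation and low-pass filtering. It ignores the paper's geometry-aware pattern selection, and every signal, frequency, and cutoff below is made up for illustration.

```python
import numpy as np

def lowpass(x, cutoff_bins):
    # Crude low-pass: zero out FFT bins above the cutoff.
    X = np.fft.rfft(x)
    X[cutoff_bins:] = 0
    return np.fft.irfft(X, n=len(x))

n = 1024
x = np.arange(n)
albedo_a = 0.6 + 0.3 * np.sin(2 * np.pi * x / 400)   # scene as lit by source A alone
albedo_b = 0.5 + 0.2 * np.cos(2 * np.pi * x / 250)   # scene as lit by source B alone

f_a, f_b = 60, 120                                   # distinct carrier frequencies (cycles per signal)
carrier_a = 1 + np.cos(2 * np.pi * f_a * x / n)      # non-negative sinusoidal illumination codes
carrier_b = 1 + np.cos(2 * np.pi * f_b * x / n)

observed = albedo_a * carrier_a + albedo_b * carrier_b   # single multiplexed "image" row

# Coherent demodulation: multiply by each carrier's AC part, low-pass,
# and undo the factor 1/2 introduced by averaging cos^2.
rec_a = lowpass(observed * np.cos(2 * np.pi * f_a * x / n), 30) * 2
rec_b = lowpass(observed * np.cos(2 * np.pi * f_b * x / n), 30) * 2

print(np.abs(rec_a - albedo_a).mean(), np.abs(rec_b - albedo_b).mean())
```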
Citations: 3
Linear dependency modeling for feature fusion
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126477
A. J. Ma, P. Yuen
This paper addresses the independence assumption issue in the fusion process. In the last decade, dependency modeling techniques were developed under a specific distribution of classifiers. This paper proposes a new framework to model the dependency between features without any assumption on the feature/classifier distribution. We prove that feature dependency can be modeled by a linear combination of the posterior probabilities under some mild assumptions. Based on the linear combination property, two methods, namely Linear Classifier Dependency Modeling (LCDM) and Linear Feature Dependency Modeling (LFDM), are derived and developed for dependency modeling at the classifier level and the feature level, respectively. The optimal models for LCDM and LFDM are learned by maximizing the margin between the genuine and imposter posterior probabilities. Both synthetic data and real datasets are used for experiments. Experimental results show that LFDM outperforms all existing combination methods.
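As a loose analogue of classifier-level fusion with margin-maximized weights, the sketch below learns one non-negative weight per classifier by gradient descent on a hinge loss between the fused genuine-class posterior and the best imposter-class posterior. It is not the LCDM/LFDM formulation; the data, margin, and learning rate are assumptions.

```python
import numpy as np

def learn_fusion_weights(posteriors, labels, margin=0.2, epochs=300, lr=0.05):
    """posteriors: (n_classifiers, n_samples, n_classes) per-classifier class
    posteriors. Learns non-negative fusion weights by hinge-loss gradient descent
    so the fused genuine-class posterior beats the best imposter class by `margin`."""
    m, n, _ = posteriors.shape
    w = np.ones(m) / m
    idx = np.arange(n)
    for _ in range(epochs):
        fused = np.tensordot(w, posteriors, axes=1)            # (n_samples, n_classes)
        genuine = fused[idx, labels]
        masked = fused.copy()
        masked[idx, labels] = -np.inf
        imposter_cls = masked.argmax(axis=1)
        imposter = fused[idx, imposter_cls]
        viol = (margin + imposter - genuine) > 0               # samples violating the margin
        if not viol.any():
            break
        # Hinge-loss gradient with respect to the classifier weights.
        grad = (posteriors[:, viol, imposter_cls[viol]]
                - posteriors[:, viol, labels[viol]]).mean(axis=1)
        w = np.clip(w - lr * grad, 0.0, None)
        w /= w.sum() + 1e-12
    return w

# Toy usage: 3 classifiers, 200 samples, 4 classes; classifier 0 is made informative.
rng = np.random.default_rng(1)
labels = rng.integers(0, 4, size=200)
posteriors = rng.dirichlet(np.ones(4), size=(3, 200))
posteriors[0] = 0.5 * posteriors[0] + 0.5 * np.eye(4)[labels]   # boost classifier 0
print(learn_fusion_weights(posteriors, labels))                 # weight 0 should dominate
```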
Citations: 15
Compact correlation coding for visual object categorization
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126425
Nobuyuki Morioka, S. Satoh
Spatial relationships between local features are thought to play a vital role in representing object categories. However, learning a compact set of higher-order spatial features based on visual words, e.g., doublets and triplets, remains a challenging problem as possible combinations of visual words grow exponentially. While the local pairwise codebook achieves a compact codebook of pairs of spatially close local features without feature selection, its formulation is not scale invariant and is only suitable for densely sampled local features. In contrast, the proximity distribution kernel is a scale-invariant and robust representation capturing rich spatial proximity information between local features, but its representation grows quadratically in the number of visual words. Inspired by the two abovementioned techniques, this paper presents the compact correlation coding that combines the strengths of the two. Our method achieves a compact representation that is scale-invariant and robust against object deformation. In addition, we adopt sparse coding instead of k-means clustering during the codebook construction to increase the discriminative power of our method. We systematically evaluate our method against both the local pairwise codebook and proximity distribution kernel on several challenging object categorization datasets to show performance improvements.
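To make the dimensionality issue concrete, the sketch below builds a plain histogram over pairs of visual words whose keypoints fall within a fixed radius, the kind of pairwise spatial encoding whose size grows quadratically with the vocabulary and which the proposed compact correlation coding is designed to avoid. The keypoints, vocabulary, and radius are invented for illustration.

```python
import numpy as np
from itertools import combinations

def pairwise_proximity_histogram(points, words, vocab_size, radius):
    """Encode an image as a histogram over unordered pairs of visual words
    whose keypoints lie within `radius` of each other, a simple stand-in
    for spatial co-occurrence (doublet) features."""
    hist = np.zeros((vocab_size, vocab_size))
    for i, j in combinations(range(len(points)), 2):
        if np.linalg.norm(points[i] - points[j]) <= radius:
            a, b = sorted((words[i], words[j]))
            hist[a, b] += 1
    hist = hist[np.triu_indices(vocab_size)]        # keep the upper triangle as a vector
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy usage: 300 keypoints quantized into a 50-word vocabulary.
rng = np.random.default_rng(2)
pts = rng.uniform(0, 256, size=(300, 2))
ws = rng.integers(0, 50, size=300)
h = pairwise_proximity_histogram(pts, ws, vocab_size=50, radius=20.0)
print(h.shape, h.sum())   # the descriptor already has 1275 dimensions for only 50 words
```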
Citations: 21
Multiclass transfer learning from unconstrained priors
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126454
Jie Luo, T. Tommasi, B. Caputo
The vast majority of transfer learning methods proposed in the visual recognition domain in recent years address the problem of object category detection, assuming a strong control over the priors from which transfer is done. This is a strict condition, as it concretely limits the use of this type of approach in several settings: for instance, it does not, in general, allow using off-the-shelf models as priors. Moreover, the lack of a multiclass formulation for most of the existing transfer learning algorithms prevents using them for object categorization problems, where their use might be beneficial, especially when the number of categories grows and it becomes harder to get enough annotated data for training standard learning methods. This paper presents a multiclass transfer learning algorithm that makes it possible to take advantage of priors built over different features and with different learning methods than the one used for learning the new task. We use the priors as experts, and transfer their outputs to the new incoming samples as additional information. We cast the learning problem within the Multi Kernel Learning framework. The resulting formulation efficiently solves a joint optimization problem that determines from where and how much to transfer, with a principled multiclass formulation. Extensive experiments illustrate the value of this approach.
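A much-simplified sketch of "priors as experts" is shown below: the decision scores of models trained on earlier tasks are appended to the raw features of the new task before training a multiclass classifier. The paper instead combines prior outputs and features inside a Multi Kernel Learning problem; the toy tasks and the use of scikit-learn logistic regression here are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def make_task(rng, n, sep=2.0):
    # Toy 3-class data: class identity shifts the first three feature dimensions.
    y = rng.integers(0, 3, size=n)
    X = rng.normal(size=(n, 10))
    X[:, :3] += sep * np.eye(3)[y]
    return X, y

def augment_with_prior_experts(X, prior_models):
    """Append each prior model's per-class decision scores to the raw features,
    so the new learner can treat the priors as experts."""
    scores = [m.decision_function(X) for m in prior_models]
    return np.hstack([X] + scores)

rng = np.random.default_rng(3)

# Two prior tasks with plenty of data; their trained models become the experts.
priors = []
for _ in range(2):
    Xp, yp = make_task(rng, 500)
    priors.append(LogisticRegression(max_iter=1000).fit(Xp, yp))

# New target task with only a handful of labelled samples.
Xtr, ytr = make_task(rng, 30)
Xte, yte = make_task(rng, 300)

clf = LogisticRegression(max_iter=1000).fit(augment_with_prior_experts(Xtr, priors), ytr)
print("accuracy with prior experts:", clf.score(augment_with_prior_experts(Xte, priors), yte))
```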
Citations: 111
Are spatial and global constraints really necessary for segmentation?
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126219
Aurélien Lucchi, Yunpeng Li, X. Boix, Kevin Smith, P. Fua
Many state-of-the-art segmentation algorithms rely on Markov or Conditional Random Field models designed to enforce spatial and global consistency constraints. This is often accomplished by introducing additional latent variables to the model, which can greatly increase its complexity. As a result, estimating the model parameters or computing the best maximum a posteriori (MAP) assignment becomes a computationally expensive task. In a series of experiments on the PASCAL and the MSRC datasets, we were unable to find evidence of a significant performance increase attributed to the introduction of such constraints. On the contrary, we found that similar levels of performance can be achieved using a much simpler design that essentially ignores these constraints. This simpler approach makes use of the same local and global features to leverage evidence from the image, but instead directly biases the preferences of individual pixels. While our investigation does not prove that spatial and consistency constraints are not useful in principle, it points to the conclusion that they should be validated in a larger context.
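The sketch below mimics such a "simpler design" at toy scale: every pixel is classified independently from its own value, a local mean, and a crude global cue, with no CRF or latent variables. The features, synthetic images, and scikit-learn classifier are invented for illustration and are not the paper's feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pixel_features(img):
    """Per-pixel features: the pixel value, a 3x3 local mean, and the image's
    global mean copied to every pixel (a crude 'global context' cue)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    local = np.stack([padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]).mean(0)
    global_mean = np.full_like(img, img.mean())
    return np.stack([img, local, global_mean], axis=-1).reshape(-1, 3)

# Toy "images": a bright square (foreground) on a dark, noisy background.
rng = np.random.default_rng(4)
def make_image():
    img = rng.normal(0.2, 0.05, size=(32, 32))
    img[8:24, 8:24] += 0.5
    mask = np.zeros((32, 32), dtype=int)
    mask[8:24, 8:24] = 1
    return img, mask

train = [make_image() for _ in range(5)]
X = np.vstack([pixel_features(img) for img, _ in train])
y = np.concatenate([mask.ravel() for _, mask in train])

clf = LogisticRegression(max_iter=1000).fit(X, y)      # pixels classified independently, no CRF
img, mask = make_image()
pred = clf.predict(pixel_features(img)).reshape(32, 32)
print("pixel accuracy without any spatial model:", (pred == mask).mean())
```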
Citations: 72
Contour Code: Robust and efficient multispectral palmprint encoding for human recognition
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126463
Zohaib Khan, A. Mian, Yiqun Hu
We propose ‘Contour Code’, a novel representation and binary hash table encoding for multispectral palmprint recognition. We first present a reliable technique for the extraction of a region of interest (ROI) from palm images acquired with non-contact sensors. The Contour Code representation is then derived from the Nonsubsampled Contourlet Transform. A uniscale pyramidal filter is convolved with the ROI followed by the application of a directional filter bank. The dominant directional subband establishes the orientation at each pixel and the index corresponding to this subband is encoded in the Contour Code representation. Unlike existing representations which extract orientation features directly from the palm images, the Contour Code uses two-stage filtering to extract robust orientation features. The Contour Code is binarized into an efficient hash table structure that only requires indexing and summation operations for simultaneous one-to-many matching with an embedded score-level fusion of multiple bands. We quantitatively evaluate the accuracy of the ROI extraction by comparison with a manually produced ground truth. Multispectral palmprint verification results on the PolyU and CASIA databases show that the Contour Code achieves an EER reduction of up to 50%, compared to state-of-the-art methods.
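A rough analogue of orientation coding and binary matching is sketched below: a small bank of oriented kernels stands in for the non-subsampled contourlet transform, the strongest response per pixel is one-hot binarised, and matching reduces to counting coinciding bits. The filters, image sizes, and scores are assumptions, not the paper's pipeline.

```python
import numpy as np
from scipy.ndimage import convolve

def oriented_kernels(size=9, n_orients=6):
    # Simple oriented edge kernels (not the paper's non-subsampled contourlets).
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    kernels = []
    for k in range(n_orients):
        theta = k * np.pi / n_orients
        u = xs * np.cos(theta) + ys * np.sin(theta)
        v = -xs * np.sin(theta) + ys * np.cos(theta)
        g = u * np.exp(-(u ** 2 + v ** 2) / (2 * (half / 2) ** 2))   # odd, edge-sensitive
        kernels.append(g / np.abs(g).sum())
    return kernels

def orientation_code(img, kernels):
    """Per-pixel dominant-orientation index, one-hot binarised: a rough analogue
    of encoding the strongest directional subband at each pixel."""
    responses = np.stack([np.abs(convolve(img, k)) for k in kernels])
    dominant = responses.argmax(axis=0)                       # index of the strongest orientation
    return np.eye(len(kernels), dtype=np.uint8)[dominant]     # (H, W, n_orients) binary code

def match_score(code_a, code_b):
    # Matching reduces to counting pixels whose set bits coincide (summation only).
    return (code_a & code_b).sum() / code_a[..., 0].size

rng = np.random.default_rng(5)
palm_a = rng.normal(size=(64, 64))
palm_b = palm_a + 0.1 * rng.normal(size=(64, 64))   # noisy copy of the same "palm"
palm_c = rng.normal(size=(64, 64))                  # a different "palm"

ks = oriented_kernels()
print(match_score(orientation_code(palm_a, ks), orientation_code(palm_b, ks)),
      match_score(orientation_code(palm_a, ks), orientation_code(palm_c, ks)))
```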
Citations: 75
Delta-Dual Hierarchical Dirichlet Processes: A pragmatic abnormal behaviour detector
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126497
T. Haines, T. Xiang
In the security domain a key problem is identifying rare behaviours of interest. Training examples for these behaviours may or may not exist, and if they do exist there will be few examples, quite probably one. We present a novel weakly supervised algorithm that can detect behaviours that either have never before been seen or for which there are few examples. Global context is modelled, allowing the detection of abnormal behaviours that in isolation appear normal. Pragmatic aspects are considered, such that no parameter tuning is required and real time performance is achieved.
Citations: 24
The NBNN kernel
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126449
T. Tuytelaars, Mario Fritz, Kate Saenko, Trevor Darrell
Naive Bayes Nearest Neighbor (NBNN) has recently been proposed as a powerful, non-parametric approach for object classification, that manages to achieve remarkably good results thanks to the avoidance of a vector quantization step and the use of image-to-class comparisons, yielding good generalization. In this paper, we introduce a kernelized version of NBNN. This way, we can learn the classifier in a discriminative setting. Moreover, it then becomes straightforward to combine it with other kernels. In particular, we show that our NBNN kernel is complementary to standard bag-of-features based kernels, focussing on local generalization as opposed to global image composition. By combining them, we achieve state-of-the-art results on Caltech101 and 15 Scenes datasets. As a side contribution, we also investigate how to speed up the NBNN computations.
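For readers unfamiliar with the underlying NBNN distance, the sketch below computes the image-to-class distance as the sum, over query descriptors, of each descriptor's squared distance to its nearest neighbour in a class's descriptor pool; this distance vector is the quantity the paper turns into a kernel. The toy descriptor pools are assumptions.

```python
import numpy as np

def nbnn_class_distances(query_desc, class_descs):
    """Naive Bayes Nearest Neighbor: for every query descriptor, add its squared
    distance to the closest descriptor pooled from each class; the resulting
    image-to-class distance vector can then feed a classifier or kernel."""
    dists = []
    for descs in class_descs:                                    # one descriptor pool per class
        d2 = ((query_desc[:, None, :] - descs[None, :, :]) ** 2).sum(-1)
        dists.append(d2.min(axis=1).sum())
    return np.array(dists)

# Toy usage: 3 classes with class-dependent descriptor means.
rng = np.random.default_rng(6)
class_descs = [rng.normal(loc=c, scale=1.0, size=(400, 8)) for c in range(3)]
query = rng.normal(loc=1, scale=1.0, size=(50, 8))               # drawn like class 1
d = nbnn_class_distances(query, class_descs)
print(d.argmin())                                                # expected: 1
```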
Citations: 124
Large-scale image annotation using visual synset
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126295
David Tsai, Yushi Jing, Yi Liu, H. Rowley, Sergey Ioffe, James M. Rehg
We address the problem of large-scale annotation of web images. Our approach is based on the concept of visual synset, which is an organization of images that are visually similar and semantically related. Each visual synset represents a single prototypical visual concept, and has an associated set of weighted annotations. Linear SVMs are utilized to predict the visual synset membership for unseen image examples, and a weighted voting rule is used to construct a ranked list of predicted annotations from a set of visual synsets. We demonstrate that visual synsets lead to better performance than standard methods on a new annotation database containing more than 200 million images and 300 thousand annotations, which is the largest ever reported.
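A minimal sketch of synset-based annotation voting follows: each synset is scored by a linear classifier, and positive synsets cast votes for their weighted annotations, which are then ranked. The classifiers, weights, and vocabulary below are hypothetical, not the paper's trained models.

```python
import numpy as np

def rank_annotations(image_feat, synset_classifiers, synset_annotations):
    """Score each synset with its linear classifier (w, b), then let every
    positively scored synset cast votes for its weighted annotations; return
    the annotations ranked by total vote."""
    votes = {}
    for (w, b), annos in zip(synset_classifiers, synset_annotations):
        score = float(w @ image_feat + b)
        if score <= 0:                      # keep only synsets the image is predicted to belong to
            continue
        for word, weight in annos.items():
            votes[word] = votes.get(word, 0.0) + score * weight
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage: 3 hypothetical synsets over 5-D image features.
rng = np.random.default_rng(7)
classifiers = [(rng.normal(size=5), 0.0) for _ in range(3)]
annotations = [{"dog": 0.8, "puppy": 0.5},
               {"dog": 0.4, "labrador": 0.9},
               {"cat": 0.9, "kitten": 0.6}]
print(rank_annotations(rng.normal(size=5), classifiers, annotations))
```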
Citations: 66