
Latest Publications: 2014 IEEE Conference on Computer Vision and Pattern Recognition

Subspace Clustering for Sequential Data
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.134
Stephen Tierney, Junbin Gao, Yi Guo
We propose Ordered Subspace Clustering (OSC) to segment data drawn from a sequentially ordered union of subspaces. Current subspace clustering techniques learn the relationships within a set of data and then use a separate clustering algorithm such as NCut for final segmentation. In contrast, our technique, under certain conditions, is capable of segmenting clusters intrinsically, without the number of clusters being supplied as a parameter. Similar to Sparse Subspace Clustering (SSC), we formulate the problem as one of finding a sparse representation, but include a new penalty term to handle sequential data. We test our method on infrared hyperspectral data, video sequences and face images. Our experiments show that our method, OSC, outperforms the state-of-the-art methods Spatial Subspace Clustering (SpatSC), Low-Rank Representation (LRR) and SSC.
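To make the sequential penalty concrete, here is a minimal sketch that replaces the paper's non-smooth sparsity terms with a smooth Frobenius-norm surrogate, which admits a closed-form solution via a Sylvester equation. The regularisation weights and this smooth relaxation are illustrative assumptions, not the authors' solver; the resulting affinity would then be fed to NCut or another spectral clustering routine, as in the pipeline the abstract describes.

```python
# A minimal sketch of the sequential-penalty idea behind OSC, using a smooth
# Frobenius-norm surrogate of the paper's l1-type objective. Parameter names
# and values are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_sylvester

def osc_affinity_surrogate(X, lam_reg=0.1, lam_seq=1.0):
    """X: (d, n) data matrix whose columns are sequentially ordered samples.

    Solves  min_Z ||X - XZ||_F^2 + lam_reg ||Z||_F^2 + lam_seq ||Z D||_F^2,
    where D is the first-difference operator coupling consecutive columns of
    Z (the smooth stand-in for OSC's sequential penalty).
    """
    n = X.shape[1]
    # First-difference operator: (Z D)[:, i] = Z[:, i+1] - Z[:, i].
    D = np.zeros((n, n - 1))
    D[np.arange(n - 1), np.arange(n - 1)] = -1.0
    D[np.arange(1, n), np.arange(n - 1)] = 1.0
    G = X.T @ X
    # Setting the gradient to zero gives the Sylvester equation A Z + Z B = G.
    A = G + lam_reg * np.eye(n)
    B = lam_seq * (D @ D.T)
    Z = solve_sylvester(A, B, G)
    # Symmetric affinity for a subsequent spectral clustering / NCut step.
    return np.abs(Z) + np.abs(Z).T

# Toy usage: 40 ordered samples drawn from two sequential subspaces.
X = np.hstack([np.random.randn(30, 1) @ np.random.randn(1, 20) for _ in range(2)])
W = osc_affinity_surrogate(X)
```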
Citations: 102
Random Laplace Feature Maps for Semigroup Kernels on Histograms
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.129
Jiyan Yang, Vikas Sindhwani, Quanfu Fan, H. Avron, Michael W. Mahoney
With the goal of reducing the training and testing complexity of nonlinear kernel methods, several recent papers have proposed explicit embeddings of the input data into low-dimensional feature spaces, where fast linear methods can instead be used to generate approximate solutions. Analogous to random Fourier feature maps, which approximate shift-invariant kernels such as the Gaussian kernel on R^d, we develop a new randomized technique, called random Laplace features, to approximate a family of kernel functions adapted to the semigroup structure of R_+^d. This is the natural algebraic structure on the set of histograms and other non-negative data representations. We provide theoretical results on the uniform convergence of random Laplace features. Empirical analyses on image classification and surveillance event detection tasks demonstrate the attractiveness of using random Laplace features relative to several other feature maps proposed in the literature.
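As a rough illustration, the sketch below builds such a feature map for one concrete semigroup kernel: drawing each frequency coordinate-wise from an Exp(1) distribution gives E[exp(-w.(x+y))] = prod_i 1/(1 + x_i + y_i), so inner products of the features approximate that kernel. The sampling distribution, feature count and the specific kernel are assumptions for this one example, not the paper's general construction.

```python
# A hedged sketch of random Laplace features: a Monte-Carlo approximation of
# a semigroup kernel k(x, y) = k0(x + y) on non-negative data.
import numpy as np

def random_laplace_features(X, n_features=256, rng=None):
    """X: (n, d) array with non-negative rows (e.g. histograms)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.exponential(scale=1.0, size=(n_features, d))  # w_j ~ Exp(1)^d
    # phi(x)_j = exp(-w_j . x) / sqrt(s), so phi(x).phi(y) ~= k0(x + y).
    return np.exp(-X @ W.T) / np.sqrt(n_features)

# Quick check of the approximation on two random non-negative vectors:
rng = np.random.default_rng(0)
x, y = rng.random(8), rng.random(8)
phi = random_laplace_features(np.stack([x, y]), n_features=20000, rng=1)
approx = phi[0] @ phi[1]
exact = np.prod(1.0 / (1.0 + x + y))
print(approx, exact)  # the two values should agree to a few percent
```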
Citations: 48
Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.222
M. Oquab, L. Bottou, I. Laptev, Josef Sivic
Convolutional neural networks (CNNs) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations, as opposed to the hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents the application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with a limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representations for images in the PASCAL VOC dataset. We show that despite differences in image statistics and tasks between the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on the PASCAL VOC 2007 and 2012 datasets. We also show promising results for object and action localization.
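A minimal sketch of this transfer strategy in a modern framework (PyTorch): reuse ImageNet-pretrained layers frozen in place and train only a fresh classification head for the target dataset. The model choice, hyper-parameters, and the simple head replacement are assumptions for illustration; the authors' pipeline differs in detail (e.g. adaptation layers trained for the target task).

```python
# Sketch: reuse pretrained mid-level layers, train a new head for 20 VOC classes.
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(num_target_classes=20):
    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    for p in model.parameters():
        p.requires_grad = False          # freeze the pretrained representation
    # Replace the final layer with a freshly initialised, trainable head.
    in_features = model.classifier[-1].in_features
    model.classifier[-1] = nn.Linear(in_features, num_target_classes)
    return model

model = build_transfer_model()
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9)
```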
Citations: 3051
Hash-SVM: Scalable Kernel Machines for Large-Scale Visual Classification
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.130
Yadong Mu, G. Hua, Wei Fan, Shih-Fu Chang
This paper presents a novel algorithm which uses compact hash bits to greatly improve the efficiency of non-linear kernel SVMs in very large-scale visual classification problems. Our key idea is to represent each sample with compact hash bits, over which an inner product is defined to serve as the surrogate of the original nonlinear kernel. The problem of solving the nonlinear SVM can then be transformed into solving a linear SVM over the hash bits. The proposed Hash-SVM enjoys dramatic storage cost reduction owing to the compact binary representation, as well as (sub-)linear training complexity via the linear SVM. As a critical component of Hash-SVM, we propose a novel hashing scheme for arbitrary non-linear kernels via random subspace projection in reproducing kernel Hilbert space. Our comprehensive analysis reveals a well-behaved theoretical bound on the deviation between the proposed hashing-based kernel approximation and the original kernel function. We also derive requirements on the number of hash bits for achieving a satisfactory accuracy level. Several experiments on large-scale visual classification benchmarks are conducted, including one with over 1 million images. The results show that Hash-SVM greatly reduces the computational complexity (more than ten times faster in many cases) while maintaining comparable accuracy.
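The sketch below shows the overall recipe (binary codes whose inner products approximate a kernel, followed by a fast linear SVM), but with plain sign random projections, which approximate the angular kernel, standing in for the paper's RKHS subspace-projection scheme. All names, the bit count and the toy data are illustrative assumptions.

```python
# Hedged sketch: kernel surrogate via binary codes, then a linear SVM on them.
import numpy as np
from sklearn.svm import LinearSVC

def hash_bits(X, n_bits=128, rng=None):
    """Map rows of X to scaled sign codes; inner products of the codes
    approximate 1 - 2*theta/pi, where theta is the angle between samples."""
    rng = np.random.default_rng(rng)
    R = rng.standard_normal((X.shape[1], n_bits))
    return np.sign(X @ R) / np.sqrt(n_bits)

# Toy usage: the linear SVM now operates on the compact hash representation.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = (X[:, 0] > 0).astype(int)           # a toy, angle-dependent label
clf = LinearSVC().fit(hash_bits(X, rng=1), y)
print(clf.score(hash_bits(X, rng=1), y))
```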
Citations: 56
Analysis by Synthesis: 3D Object Recognition by Object Reconstruction
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.314
Mohsen Hejrati, Deva Ramanan
We introduce a new approach for recognizing and reconstructing 3D objects in images. Our approach is based on an analysis-by-synthesis strategy. A forward synthesis model constructs possible geometric interpretations of the world, and then selects the interpretation that best agrees with the measured visual evidence. The forward model synthesizes visual templates defined on invariant (HOG) features. These visual templates are discriminatively trained to be accurate for inverse estimation. We introduce an efficient "brute-force" approach to inference that searches through a large number of candidate reconstructions, returning the optimal one. One benefit of such an approach is that recognition is inherently (re)constructive. We show state-of-the-art performance for detection and reconstruction on two challenging 3D object recognition datasets of cars and cuboids.
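The core inference loop is easy to state in code. Below is a schematic sketch of the brute-force search, where render_template (the learned forward model) and the candidate set are hypothetical stand-ins, and scoring by a dot product against HOG evidence is an assumed template-matching convention.

```python
# Schematic analysis-by-synthesis loop: synthesise a template per candidate
# 3D interpretation, score it against the measured features, keep the best.
import numpy as np

def best_reconstruction(image_hog, candidates, render_template):
    """image_hog: feature vector; candidates: iterable of 3D hypotheses;
    render_template: hypothetical forward model, candidate -> template."""
    best, best_score = None, -np.inf
    for cand in candidates:
        template = render_template(cand)     # synthesis step
        score = float(template @ image_hog)  # agreement with the evidence
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

# Toy usage with random stand-ins for the features and forward model:
rng = np.random.default_rng(0)
feats = rng.standard_normal(100)
cands = [rng.standard_normal(100) for _ in range(50)]
print(best_reconstruction(feats, cands, render_template=lambda c: c)[1])
```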
Citations: 59
Full-Angle Quaternions for Robustly Matching Vectors of 3D Rotations
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.21
Stephan Liwicki, Minh-Tri Pham, S. Zafeiriou, M. Pantic, B. Stenger
In this paper we introduce a new distance for robustly matching vectors of 3D rotations. A special representation of 3D rotations, which we coin full-angle quaternion (FAQ), allows us to express this distance as Euclidean. We apply the distance to the problems of 3D shape recognition from point clouds and 2D object tracking in color video. For the former, we introduce a hashing scheme for scale and translation which outperforms the previous state-of-the-art approach on a public dataset. For the latter, we incorporate online subspace learning with the proposed FAQ representation to highlight the benefits of the new representation.
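A small sketch of the representation as described: encoding a rotation of angle theta about unit axis n as the 4-vector (cos theta, sin theta * n), using the full angle rather than the half angle of a standard unit quaternion, so that vectors of rotations can be compared with ordinary Euclidean distance. This is a hedged reading of the construction; the paper's matching and hashing pipeline is omitted.

```python
# Sketch of the full-angle quaternion (FAQ) encoding and its Euclidean distance.
import numpy as np

def faq(axis, angle):
    """Encode a rotation as (cos(angle), sin(angle) * unit_axis)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle)], np.sin(angle) * axis))

def faq_distance(r1, r2):
    """Euclidean distance between the FAQ encodings of two (axis, angle) pairs."""
    return np.linalg.norm(faq(*r1) - faq(*r2))

# Example: the distance grows smoothly with the angular offset about one axis.
print(faq_distance(([0, 0, 1], 0.1), ([0, 0, 1], 0.4)))
```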
Citations: 3
Very Fast Solution to the PnP Problem with Algebraic Outlier Rejection
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.71
Luis Ferraz, Xavier Binefa, F. Moreno-Noguer
We propose a real-time, accurate solution to the Perspective-n-Point (PnP) problem that is robust to outliers. The main advantages of our solution are twofold: first, it integrates the outlier rejection within the pose estimation pipeline with a negligible computational overhead, and second, it scales to an arbitrarily large number of correspondences. Given a set of 3D-to-2D matches, we formulate the pose estimation problem as a low-rank homogeneous system whose solution lies in its 1D null space. Outlier correspondences are those rows of the linear system which perturb the null space, and they are progressively detected by projecting them on an iteratively estimated solution of the null space. Since our outlier removal process is based on an algebraic criterion which does not require computing the full pose and reprojecting all 3D points back onto the image plane at each step, we achieve speed gains of more than 100× compared to RANSAC strategies. An extensive experimental evaluation shows that our solution yields accurate results in situations with up to 50% of outliers, and can process more than 1000 correspondences in less than 5ms.
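A hedged sketch of the algebraic rejection loop described above: estimate the 1D null space by SVD, score every row of the system by its algebraic residual, and re-estimate on the best-agreeing rows. The keep-a-fixed-fraction rule and the iteration count are illustrative assumptions, not the paper's criterion.

```python
# Iterative null-space estimation with algebraic outlier rejection.
import numpy as np

def nullspace_with_rejection(M, n_iters=5, keep_ratio=0.8):
    """M: (m, k) stacked constraint rows, one or more per correspondence.
    Returns the null-space estimate v (with M v ~ 0) and the inlier rows."""
    rows = np.arange(M.shape[0])
    for _ in range(n_iters):
        # Right singular vector of the smallest singular value ~ null space.
        v = np.linalg.svd(M[rows], full_matrices=False)[2][-1]
        residuals = np.abs(M @ v)           # algebraic error of every row
        # Keep the rows that agree best with the current estimate, but keep
        # at least k rows so the system stays determined.
        n_keep = max(int(keep_ratio * len(rows)), M.shape[1])
        rows = np.argsort(residuals)[:n_keep]
    return v, rows
```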
Citations: 158
A Multigraph Representation for Improved Unsupervised/Semi-supervised Learning of Human Actions
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.110
Simon Jones, Ling Shao
Graph-based methods are a useful class of techniques for improving the performance of unsupervised and semi-supervised machine learning tasks, such as clustering or information retrieval. However, the performance of existing graph-based methods is highly dependent on how well the affinity graph reflects the original data structure. We propose that multimedia such as images or videos consist of multiple separate components, and that more than one graph is therefore required to fully capture the relationships between them. Accordingly, we present a new spectral method, the Feature Grouped Spectral Multigraph (FGSM), which comprises the following steps. First, mutually independent subsets of the original feature space are generated through feature clustering. Secondly, a separate graph is generated from each feature subset. Finally, a spectral embedding is calculated on each graph, and the embeddings are scaled and aggregated into a single representation. Using this representation, a variety of experiments are performed on three learning tasks (clustering, retrieval and recognition) on human action datasets, demonstrating considerably better performance than the state of the art.
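The three steps map directly onto standard components. The sketch below is one plausible instantiation, assuming k-means over feature columns for the grouping, kNN affinity graphs, and scikit-learn's SpectralEmbedding; these component choices are assumptions, not the paper's exact formulation.

```python
# One possible FGSM-style pipeline: group features, embed each group's graph,
# then aggregate the scaled embeddings into a single representation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import SpectralEmbedding
from sklearn.neighbors import kneighbors_graph

def fgsm_embedding(X, n_groups=3, n_components=4, n_neighbors=10, seed=0):
    """X: (n_samples, n_features). Returns the aggregated representation."""
    # Step 1: group feature dimensions by clustering the transposed matrix.
    groups = KMeans(n_clusters=n_groups, random_state=seed).fit_predict(X.T)
    parts = []
    for g in range(n_groups):
        Xg = X[:, groups == g]
        # Step 2: a separate kNN affinity graph per feature subset.
        A = kneighbors_graph(Xg, n_neighbors=n_neighbors, include_self=False)
        A = 0.5 * (A + A.T)                 # symmetrise the graph
        # Step 3: spectral embedding of this graph, scaled before aggregation.
        emb = SpectralEmbedding(n_components=n_components,
                                affinity="precomputed").fit_transform(A.toarray())
        parts.append(emb / (np.linalg.norm(emb) + 1e-12))
    return np.hstack(parts)                 # single aggregated representation
```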
Citations: 34
L0 Norm Based Dictionary Learning by Proximal Methods with Global Convergence
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.493
Chenglong Bao, Hui Ji, Yuhui Quan, Zuowei Shen
Sparse coding and dictionary learning have seen applications in many vision tasks, and are usually formulated as a non-convex optimization problem. Many iterative methods have been proposed to tackle such an optimization problem. However, it remains an open problem to have a method that is not only practically fast but also globally convergent. In this paper, we propose a fast proximal method for solving ℓ0-norm-based dictionary learning problems, and we prove that the whole sequence generated by the proposed method converges to a stationary point at a sub-linear convergence rate. The benefit of having a fast and convergent dictionary learning method is demonstrated in image recovery and face recognition applications.
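The key computational ingredient is that the proximal operator of the ℓ0 penalty is hard thresholding. The sketch below alternates proximal gradient steps on the codes with projected gradient steps on unit-norm dictionary atoms; the step sizes, penalty weight and this particular alternation schedule are illustrative assumptions, and the sketch does not carry the paper's convergence guarantees.

```python
# Proximal alternating sketch for  min_{D,C} 0.5||X - DC||_F^2 + tau ||C||_0.
import numpy as np

def hard_threshold(C, t):
    """Prox of t*||.||_0: zero out entries with |c| <= sqrt(2*t)."""
    out = C.copy()
    out[np.abs(out) <= np.sqrt(2.0 * t)] = 0.0
    return out

def l0_dictionary_learning(X, n_atoms=32, tau=0.05, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    C = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iters):
        # Proximal gradient step on the codes (1/L step, L = spectral norm).
        L = np.linalg.norm(D.T @ D, 2) + 1e-12
        C = hard_threshold(C - (D.T @ (D @ C - X)) / L, tau / L)
        # Projected gradient step on the dictionary (unit-norm atoms).
        L = np.linalg.norm(C @ C.T, 2) + 1e-12
        D -= ((D @ C - X) @ C.T) / L
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)
    return D, C
```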
Citations: 88
Saliency Optimization from Robust Background Detection
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.360
Wangjiang Zhu, Shuang Liang, Yichen Wei, Jian Sun
Recent progress in salient object detection has exploited the boundary prior, or background information, to assist other saliency cues such as contrast, achieving state-of-the-art results. However, existing usage of the boundary prior is very simple and fragile, and its integration with other cues is mostly heuristic. In this work, we present new methods to address these issues. First, we propose a robust background measure, called boundary connectivity. It characterizes the spatial layout of image regions with respect to image boundaries and is much more robust. It has an intuitive geometrical interpretation and presents unique benefits that are absent in previous saliency measures. Second, we propose a principled optimization framework to integrate multiple low-level cues, including our background measure, to obtain clean and uniform saliency maps. Our formulation is intuitive, efficient and achieves state-of-the-art results on several benchmark datasets.
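A small sketch of the boundary-connectivity measure under a simplifying assumption: treating regions as pixel sets in a label map and scoring each region by the count of its pixels on the image border divided by the square root of its area. Large values indicate heavy attachment to the boundary, i.e. likely background. The paper computes the measure on superpixels with geodesic distances, so this only captures the geometric intuition.

```python
# Pixel-level simplification of the boundary-connectivity background measure.
import numpy as np

def boundary_connectivity(labels):
    """labels: (H, W) integer region map -> {region: border_len / sqrt(area)}."""
    border = np.concatenate([labels[0], labels[-1],
                             labels[:, 0], labels[:, -1]])  # corners counted twice
    scores = {}
    for r in np.unique(labels):
        area = np.count_nonzero(labels == r)
        on_border = np.count_nonzero(border == r)
        scores[r] = on_border / np.sqrt(area)
    return scores

# Toy usage: region 0 touches the border everywhere, region 1 is interior,
# so region 0 gets a high (background-like) score and region 1 gets zero.
lab = np.zeros((10, 10), dtype=int)
lab[3:7, 3:7] = 1
print(boundary_connectivity(lab))
```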
Citations: 1240