
2014 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Interval Tracker: Tracking by Interval Analysis
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.447
Junseok Kwon, Kyoung Mu Lee
This paper proposes a robust tracking method that uses interval analysis. Any single posterior model necessarily includes a modeling uncertainty (error), and thus the posterior should be represented as an interval of probability. The objective of visual tracking then becomes to find the best state that simultaneously maximizes the posterior and minimizes its interval. By minimizing the interval of the posterior, our method can reduce the modeling uncertainty in the posterior. In this paper, the aforementioned objective is achieved by using the M4 estimation, which combines the Maximum a Posteriori (MAP) estimation with Minimum Mean-Square Error (MMSE), Maximum Likelihood (ML), and Minimum Interval Length (MIL) estimations. In the M4 estimation, our method maximizes the posterior over the state obtained by the MMSE estimation. The method also minimizes the interval of the posterior by reducing the gap between its lower and upper bounds. The gap is reduced when the likelihood is maximized by the ML estimation and the interval length of the state is minimized by the MIL estimation. The experimental results demonstrate that M4 estimation can be easily integrated into conventional tracking methods and can greatly enhance their tracking accuracy. On several challenging datasets, our method outperforms state-of-the-art tracking methods.
Citations: 11
Human Action Recognition across Datasets by Foreground-Weighted Histogram Decomposition
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.103
Waqas Sultani, Imran Saleemi
This paper addresses the problem of recognizing human actions when training and testing are performed on distinct datasets and test videos are neither labeled nor available during training. In this scenario, learning a joint vocabulary or applying domain-transfer techniques is not possible. We first explore reasons for poor classifier performance when testing on novel datasets, and quantify the effect of scene backgrounds on action representations and recognition. Using only the background features and a partitioning of the gist feature space, we show that the background scenes in recent datasets are quite discriminative and can be used to classify an action with reasonable accuracy. We then propose a new process to obtain, for each pixel of the video, a measure of confidence that it belongs to a foreground region, using motion, appearance, and saliency together in a 3D MRF-based framework. We also propose multiple ways to exploit the foreground confidence: to improve the bag-of-words vocabulary, the histogram representation of a video, and a novel histogram-decomposition-based representation and kernel. We use these foreground confidences to recognize actions with classifiers trained on one dataset and tested on a different one. Extensive experiments on several datasets show improved cross-dataset recognition accuracy compared to baseline methods.
Citations: 65
Weighted Nuclear Norm Minimization with Application to Image Denoising
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.366
Shuhang Gu, Lei Zhang, W. Zuo, Xiangchu Feng
As a convex relaxation of the low-rank matrix factorization problem, nuclear norm minimization has attracted significant research interest in recent years. The standard nuclear norm minimization regularizes each singular value equally to preserve the convexity of the objective function. However, this greatly restricts its capability and flexibility in dealing with many practical problems (e.g., denoising), where the singular values have clear physical meanings and should be treated differently. In this paper we study the weighted nuclear norm minimization (WNNM) problem, where the singular values are assigned different weights. The solutions of the WNNM problem are analyzed under different weighting conditions. We then apply the proposed WNNM algorithm to image denoising by exploiting the image's nonlocal self-similarity. Experimental results clearly show that the proposed WNNM algorithm outperforms many state-of-the-art denoising algorithms, such as BM3D, in terms of both quantitative measures and visual perception quality.
Citations: 1707
Bayesian Active Contours with Affine-Invariant, Elastic Shape Prior
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.47
Darshan W. Bryner, Anuj Srivastava
Active contours, especially in conjunction with prior-shape models, have become an important tool in image segmentation. However, most contour methods use shape priors based on similarity-shape analysis, i.e., analysis that is invariant to rotation, translation, and scale. In practice, the training shapes used for prior-shape models may be collected from viewing angles different from those of the test images, and thus require invariance to a larger class of transformations. Using an elastic, affine-invariant shape model of planar curves, we propose an active contour algorithm in which the training and test shapes can be related by arbitrary affine transformations, and the resulting segmentation is robust to perspective skews. We construct a shape space of affine-standardized curves and derive a statistical model for capturing class-specific shape variability. The active contour is then driven by the true gradient of a total energy composed of a data term, a smoothing term, and an affine-invariant shape-prior term. This framework is demonstrated on a number of examples involving the segmentation of occluded or noisy images of targets subject to perspective skew.
Citations: 10
Separation of Line Drawings Based on Split Faces for 3D Object Reconstruction
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.94
C. Zou, Heng Yang, Jianzhuang Liu
Reconstructing 3D objects from single line drawings is often desirable in computer vision and graphics applications. If the line drawing of a complex 3D object is decomposed into primitives of simple shape, the object can be easily reconstructed. We propose an effective method to perform this line drawing separation and turn a complex line drawing into parametric 3D models. This is achieved by recursively separating the line drawing using two types of split faces. Our experiments show that the proposed separation method can generate more basic and simple line drawings, and that its combination with example-based reconstruction can robustly recover a wider range of complex parametric 3D objects than previous methods.
Citations: 11
Quasi Real-Time Summarization for Consumer Videos
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.322
Bin Zhao, E. Xing
With the widespread availability of video cameras, we are facing an ever-growing, enormous collection of unedited and unstructured video data. Due to the lack of an automatic way to generate summaries from this large collection of consumer videos, indexing and searching them can be tedious and time-consuming. In this work, we propose online video highlighting, a principled way of generating a short video that summarizes the most important and interesting content of an unedited, unstructured video, which would be costly in both time and money to process manually. Specifically, our method learns a dictionary from the given video using group sparse coding, and updates the atoms in the dictionary on the fly. A summary video is then generated by combining the segments that cannot be sparsely reconstructed using the learned dictionary. The online fashion of our proposed method enables it to process arbitrarily long videos and to start generating summaries before seeing the end of the video. Moreover, the processing time required by our method is close to the original video length, achieving quasi real-time summarization speed. Theoretical analysis, together with experimental results on more than 12 hours of surveillance and YouTube videos, demonstrates the effectiveness of online video highlighting.
Citations: 235
Attributed Graph Mining and Matching: An Attempt to Define and Extract Soft Attributed Patterns
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.181
Quanshi Zhang, Xuan Song, Xiaowei Shao, Huijing Zhao, R. Shibasaki
Graph matching and graph mining are two typical areas in artificial intelligence. In this paper, we define the soft attributed pattern (SAP) to describe the common subgraph pattern among a set of attributed relational graphs (ARGs), considering both the graphical structure and the graph attributes. We propose a direct solution to extract the SAP with the maximal graph size without node enumeration. Given an initial graph template and a number of ARGs, we modify the graph template into the maximal SAP among the ARGs in an unsupervised fashion. The maximal SAP extraction is equivalent to learning a graphical model (i.e., an object model) from large ARGs (i.e., cluttered RGB/RGB-D images) for graph matching, which extends the concept of "unsupervised learning for graph matching." Furthermore, this study can also be regarded as the first known approach to formulating "maximal graph mining" in the graph domain of ARGs. Our method exhibits superior performance on RGB and RGB-D images.
Citations: 19
Semi-supervised Spectral Clustering for Image Set Classification
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.23
A. Mahmood, A. Mian, R. Owens
We present an image set classification algorithm based on unsupervised clustering of labeled training and unlabeled test data, where labels are used only in the stopping criterion. The probability distribution of each class over the set of clusters is used to define a true set-based similarity measure. To this end, we propose an iterative sparse spectral clustering algorithm. In each iteration, a proximity matrix is efficiently recomputed to better represent the local subspace structure. Initial clusters capture the global data structure, and finer clusters at later stages capture the subtle class differences not visible at the global scale. Image sets are compactly represented with multiple Grassmannian manifolds, which are subsequently embedded in Euclidean space with the proposed spectral clustering algorithm. We also propose an efficient eigenvector solver that not only reduces the computational cost of spectral clustering many-fold but also improves the clustering quality and the final classification results. Experiments on five standard datasets and comparisons with seven existing techniques show the efficacy of our algorithm.
Citations: 53
Real-Time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.301
Mao Ye, Ruigang Yang
In this paper we present a novel real-time algorithm for simultaneous pose and shape estimation for articulated objects, such as human beings and animals. The key to our pose estimation component is to embed an articulated deformation model with exponential-maps-based parametrization into a Gaussian Mixture Model. Benefiting from the probabilistic measurement model, our algorithm requires no explicit point correspondences, as opposed to most existing methods. Consequently, our approach is less sensitive to local minima and handles fast and complex motions well. Extensive evaluations on publicly available datasets demonstrate that our method outperforms most state-of-the-art pose estimation algorithms by a large margin, especially in the case of challenging motions. Moreover, our novel shape adaptation algorithm, based on the same probabilistic model, automatically captures the shape of the subjects during the dynamic pose estimation process. Experiments show that our shape estimation method achieves accuracy comparable to the state of the art, yet requires neither a parametric model nor an extra calibration procedure.
Citations: 45
Efficient Computation of Relative Pose for Multi-camera Systems
Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.64
L. Kneip, Hongdong Li
We present a novel solution for computing the relative pose of a generalized camera. Existing solutions are either not general, have excessive computational complexity, or require too many correspondences, which impedes efficient or accurate usage within RANSAC schemes. We factorize the problem as a low-dimensional, iterative optimization over the relative rotation only, derived directly from well-known epipolar constraints. Common generalized cameras often consist of camera clusters and give rise to omni-directional landmark observations. We prove that our iterative scheme performs well in such practically relevant situations, eventually resulting in computational efficiency similar to linear solvers and accuracy close to bundle adjustment, while using fewer correspondences. Experiments on both virtual and real multi-camera systems demonstrate superior overall performance for robust, real-time multi-camera motion estimation.
Citations: 60