
Latest Publications from the 2014 IEEE Conference on Computer Vision and Pattern Recognition

Congruency-Based Reranking
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.270
Itai Ben-Shalom, Noga Levy, Lior Wolf, N. Dershowitz, Adiel Ben-Shalom, Roni Shweka, Y. Choueka, Tamir Hazan, Yaniv Bar
We present a tool for re-ranking the results of a specific query by considering the (n+1) × (n+1) matrix of pairwise similarities among the elements of the set of n retrieved results and the query itself. The re-ranking thus makes use of the similarities between the various results and does not employ additional sources of information. The tool is based on graphical Bayesian models, which reinforce retrieved items strongly linked to other retrievals, and on repeated clustering to measure the stability of the obtained associations. The utility of the tool is demonstrated within the context of visual search of documents from the Cairo Genizah and for retrieval of paintings by the same artist and in the same style.
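As a rough illustration of the core idea (a minimal sketch, not the paper's Bayesian graphical models or repeated-clustering stability test), the code below builds the (n+1) × (n+1) cosine-similarity matrix over the query and its n results and re-ranks each result by how strongly it agrees with the rest of the retrieved set; all features are synthetic placeholders.

```python
import numpy as np

def congruency_rerank(query_feat, result_feats):
    """Re-rank retrieved results by how strongly each agrees with the
    rest of the retrieved set (a simplified congruency score)."""
    feats = np.vstack([query_feat, result_feats])           # (n+1, d)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)   # unit norm -> cosine
    sim = feats @ feats.T                                   # (n+1) x (n+1)
    np.fill_diagonal(sim, 0.0)                              # ignore self-similarity
    scores = sim[1:].mean(axis=1)                           # one score per result
    return np.argsort(-scores)                              # indices, best first

# Toy run: 5 random 128-d descriptors retrieved for a random query.
rng = np.random.default_rng(0)
print(congruency_rerank(rng.normal(size=128), rng.normal(size=(5, 128))))
```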
Citations: 4
Efficient Pruning LMI Conditions for Branch-and-Prune Rank and Chirality-Constrained Estimation of the Dual Absolute Quadric
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.70
A. Habed, D. Paudel, C. Demonceaux, D. Fofi
We present a new globally optimal algorithm for self-calibrating a moving camera with constant parameters. Our method aims at estimating the Dual Absolute Quadric (DAQ) under the rank-3 constraint and, optionally, chirality constraints on the camera centers. We employ the Branch-and-Prune paradigm and explore a space of only 5 parameters. Pruning in our method relies on solving Linear Matrix Inequality (LMI) feasibility and Generalized Eigenvalue (GEV) problems that depend solely upon the entries of the DAQ. These LMI and GEV problems are used to rule out branches of the search tree that provably contain no quadric satisfying the rank and chirality conditions on the camera centers. The chirality LMI conditions rely on the mild assumption that the camera rotates by no more than 90° between consecutive views. Furthermore, our method does not rely on calculating bounds on any particular cost function and hence can optimize virtually any objective while achieving global optimality in a very competitive running time.
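The search strategy is easy to picture in isolation. Below is a minimal, generic branch-and-prune skeleton over a 5-parameter box; the `feasible` callback is a stand-in for the paper's LMI/GEV pruning tests (not reproduced here), and the toy run simply keeps boxes that could contain the origin.

```python
import numpy as np

def branch_and_prune(box, feasible, width_tol=1e-3):
    """Generic branch-and-prune over an axis-aligned parameter box.

    box      -- (5, 2) array of [low, high] bounds (a 5-parameter space)
    feasible -- returns False only when a box provably contains no solution;
                in the paper this test is an LMI/GEV feasibility problem
    Returns the small boxes that survive pruning."""
    stack, leaves = [np.asarray(box, dtype=float)], []
    while stack:
        b = stack.pop()
        if not feasible(b):              # prune the whole branch
            continue
        widths = b[:, 1] - b[:, 0]
        if widths.max() < width_tol:     # small enough: keep as a candidate
            leaves.append(b)
            continue
        i = int(np.argmax(widths))       # bisect the widest dimension
        mid = b[i].mean()
        left, right = b.copy(), b.copy()
        left[i, 1] = right[i, 0] = mid
        stack += [left, right]
    return leaves

# Toy pruning test: keep only boxes that could contain the origin.
contains_origin = lambda b: bool(np.all((b[:, 0] <= 0) & (b[:, 1] >= 0)))
print(len(branch_and_prune([[-1, 1]] * 5, contains_origin, width_tol=0.1)))
```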
Citations: 11
Blind Image Quality Assessment Using Semi-supervised Rectifier Networks
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.368
Huixuan Tang, Neel Joshi, Ashish Kapoor
It is often desirable to evaluate image quality with a perceptually relevant measure that does not require a reference image. Recent approaches to this problem use human-provided quality scores with machine learning to learn a measure. The biggest hurdles to these efforts are: 1) the difficulty of generalizing across diverse types of distortions and 2) collecting the enormous amount of human-scored training data needed to learn the measure. We present a new blind image quality measure that addresses these difficulties by learning a robust, nonlinear kernel regression function using a rectifier neural network. The method is pre-trained with unlabeled data and fine-tuned with labeled data. It generalizes across a large set of images and distortion types without the need for a large amount of labeled data. We evaluate our approach on two benchmark datasets and show that it not only outperforms the current state of the art in blind image quality estimation, but also outperforms the state of the art in non-blind measures. Furthermore, we show that our semi-supervised approach is robust to using varying amounts of labeled data.
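A toy NumPy sketch of the two-stage recipe on synthetic data: a one-hidden-layer rectifier network is first pre-trained as an autoencoder on unlabeled feature vectors, then fine-tuned to regress quality scores on a small labeled set. The dimensions, learning rates, and random "scores" below are placeholders, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)
d, h = 32, 16                                    # toy feature / hidden sizes
W1, b1 = rng.normal(0, 0.1, (d, h)), np.zeros(h)

# Stage 1: unsupervised pre-training as a ReLU autoencoder on unlabeled data.
X_u = rng.normal(size=(2000, d))
W_dec = rng.normal(0, 0.1, (h, d))
for _ in range(200):
    H = relu(X_u @ W1 + b1)
    G = (H @ W_dec - X_u) / len(X_u)             # grad of MSE w.r.t. reconstruction
    W_dec -= 0.1 * (H.T @ G)
    dH = (G @ W_dec.T) * (H > 0)                 # backprop through the rectifier
    W1 -= 0.1 * (X_u.T @ dH)
    b1 -= 0.1 * dH.sum(axis=0)

# Stage 2: supervised fine-tuning on a small set of labeled quality scores.
X_l = rng.normal(size=(100, d))
y = rng.uniform(0, 1, 100)                       # stand-in for human scores
w2, b2 = np.zeros(h), 0.0
for _ in range(1000):
    H = relu(X_l @ W1 + b1)
    err = (H @ w2 + b2 - y) / len(y)             # grad of MSE w.r.t. prediction
    w2 -= 0.1 * (H.T @ err)
    b2 -= 0.1 * err.sum()
    dH = np.outer(err, w2) * (H > 0)
    W1 -= 0.01 * (X_l.T @ dH)
    b1 -= 0.01 * dH.sum(axis=0)

print("train MSE:", np.mean((relu(X_l @ W1 + b1) @ w2 + b2 - y) ** 2))
```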
Citations: 106
Unsupervised One-Class Learning for Automatic Outlier Removal
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.483
W. Liu, G. Hua, John R. Smith
Outliers are pervasive in many computer vision and pattern recognition problems. Automatically eliminating outliers scattered among practical data collections is increasingly important, especially for Internet-inspired vision applications. In this paper, we propose a novel one-class learning approach that is robust to contamination of the input training data and able to discover the outliers that corrupt one class of data source. Our approach works in a fully unsupervised manner, differing from traditional one-class learning supervised by known positive labels. By design, our approach optimizes a kernel-based max-margin objective that jointly learns a large-margin one-class classifier and a soft label assignment for inliers and outliers. An alternating optimization algorithm is then designed to iteratively refine the classifier and the labeling, achieving a provably convergent solution in only a few iterations. Extensive experiments conducted on four image datasets in the presence of artificial and real-world outliers demonstrate that the proposed approach is considerably superior to the state of the art in removing outliers from a contaminated class of images, exhibiting strong robustness at outlier proportions as high as 60%.
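A simplified stand-in for this alternating scheme, using scikit-learn's OneClassSVM in place of the paper's joint max-margin objective: fit the classifier under soft instance weights, re-estimate the weights from its decision scores, and repeat.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def robust_one_class(X, n_iters=5, nu=0.3):
    """Alternate between fitting a one-class SVM and softly down-weighting
    the points it scores as outliers (a simplified sketch, not the paper's
    provably convergent joint optimization)."""
    w = np.ones(len(X))
    clf = None
    for _ in range(n_iters):
        clf = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
        clf.fit(X, sample_weight=w)
        s = clf.decision_function(X)            # > 0 inlier-ish, < 0 outlier-ish
        w = 1.0 / (1.0 + np.exp(-5.0 * s))      # soft label in (0, 1)
    return clf, w

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (180, 2)),      # one clean class ...
               rng.uniform(-6, 6, (20, 2))])    # ... contaminated by outliers
clf, w = robust_one_class(X)
print("mean soft label: inliers %.2f, outliers %.2f" % (w[:180].mean(), w[180:].mean()))
```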
Citations: 100
Simplex-Based 3D Spatio-temporal Feature Description for Action Recognition
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.265
Hao Zhang, Wenjun Zhou, Christopher M. Reardon, L. Parker
We present a novel feature description algorithm to describe 3D local spatio-temporal features for human action recognition. Our descriptor avoids the singularity and limited discrimination power issues of traditional 3D descriptors by quantizing and describing visual features in the simplex topological vector space. Specifically, given a feature's support region containing a set of 3D visual cues, we decompose the cues' orientation into three angles, transform the decomposed angles into the simplex space, and describe them in such a space. Then, quadrant decomposition is performed to improve discrimination, and a final feature vector is composed from the resulting histograms. We develop intuitive visualization tools for analyzing feature characteristics in the simplex topological vector space. Experimental results demonstrate that our novel simplex-based orientation decomposition (SOD) descriptor substantially outperforms traditional 3D descriptors for the KTH, UCF Sport, and Hollywood-2 benchmark action datasets. In addition, the results show that our SOD descriptor is a superior individual descriptor for action recognition.
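The sketch below illustrates the first steps under one plausible reading (our simplification, not the paper's exact construction): measure each spatio-temporal gradient's angle to the x, y, and t axes, normalize the three angles so they sum to one (a point on the 2-simplex), and histogram the simplex coordinates into a descriptor.

```python
import numpy as np

def simplex_orientation_histogram(volume, n_bins=8):
    """Toy orientation-decomposition descriptor for a 3D (x, y, t) volume.
    Illustrative only: the paper's quantization and quadrant decomposition
    are not reproduced here."""
    gx, gy, gt = np.gradient(volume.astype(float))        # spatio-temporal gradients
    g = np.stack([gx, gy, gt], axis=-1).reshape(-1, 3)
    g = g[np.linalg.norm(g, axis=1) > 1e-6]               # drop flat regions
    n = g / np.linalg.norm(g, axis=1, keepdims=True)
    angles = np.arccos(np.clip(np.abs(n), 0.0, 1.0))      # angle to each axis
    simplex = angles / angles.sum(axis=1, keepdims=True)  # point on the 2-simplex
    hist, _, _ = np.histogram2d(simplex[:, 0], simplex[:, 1],
                                bins=n_bins, range=[[0, 1], [0, 1]])
    return (hist / hist.sum()).ravel()                    # normalized descriptor

print(simplex_orientation_histogram(np.random.rand(16, 16, 8)).shape)  # (64,)
```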
Citations: 43
Reconstructing Evolving Tree Structures in Time Lapse Sequences
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.388
Przemyslaw Glowacki, M. Pinheiro, Engin Türetken, R. Sznitman, Daniel Lebrecht, J. Kybic, A. Holtmaat, P. Fua
We propose an approach to reconstructing tree structures that evolve over time in 2D images and 3D image stacks such as neuronal axons or plant branches. Instead of reconstructing structures in each image independently, we do so for all images simultaneously to take advantage of temporal-consistency constraints. We show that this problem can be formulated as a Quadratic Mixed Integer Program and solved efficiently. The outcome of our approach is a framework that provides substantial improvements in reconstructions over traditional single time-instance formulations. Furthermore, an added benefit of our approach is the ability to automatically detect places where significant changes have occurred over time, which is challenging when considering large amounts of data.
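The flavor of the joint formulation can be seen on a toy instance: one binary variable per candidate edge per frame, unary image-evidence terms, and a quadratic bonus for keeping the same edge in consecutive frames. Real instances are solved as a Quadratic Mixed Integer Program; the toy below (with made-up scores) just enumerates all assignments.

```python
import itertools
import numpy as np

T, E, lam = 3, 4, 0.5                        # frames, candidate edges, coupling
rng = np.random.default_rng(1)
evidence = rng.normal(size=(T, E))           # stand-in for per-edge image scores

best_score, best_x = -np.inf, None
for bits in itertools.product([0, 1], repeat=T * E):
    x = np.array(bits).reshape(T, E)         # x[t, e] = 1: keep edge e in frame t
    score = (evidence * x).sum() + lam * (x[:-1] * x[1:]).sum()  # unary + temporal
    if score > best_score:
        best_score, best_x = score, x
print(best_x)                                # jointly consistent selection
```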
Citations: 10
Three Guidelines of Online Learning for Large-Scale Visual Recognition
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.457
Y. Ushiku, Masatoshi Hidaka, T. Harada
In this paper, we evaluate online learning algorithms for large-scale visual recognition using state-of-the-art features that are preselected and held fixed. Today, combinations of high-dimensional features and linear classifiers are widely used for large-scale visual recognition. Numerous so-called mid-level features have been developed and compared experimentally. Although various learning methods for linear classification have also been proposed in the machine learning and natural language processing literature, they have rarely been evaluated for visual recognition. We therefore give guidelines via an investigation of state-of-the-art online learning methods for linear classifiers. Many of these methods have previously been evaluated only on toy data and natural language processing problems such as document classification, so we give them a unified interpretation from the viewpoint of visual recognition. The results of controlled comparisons indicate three guidelines that might change the pipeline for visual recognition.
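For concreteness, here is one representative online learner of the kind such a study compares: a Passive-Aggressive (PA-I) linear classifier that visits each example once. The data are synthetic; the paper's actual candidate algorithms and features are not reproduced here.

```python
import numpy as np

def passive_aggressive(stream, d, C=1.0):
    """Online PA-I for binary linear classification: update only when the
    hinge loss on the current example is positive."""
    w = np.zeros(d)
    for x, y in stream:                          # y in {-1, +1}
        loss = max(0.0, 1.0 - y * w.dot(x))
        if loss > 0.0:
            tau = min(C, loss / x.dot(x))        # PA-I step size
            w += tau * y * x
    return w

rng = np.random.default_rng(0)
w_true = rng.normal(size=50)
X = rng.normal(size=(5000, 50))
y = np.sign(X @ w_true)
w = passive_aggressive(zip(X, y), d=50)
print("one-pass training accuracy:", np.mean(np.sign(X @ w) == y))
```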
Citations: 7
Joint Motion Segmentation and Background Estimation in Dynamic Scenes
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.54
Adeel Mumtaz, Weichen Zhang, Antoni B. Chan
We propose a joint foreground-background mixture model (FBM) that simultaneously performs background estimation and motion segmentation in complex dynamic scenes. Our FBM consists of a set of location-specific dynamic texture (DT) components, for modeling local background motion, and a set of global DT components, for modeling consistent foreground motion. We derive an EM algorithm for estimating the parameters of the FBM. We also apply spatial constraints to the FBM using a Markov random field grid, and derive a corresponding variational approximation for inference. Unlike existing approaches to background subtraction, our FBM does not require a manually selected threshold or a separate training video. Unlike existing motion segmentation techniques, our FBM can segment foreground motions over complex backgrounds with mixed motions, and detect stopped objects. Since most dynamic scene datasets contain only videos with a single foreground object over a simple background, we develop a new challenging dataset with multiple foreground objects over complex dynamic backgrounds. In experiments, we show that jointly modeling the background and foreground segments with the FBM yields significant improvements in accuracy on both background estimation and motion segmentation, compared to state-of-the-art methods.
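A heavily simplified EM sketch of the joint idea, with scalar Gaussians standing in for the dynamic-texture components: each pixel location gets its own background component, a single global component models foreground, and the E and M steps alternate between soft foreground assignment and parameter re-estimation. All data below are synthetic.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
T, P = 200, 50                                   # frames, pixel locations
bg = rng.uniform(0, 1, P)                        # per-location background level
X = bg + 0.05 * rng.normal(size=(T, P))
X[:, :10] += 0.8 * (rng.random((T, 10)) < 0.5)   # foreground motion at 10 pixels

mu_bg, mu_fg, pi_fg = X.mean(axis=0), X.max(), 0.1
for _ in range(20):
    # E-step: posterior that each (frame, pixel) observation is foreground.
    l_bg = (1 - pi_fg) * norm.pdf(X, mu_bg, 0.1)
    l_fg = pi_fg * norm.pdf(X, mu_fg, 0.1)
    r = l_fg / (l_fg + l_bg + 1e-12)
    # M-step: re-estimate both components from the soft assignment.
    mu_bg = ((1 - r) * X).sum(axis=0) / (1 - r).sum(axis=0)
    mu_fg = (r * X).sum() / max(r.sum(), 1e-12)
    pi_fg = r.mean()

print("per-pixel foreground probability:", r.mean(axis=0).round(2)[:12])
```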
Citations: 29
Transfer Joint Matching for Unsupervised Domain Adaptation
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.183
Mingsheng Long, Jianmin Wang, Guiguang Ding, Jiaguang Sun, Philip S. Yu
Visual domain adaptation, which learns an accurate classifier for a new domain using labeled images from an old domain, has shown promising value in computer vision yet remains a challenging problem. Most prior works have explored two learning strategies independently for domain adaptation: feature matching and instance reweighting. In this paper, we show that both strategies are important and inevitable when the domain difference is substantially large. We therefore put forward a novel Transfer Joint Matching (TJM) approach to model them in a unified optimization problem. Specifically, TJM aims to reduce the domain difference by jointly matching the features and reweighting the instances across domains in a principled dimensionality reduction procedure, constructing a new feature representation that is invariant to both the distribution difference and the irrelevant instances. Comprehensive experimental results verify that TJM can significantly outperform competitive methods on cross-domain image recognition problems.
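A sketch of the feature-matching half only (the paper couples it with an L2,1 instance-reweighting term and alternates between the two): minimize the Maximum Mean Discrepancy between projected source and target data subject to a variance-preserving constraint, which reduces to a generalized eigenproblem.

```python
import numpy as np
from scipy.linalg import eigh

def feature_matching(Xs, Xt, k=10, mu=0.1):
    """Learn a projection A minimizing tr(A^T X M X^T A) subject to
    A^T X H X^T A = I (MMD feature matching; TJM's reweighting omitted)."""
    X = np.vstack([Xs, Xt]).T                      # (d, ns + nt)
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    e = np.r_[np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)]
    M = np.outer(e, e)                             # MMD coefficient matrix
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    lhs = X @ M @ X.T + mu * np.eye(len(X))        # regularized MMD term
    rhs = X @ H @ X.T + 1e-6 * np.eye(len(X))      # scatter (constrained to I)
    _, vecs = eigh(lhs, rhs)                       # generalized eigenproblem
    A = vecs[:, :k]                                # k smallest eigenvectors
    return Xs @ A, Xt @ A

rng = np.random.default_rng(0)
Zs, Zt = feature_matching(rng.normal(0.0, 1, (100, 40)),
                          rng.normal(0.5, 1, (80, 40)))
print(Zs.shape, Zt.shape)                          # (100, 10) (80, 10)
```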
Citations: 620
Posebits for Monocular Human Pose Estimation
Pub Date: 2014-06-23 DOI: 10.1109/CVPR.2014.300
Gerard Pons-Moll, David J. Fleet, B. Rosenhahn
We advocate the inference of qualitative information about 3D human pose, called posebits, from images. Posebits represent Boolean geometric relationships between body parts (e.g., left-leg in front of right-leg, or hands close to each other). The advantages of posebits as a mid-level representation are: 1) for many tasks of interest, such qualitative pose information may be sufficient (e.g., semantic image retrieval); 2) it is relatively easy to annotate large image corpora with posebits, as doing so simply requires answers to yes/no questions; and 3) they help resolve challenging pose ambiguities and therefore facilitate the difficult task of image-based 3D pose estimation. We introduce posebits, a posebit database, a method for selecting useful posebits for pose estimation, and a structural SVM model for posebit inference. Experiments show the use of posebits for semantic image retrieval and for improving 3D pose estimation.
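Posebits are cheap to evaluate once 3D joints are known. The sketch below computes three example bits from joint positions; the conventions (camera looking along +z so "in front of" means smaller z, y pointing up, a 0.3-unit "close" threshold) are illustrative assumptions, not the paper's inventory.

```python
import numpy as np

def posebits(joints):
    """Compute a few illustrative Boolean posebits from 3D joint positions.
    `joints` maps joint names to (x, y, z) coordinates."""
    j = {k: np.asarray(v, dtype=float) for k, v in joints.items()}
    return {
        "left_leg_in_front_of_right": bool(j["l_ankle"][2] < j["r_ankle"][2]),
        "hands_close": bool(np.linalg.norm(j["l_hand"] - j["r_hand"]) < 0.3),
        "left_hand_above_head": bool(j["l_hand"][1] > j["head"][1]),
    }

print(posebits({"l_ankle": (0.0, 0.0, 2.0), "r_ankle": (0.2, 0.0, 2.4),
                "l_hand": (0.3, 1.5, 2.1), "r_hand": (0.4, 1.5, 2.2),
                "head": (0.2, 1.7, 2.2)}))
```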
Citations: 74