
Latest Publications: 2013 IEEE International Conference on Computer Vision

Coherent Object Detection with 3D Geometric Context from a Single Image
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.320
Jiyan Pan, T. Kanade
Objects in a real-world image cannot have arbitrary appearances, sizes, and locations, owing to geometric constraints in 3D space. Such 3D geometric context plays an important role in resolving visual ambiguities and achieving coherent object detection. In this paper, we develop a RANSAC-CRF framework to detect objects that are geometrically coherent in the 3D world. Different from existing methods, we propose a novel generalized RANSAC algorithm to generate global 3D geometry hypotheses from local entities, such that outlier suppression and noise reduction are achieved simultaneously. In addition, we evaluate those hypotheses using a CRF that considers both the compatibility of individual objects under the global 3D geometric context and the compatibility between adjacent objects under the local 3D geometric context. Experimental results show that our approach compares favorably with the state of the art.
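The hypothesize-and-verify loop at the core of RANSAC is easy to illustrate. The sketch below is not the authors' generalized variant; it is a plain RANSAC line fit in NumPy, showing how minimal-sample hypotheses plus inlier counting suppress outliers — the mechanism the paper extends to global 3D geometry hypotheses generated from local entities.

```python
import numpy as np

def ransac_line(points, n_iters=500, inlier_tol=0.05, seed=None):
    """Plain RANSAC line fit: hypothesize from minimal samples,
    verify by counting inliers, keep the best hypothesis."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, np.zeros(len(points), bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        norm = np.linalg.norm(d)
        if norm < 1e-12:
            continue
        n = np.array([-d[1], d[0]]) / norm      # unit normal of the candidate line
        dist = np.abs((points - p) @ n)         # point-to-line distances
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (p, n), inliers
    return best_model, best_inliers

# noisy points on y = 0.5x + 1, plus 20 uniform outliers
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 80)
pts = np.c_[x, 0.5 * x + 1 + rng.normal(0, 0.02, 80)]
pts = np.vstack([pts, rng.uniform(0, 10, (20, 2))])
model, inliers = ransac_line(pts, seed=0)
print("inliers kept:", inliers.sum(), "of", len(pts))
```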
Pages: 2576-2583
Citations: 12
Robust Tucker Tensor Decomposition for Effective Image Representation
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.304
Miao Zhang, C. Ding
Many tensor-based algorithms have been proposed for the study of high-dimensional data in a wide variety of computer vision and machine learning applications. However, most existing tensor analysis approaches are based on the Frobenius norm, which makes them sensitive to outliers: because they minimize the sum of squared errors, they enlarge the influence of both outliers and large feature noise. In this paper, we propose a robust Tucker tensor decomposition model (RTD), which uses an L1-norm loss function to suppress the influence of outliers. However, optimizing L1-norm-based tensor analysis is much harder than standard tensor decomposition, so we propose a simple and efficient algorithm to solve our RTD model. Moreover, tensor-factorization-based image storage needs much less space than PCA-based methods. We carry out extensive experiments to evaluate the proposed algorithm and verify its robustness against image occlusions. Both numerical and visual results show that our RTD model consistently outperforms previous tensor and PCA methods in the presence of outliers.
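The effect of swapping the Frobenius norm for an L1 loss can be seen even in the matrix case. The sketch below is a hypothetical illustration, not the paper's RTD solver: a rank-1 factorization fitted under an approximate L1 loss via iteratively reweighted least squares (IRLS), where large residuals are downweighted so a gross outlier stops dominating the fit.

```python
import numpy as np

def l1_rank1(X, n_iters=50, eps=1e-3):
    """Rank-1 approximation under an (approximate) L1 loss via IRLS:
    each squared residual is weighted by 1/|residual|, so outliers
    contribute roughly their absolute value instead of its square."""
    # plain least-squares power iterations for a stable starting point
    u = X[:, np.argmax(np.abs(X).sum(axis=0))].copy()
    for _ in range(5):
        v = X.T @ u / (u @ u)
        u = X @ v / (v @ v)
    for _ in range(n_iters):
        W = 1.0 / np.maximum(np.abs(X - np.outer(u, v)), eps)  # IRLS weights
        # weighted least-squares updates of the two factors
        u = (W * X) @ v / np.maximum((W * v**2).sum(axis=1), eps)
        v = (W * X).T @ u / np.maximum((W.T * u**2).sum(axis=1), eps)
    return u, v

rng = np.random.default_rng(1)
u0, v0 = rng.normal(size=30), rng.normal(size=20)
X = np.outer(u0, v0) + 0.01 * rng.normal(size=(30, 20))
X[0, 0] += 50.0                      # a single gross outlier
u, v = l1_rank1(X)
print("mean abs error (outlier cell excluded):",
      np.abs(X - np.outer(u, v))[1:, 1:].mean())
```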
Pages: 2448-2455
Citations: 16
Visual Reranking through Weakly Supervised Multi-graph Learning
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.323
Cheng Deng, R. Ji, W. Liu, D. Tao, Xinbo Gao
Visual reranking has been widely deployed to refine the quality of conventional content-based image retrieval engines. The current trend is to employ a crowd of retrieved results stemming from multiple feature modalities to boost the overall performance of visual reranking. However, a major challenge for current reranking methods is how to take full advantage of the complementary properties of distinct feature modalities. Given a query image and one feature modality, a regular visual reranking framework treats the top-ranked images as pseudo-positive instances, which are inevitably noisy; this makes it difficult to exploit the complementary property and thus leads to inferior ranking performance. This paper proposes a novel image reranking approach by introducing a Co-Regularized Multi-Graph Learning (Co-RMGL) framework, in which intra-graph and inter-graph constraints are simultaneously imposed to encode affinities within a single graph and consistency across different graphs. Moreover, weakly supervised learning driven by image attributes is performed to denoise the pseudo-labeled instances, thereby highlighting the unique strength of each individual feature modality. Meanwhile, such learning can yield a few anchors in the graphs that crucially enable the alignment and fusion of multiple graphs. As a result, an edge weight matrix learned from the fused graph automatically gives the ordering of the initially retrieved results. We evaluate our approach on four benchmark image retrieval datasets, demonstrating a significant performance gain over the state of the art.
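A minimal sketch of the multi-graph intuition follows; it is not the authors' Co-RMGL optimization. It builds one Gaussian affinity graph per feature modality, averages them into a fused graph, and reranks by propagating pseudo-relevance from the initial top-ranked items. The `alpha`, `top_k`, and kernel choices are illustrative assumptions.

```python
import numpy as np

def rerank_multigraph(feature_sets, top_k=5, alpha=0.85, n_iters=30):
    """Fuse one affinity graph per modality and rerank by spreading
    pseudo-relevance scores from the initial top-k over the fused graph."""
    n = feature_sets[0].shape[0]
    fused = np.zeros((n, n))
    for F in feature_sets:
        d = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d / (d.mean() + 1e-12))     # Gaussian affinity for this modality
        np.fill_diagonal(W, 0)
        fused += W / len(feature_sets)
    D = fused.sum(1)
    S = fused / np.sqrt(np.outer(D, D) + 1e-12)  # symmetric normalization
    y = np.zeros(n); y[:top_k] = 1.0             # pseudo-positives: initial top-k
    f = y.copy()
    for _ in range(n_iters):                     # manifold-ranking iteration
        f = alpha * S @ f + (1 - alpha) * y
    return np.argsort(-f)                        # reranked ordering

rng = np.random.default_rng(2)
color, edge = rng.normal(size=(50, 8)), rng.normal(size=(50, 4))
print(rerank_multigraph([color, edge])[:10])
```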
Pages: 2600-2607
Citations: 79
Decomposing Bag of Words Histograms
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.45
Ankit Gandhi, Alahari Karteek, C. V. Jawahar
We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. This task is formulated as an optimization problem, given a set of linear classifiers that can effectively discriminate the object categories present in the image. Our decomposition bypasses the harder problems of accurately localizing and segmenting objects. We evaluate our method on a wide variety of composite histograms and also compare it with MRF-based solutions. Beyond measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for image classification on the PASCAL VOC 2007 dataset.
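As a toy stand-in for this formulation (which drives the decomposition with linear classifiers rather than known prototypes), the snippet below decomposes a global histogram into a nonnegative combination of hypothetical per-category histograms using SciPy's nonnegative least squares.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical per-category "prototype" histograms (columns) and a global
# image histogram assumed to mix two of them; nnls recovers nonnegative
# mixing weights, a simplified stand-in for the classifier-driven
# decomposition described in the abstract.
rng = np.random.default_rng(3)
protos = rng.dirichlet(np.ones(64), size=5).T       # 64 bins x 5 categories
h_global = 0.7 * protos[:, 1] + 0.3 * protos[:, 4]  # composite histogram
weights, resid = nnls(protos, h_global)
print(np.round(weights, 3), "residual:", float(resid))
```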
Pages: 305-312
Citations: 6
Drosophila Embryo Stage Annotation Using Label Propagation
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.139
T. Kazmar, E. Kvon, A. Stark, Christoph H. Lampert
In this work we propose a system for the automatic classification of Drosophila embryos into developmental stages. While the system is designed to solve an actual problem in biological research, we believe that the principle underlying it is interesting not only for biologists but also for researchers in computer vision. The main idea is to combine two orthogonal sources of information: one is a classifier trained on strongly invariant features, which makes it applicable to images captured under very different conditions but also leads to rather noisy predictions. The other is a label propagation step based on a more powerful similarity measure that, however, is only consistent within specific subsets of the data at a time. In our biological setup, the information sources are the shape and the staining patterns of embryo images. We show experimentally that while neither method by itself achieves satisfactory results, their combination achieves prediction quality comparable to human performance.
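The label propagation ingredient can be sketched with the standard normalized-graph formulation (Zhou et al. style); the Gaussian similarity and parameters below are illustrative assumptions, not the paper's shape/staining measure.

```python
import numpy as np

def propagate_labels(X, labels, alpha=0.9, sigma=1.0):
    """Standard label propagation: spread known labels over a
    similarity graph and read off the highest-scoring class."""
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0)
    Dm = 1.0 / np.sqrt(W.sum(1) + 1e-12)
    S = Dm[:, None] * W * Dm[None, :]          # symmetrically normalized affinity
    n_class = labels.max() + 1
    Y = np.zeros((len(X), n_class))
    known = labels >= 0                        # -1 marks unlabeled points
    Y[known, labels[known]] = 1.0
    # closed form of the propagation fixed point: F = (1-a)(I - aS)^{-1} Y
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, (1 - alpha) * Y)
    return F.argmax(1)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, .3, (30, 2)), rng.normal(2, .3, (30, 2))])
labels = -np.ones(60, int); labels[0], labels[30] = 0, 1  # one seed per class
print(propagate_labels(X, labels))
```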
Pages: 1089-1096
Citations: 5
Domain Adaptive Classification
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.324
Fatemeh Mirrashed, Mohammad Rastegari
We propose an unsupervised domain adaptation method that exploits intrinsic compact structures of categories across different domains using binary attributes. Our method directly optimizes for classification in the target domain. The key insight is finding attributes that are discriminative across categories and predictable across domains. We achieve a performance that significantly exceeds the state-of-the-art results on standard benchmarks. In fact, in many cases, our method reaches the same-domain performance, the upper bound, in unsupervised domain adaptation scenarios.
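A toy sketch of the encode-then-classify pipeline follows. The hyperplanes here are random, whereas the paper learns attributes that are discriminative across categories and predictable across domains; the sketch only shows how binary attribute codes would be used to classify a shifted target domain.

```python
import numpy as np

def binary_attributes(X, hyperplanes):
    """Encode data as binary attributes via hyperplane sign tests
    (random hyperplanes here; the paper learns them)."""
    return (X @ hyperplanes > 0).astype(int)

rng = np.random.default_rng(5)
# source domain: two classes; target domain: same classes, shifted features
src = np.vstack([rng.normal(0, 1, (40, 16)), rng.normal(3, 1, (40, 16))])
src_y = np.r_[np.zeros(40, int), np.ones(40, int)]
tgt = np.vstack([rng.normal(0.5, 1, (20, 16)), rng.normal(3.5, 1, (20, 16))])
H = rng.normal(size=(16, 32))                      # 32 hypothetical attributes
code_src, code_tgt = binary_attributes(src, H), binary_attributes(tgt, H)
# nearest attribute-centroid classification in the shared binary space
centroids = np.array([code_src[src_y == c].mean(0) for c in (0, 1)])
pred = np.argmin(((code_tgt[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
print("target accuracy:", (pred == np.r_[np.zeros(20), np.ones(20)]).mean())
```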
Pages: 2608-2615
Citations: 18
Robust Dictionary Learning by Error Source Decomposition
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.276
Zhuoyuan Chen, Ying Wu
Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases on clean data. In practice, however, both training and testing data may be corrupted and contain noise and outliers. Although recent studies have attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase remains a very difficult problem. In contrast to most existing methods, which learn the dictionary from clean data, this paper targets handling corruptions and outliers in the training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noise and a sparse component for large outliers. In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, which updates only part of the prototypes (the informative code words) while keeping the remaining (noisy) code words fixed. Experiments on synthetic data as well as real applications show satisfactory performance of this new robust dictionary learning approach.
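The two-component split has a simple closed form once the reconstruction is held fixed: penalizing the sparse part with an L1 norm and the dense part quadratically reduces to soft-thresholding the residual. The snippet below is a hedged one-step sketch under that assumption, not the full alternating algorithm.

```python
import numpy as np

def split_residual(R, lam=0.5):
    """Split a reconstruction residual R into a sparse outlier part S
    and a dense small-noise part G = R - S. The soft-threshold is the
    closed-form minimizer of 0.5*||R - S||_F^2 + lam*||S||_1."""
    S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return R - S, S

rng = np.random.default_rng(6)
R = 0.05 * rng.normal(size=(8, 8))     # small universal noise
R[2, 3] += 5.0; R[6, 1] -= 4.0         # two large outliers
G, S = split_residual(R)
print("outliers found at:", list(zip(*np.nonzero(S))))
```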
Pages: 2216-2223
Citations: 12
Tracking via Robust Multi-task Multi-view Joint Sparse Representation
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.86
Zhibin Hong, Xue Mei, D. Prokhorov, D. Tao
Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit cues from multiple views, including various types of visual features such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework, where every view of each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture frequently emerging outlier tasks, we decompose the representation matrix into two collaborative components, which enables a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the accelerated proximal gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.
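The accelerated proximal gradient building block can be shown on the basic L1-regularized least-squares (sparse coding) subproblem; the paper applies the same scheme to its joint multi-task multi-view objective. A FISTA-style sketch:

```python
import numpy as np

def apg_lasso(D, y, lam=0.1, n_iters=200):
    """Accelerated proximal gradient for
        min_x 0.5*||D x - y||^2 + lam*||x||_1:
    gradient step at a momentum point, then a closed-form prox
    (soft-threshold), then a Nesterov momentum update."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1]); z = x.copy(); t = 1.0
    for _ in range(n_iters):
        g = z - (D.T @ (D @ z - y)) / L                             # gradient step
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)   # prox step
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)                 # momentum
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(7)
D = rng.normal(size=(40, 100))
x_true = np.zeros(100); x_true[[3, 50, 77]] = (1.5, -2.0, 1.0)
y = D @ x_true + 0.01 * rng.normal(size=40)
x = apg_lasso(D, y)
print("support recovered:", np.nonzero(np.abs(x) > 0.1)[0])
```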
Pages: 649-656
Citations: 158
Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.241
R. Timofte, V. Smet, L. Gool
Recently there have been significant advances in image upscaling, or image super-resolution, based on a dictionary of low- and high-resolution exemplars. The running time of these methods is often ignored, despite being a critical factor for real applications. This paper proposes fast super-resolution methods that make no compromise on quality. First, we support the use of sparse learned dictionaries in combination with neighbor embedding methods. In this case, the nearest neighbors are computed using correlation with the dictionary atoms rather than Euclidean distance. Moreover, we show that most current approaches reach top performance with the right parameters. Second, we show that using global collaborative coding has considerable speed advantages, reducing the super-resolution mapping to a precomputed projective matrix. Third, we propose anchored neighborhood regression: the neighborhood embedding of a low-resolution patch is anchored to the nearest atom in the dictionary, and the corresponding embedding matrix is precomputed. These proposals are contrasted with current state-of-the-art methods on standard images. We obtain similar or improved quality and one to two orders of magnitude speed improvements.
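The precomputed projection at the heart of anchored neighborhood regression is a ridge regression over each atom's neighborhood, P = N_h (N_l^T N_l + lam*I)^{-1} N_l^T. The sketch below follows that formula on random stand-in dictionaries; the patch sizes, neighborhood size, and lam are illustrative assumptions.

```python
import numpy as np

def anchored_projections(D_lr, D_hr, n_neighbors=8, lam=0.1):
    """For each low-resolution atom, gather its nearest atoms
    (by correlation) into a neighborhood (N_l, N_h) and precompute
    the ridge-regression map P = N_h (N_l^T N_l + lam I)^{-1} N_l^T,
    so super-resolving a patch reduces to one matrix multiply."""
    sims = D_lr.T @ D_lr                   # correlation between LR atoms
    projs = []
    for i in range(D_lr.shape[1]):
        nbr = np.argsort(-sims[i])[:n_neighbors]
        Nl, Nh = D_lr[:, nbr], D_hr[:, nbr]
        P = Nh @ np.linalg.solve(Nl.T @ Nl + lam * np.eye(len(nbr)), Nl.T)
        projs.append(P)
    return projs

rng = np.random.default_rng(8)
D_lr = rng.normal(size=(9, 64)); D_lr /= np.linalg.norm(D_lr, axis=0)  # 3x3 LR atoms
D_hr = rng.normal(size=(36, 64))                                       # 6x6 HR atoms
projs = anchored_projections(D_lr, D_hr)

patch_lr = rng.normal(size=9)
anchor = int(np.argmax(D_lr.T @ patch_lr))   # nearest atom by correlation
patch_hr = projs[anchor] @ patch_lr          # one multiply at test time
print(patch_hr.shape)
```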
Pages: 1920-1927
Citations: 1191
Multi-attributed Dictionary Learning for Sparse Coding
Pub Date: 2013-12-01 DOI: 10.1109/ICCV.2013.145
Chen-Kuo Chiang, Te-Feng Su, Chih Yen, S. Lai
We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed that jointly incorporates data and attribute similarities. An objective function is then presented to learn category-dependent dictionaries that are compact (dictionary atoms are close in terms of data distance and attribute similarity), reconstructive (low reconstruction error with the correct dictionary), and label-consistent (the labels of dictionary atoms are encouraged to be similar). We demonstrate our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results showing improved performance over previous dictionary learning methods validate the effectiveness of the proposed algorithm.
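A hedged sketch of the joint distance ingredient: mix normalized pairwise data distances and attribute distances with a hypothetical trade-off weight `beta`. This is only the distance-matrix component, not the full dictionary learning objective.

```python
import numpy as np

def joint_distance(X, A, beta=0.5):
    """Convex mix of pairwise data distance and attribute distance
    (beta is a hypothetical trade-off); the kind of matrix used to
    keep atoms with similar data *and* similar attributes close."""
    d_x = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    d_a = ((A[:, None] - A[None, :]) ** 2).sum(-1)
    d_x = d_x / (d_x.max() + 1e-12)    # scale both terms to a comparable range
    d_a = d_a / (d_a.max() + 1e-12)
    return beta * d_x + (1 - beta) * d_a

rng = np.random.default_rng(9)
X = rng.normal(size=(10, 5))           # data features
A = rng.integers(0, 2, size=(10, 3))   # binary attributes per sample
print(joint_distance(X, A).round(2))
```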
Pages: 1137-1144
Citations: 8