
Latest publications: 2011 International Conference on Computer Vision

Spatio-temporal clustering of probabilistic region trajectories
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126438
Fabio Galasso, M. Iwasaki, K. Nobori, R. Cipolla
We propose a novel model for the spatio-temporal clustering of trajectories based on motion, which applies to challenging street-view video sequences of pedestrians captured by a mobile camera. A key contribution of our work is the introduction of novel probabilistic region trajectories, motivated by the non-repeatability of frame segmentation in a video sequence. Hierarchical image segments are obtained with a state-of-the-art hierarchical segmentation algorithm and connected across adjacent frames in a directed acyclic graph. The region trajectories and their confidence measures are extracted from this graph by a dynamic-programming-based optimisation. Our second main contribution is a Bayesian framework with a twofold goal: to learn the optimal (in a maximum-likelihood sense) Random Forests classifier of motion patterns based on video features, and to construct a unique graph from region trajectories of different frames, lengths and hierarchical levels. Finally, we demonstrate the use of Isomap for effective spatio-temporal clustering of the region trajectories of pedestrians. We support our claims with experimental results on new and existing challenging video sequences.
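To make the final stage concrete, the following minimal Python sketch clusters per-trajectory motion descriptors with Isomap followed by k-means. The synthetic descriptors, neighbourhood size and cluster count are illustrative assumptions; the paper derives its inputs from probabilistic region trajectories, which this sketch does not reproduce.

```python
# A minimal sketch, assuming per-trajectory motion descriptors are available.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical per-trajectory motion descriptors (e.g. mean flow, variance).
n_trajectories, n_features = 200, 16
features = rng.normal(size=(n_trajectories, n_features))

# Non-linear embedding: Isomap preserves the geodesic structure of the
# trajectory manifold before clustering.
embedding = Isomap(n_neighbors=10, n_components=3).fit_transform(features)

# Cluster the embedded trajectories into motion groups (e.g. pedestrians
# moving together); the number of clusters is an assumed parameter.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)
print(labels[:20])
```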
Citations: 19
Density-aware person detection and tracking in crowds
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126526
Mikel D. Rodriguez, I. Laptev, Josef Sivic, Jean-Yves Audibert
We address the problem of person detection and tracking in crowded video scenes. While the detection of individual objects has improved significantly in recent years, crowd scenes remain particularly challenging for detection and tracking tasks due to heavy occlusions, high person densities and significant variation in people's appearance. To address these challenges, we propose to leverage information on the global structure of the scene and to resolve all detections jointly. In particular, we explore constraints imposed by the crowd density and formulate person detection as the optimization of a joint energy function combining crowd density estimation and the localization of individual people. We demonstrate how the optimization of such an energy function significantly improves person detection and tracking in crowds. We validate our approach on a challenging video dataset of crowded scenes.
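The joint-energy idea can be illustrated with a toy sketch: candidate detections are selected greedily so that the per-cell counts track an estimated density map. The grid, the penalty weight and the candidate scores are assumptions, not the paper's exact energy or optimiser.

```python
# A toy sketch of a density-aware joint energy, under assumed parameters.
import numpy as np

rng = np.random.default_rng(1)
grid = 4                                          # image split into grid x grid cells
density = rng.uniform(0, 3, size=(grid, grid))    # estimated persons per cell

# Candidate detections: (cell_row, cell_col, score).
cands = [(rng.integers(grid), rng.integers(grid), rng.uniform()) for _ in range(60)]

def energy(selected, lam=1.0):
    """Detection confidence minus squared mismatch with the density map."""
    counts = np.zeros((grid, grid))
    score = 0.0
    for r, c, s in selected:
        counts[r, c] += 1
        score += s
    return score - lam * np.sum((counts - density) ** 2)

# Greedy optimisation: keep adding whichever detection improves the energy.
selected, remaining = [], sorted(cands, key=lambda d: -d[2])
improved = True
while improved:
    improved = False
    for det in list(remaining):
        if energy(selected + [det]) > energy(selected):
            selected.append(det)
            remaining.remove(det)
            improved = True
print(f"kept {len(selected)} of {len(cands)} candidate detections")
```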
Citations: 355
An adaptive coupled-layer visual model for robust visual tracking
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126390
Luka Cehovin, M. Kristan, A. Leonardis
This paper addresses the problem of tracking objects that undergo rapid and significant appearance changes. We propose a novel coupled-layer visual model that combines the target's global and local appearance. The local layer in this model is a set of local patches that geometrically constrain the changes in the target's appearance. This layer probabilistically adapts to the target's geometric deformation, while its structure is updated by removing and adding local patches. The addition of patches is constrained by the global layer, which probabilistically models the target's global visual properties such as color, shape and apparent local motion. The global visual properties are updated during tracking using the stable patches from the local layer. Through this coupled-constraint paradigm between the adaptation of the global and local layers, we achieve more robust tracking through significant appearance changes. Indeed, experimental results on challenging sequences confirm that our tracker outperforms related state-of-the-art trackers, with a smaller failure rate as well as better accuracy.
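A condensed sketch of the two coupled layers, assuming a colour-histogram global model and median voting in place of the paper's probabilistic machinery; all thresholds and update rates are illustrative.

```python
# A minimal sketch of a coupled local/global appearance model.
import numpy as np

rng = np.random.default_rng(2)

class CoupledLayerTracker:
    def __init__(self, patch_offsets, global_hist):
        self.offsets = patch_offsets      # local layer: patch positions
        self.global_hist = global_hist    # global layer: colour histogram

    def track(self, patch_votes):
        # Each patch votes for a target centre; a robust (median) combination
        # stands in for the paper's probabilistic adaptation.
        return np.median(patch_votes, axis=0)

    def update(self, patch_hists, threshold=0.5):
        # Remove patches inconsistent with the global appearance, keeping the
        # two layers coupled; histogram intersection serves as similarity.
        sims = [np.minimum(h, self.global_hist).sum() for h in patch_hists]
        keep = [i for i, s in enumerate(sims) if s > threshold]
        self.offsets = self.offsets[keep]
        # Stable patches in turn update the global colour model.
        if keep:
            self.global_hist = 0.9 * self.global_hist + \
                0.1 * np.mean([patch_hists[i] for i in keep], axis=0)
        return keep

tracker = CoupledLayerTracker(np.array([[0, 0], [5, 0], [0, 5]]), np.ones(8) / 8)
print(tracker.track(np.array([[10., 12.], [11., 12.], [10., 13.]])))
print("kept patches:", tracker.update([rng.dirichlet(np.ones(8)) for _ in range(3)]))
```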
Citations: 107
Trajectory reconstruction from non-overlapping surveillance cameras with relative depth ordering constraints
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126334
B. Micusík
We present a method for reconstructing the trajectory of an object moving in front of non-overlapping, fully or partially calibrated cameras. The non-overlapping setup renders the problem ill-posed, as none of the point correspondences required for the well-known point triangulation can be established. The proposed solution instead builds on trajectory smoothness and depth-ordering prior information. We propose a novel formulation with a consistent minimization criterion and a way to exploit the depth-ordering prior reflected by the size change of the bounding box associated with a tracked image point. Reconstructing the trajectory by minimizing its smoothness term and re-projection error while employing the depth priors is cast as a Second Order Cone Program, yielding a global optimum. The new formulation, together with the proposed depth prior, significantly improves trajectory reconstruction in terms of accuracy and topology, and speeds up the solver. Synthetic and real experiments validate the feasibility of the proposed approach.
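The optimisation can be sketched compactly with CVXPY (an assumed dependency): points are constrained to lie on known back-projected camera rays as a stand-in for the re-projection term, a second-difference norm encodes trajectory smoothness, and bounding-box growth supplies depth-ordering inequalities. The ray geometry and the uniform ordering pattern are synthetic assumptions.

```python
# A minimal SOCP sketch of smoothness + depth-ordering trajectory recovery.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
T = 20
centers = rng.normal(size=(T, 3))                    # camera centres
dirs = rng.normal(size=(T, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit viewing rays

d = cp.Variable(T)                                   # depth along each ray
X = cp.vstack([centers[t] + d[t] * dirs[t] for t in range(T)])

# Smoothness: penalise second differences of the 3D trajectory.
smooth = cp.sum(cp.norm(X[2:] - 2 * X[1:-1] + X[:-2], axis=1))

# Depth-ordering prior: a growing bounding box implies the object comes
# closer, i.e. depth must not increase (assumed here for every frame).
constraints = [d >= 0.1, d[1:] <= d[:-1]]

prob = cp.Problem(cp.Minimize(smooth), constraints)
prob.solve()
print("status:", prob.status, "first depths:", np.round(d.value[:5], 2))
```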
Citations: 7
Video from a single coded exposure photograph using a learned over-complete dictionary
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126254
Y. Hitomi, Jinwei Gu, Mohit Gupta, T. Mitsunaga, S. Nayar
Cameras face a fundamental tradeoff between spatial and temporal resolution: digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing and reconstructing the space-time volume that overcome this tradeoff. Our approach has two important distinctions from previous work: (1) we achieve sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to the practical constraints on the sampling scheme imposed by the architectures of present image sensor devices. Consequently, our sampling scheme can be implemented on image sensors with a straightforward modification to the control unit. To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image while maintaining high spatial resolution.
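The reconstruction step admits a toy sketch: a vectorised space-time patch is recovered from one coded measurement by orthogonal matching pursuit over an over-complete dictionary. Here the dictionary and the per-pixel exposure code are random stand-ins for the learned and designed ones in the paper.

```python
# A toy sketch of sparse space-time recovery from one coded photograph.
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(4)
p, T, n_atoms = 49, 9, 500             # 7x7 patch, 9 frames, dictionary size

D = rng.normal(size=(p * T, n_atoms))
D /= np.linalg.norm(D, axis=0)          # unit-norm dictionary atoms

# Ground-truth patch: sparse in the dictionary by construction.
alpha_true = np.zeros(n_atoms)
alpha_true[rng.choice(n_atoms, 5, replace=False)] = rng.normal(size=5)
volume = D @ alpha_true                 # vectorised space-time volume

# Per-pixel coded exposure: each pixel integrates a random subset of frames,
# so the camera records only p values for the p*T unknowns.
code = rng.integers(0, 2, size=(p, T))
Phi = np.zeros((p, p * T))
for i in range(p):
    Phi[i, i::p] = code[i]              # frame t of pixel i sits at index t*p + i
y = Phi @ volume                        # the single captured photograph

# Sparse recovery: OMP estimates the coefficients, then the volume.
alpha_hat = orthogonal_mp(Phi @ D, y, n_nonzero_coefs=5)
volume_hat = D @ alpha_hat
print("relative error:", np.linalg.norm(volume_hat - volume) / np.linalg.norm(volume))
```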
Citations: 233
Random ensemble metrics for object recognition
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126466
Tatsuo Kozakaya, S. Ito, Susumu Kubota
This paper presents a novel and generic approach to metric learning, random ensemble metrics (REMetric). To improve generalization performance, we introduce the concept of ensemble learning into the metric learning scheme. Unlike previous methods, our method does not optimize a global objective function over the whole training data. It learns multiple discriminative projection vectors obtained from linear support vector machines (SVM) trained on randomly subsampled training data. The final metric matrix is then obtained by integrating these vectors. As a result of using SVMs, the learned metric scales well with the dimensionality of the features; it therefore does not require any prior dimensionality-reduction technique such as PCA. Moreover, our method allows us to unify dimensionality reduction and metric learning by controlling the number of projection vectors. We demonstrate through experiments that our method can avoid overfitting even when a relatively small amount of training data is provided. The experiments are performed on three different datasets: the Viewpoint Invariant Pedestrian Recognition (VIPeR) dataset, the Labeled Faces in the Wild (LFW) dataset and the Oxford 102 category flower dataset. The results show that our method achieves performance equivalent or superior to existing state-of-the-art metric learning methods.
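A minimal sketch of the ensemble construction, assuming binary labels and arbitrary sizes: linear SVMs are trained on random subsamples and their normalised weight vectors are stacked into the projection matrix that induces the metric. The paper's multi-class sampling scheme is not reproduced.

```python
# A minimal sketch of REMetric-style ensemble metric construction.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 64))
y = rng.integers(0, 2, size=400)        # assumed binary labels

n_views, sub = 30, 120
W = []
for _ in range(n_views):
    idx = rng.choice(len(X), sub, replace=False)        # random subsample
    svm = LinearSVC(max_iter=5000).fit(X[idx], y[idx])
    w = svm.coef_.ravel()
    W.append(w / np.linalg.norm(w))                     # one projection vector
W = np.array(W)                                         # shape (n_views, 64)

def remetric_dist(a, b):
    """Distance under the ensemble metric M = W^T W (Mahalanobis-like)."""
    z = W @ (a - b)
    return float(z @ z)

print(remetric_dist(X[0], X[1]))
```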
Citations: 21
Isotonic CCA for sequence alignment and activity recognition
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126545
Shahriar Shariat, V. Pavlovic
This paper presents an approach to sequence alignment based on canonical correlation analysis (CCA). We show that a novel set of constraints imposed on traditional CCA leads to canonical solutions with the time-warping property, i.e., non-decreasing monotonicity in time. This formulation generalizes the more traditional dynamic time warping (DTW) solutions to cases where the alignment is accomplished on arbitrary subsequence segments, optimally determined from the data, instead of on individual sequence samples. We then introduce a robust and efficient algorithm that finds such alignments using non-negative least-squares reductions. Experimental results show that this new method, when applied to MOCAP activity recognition problems, can yield improved recognition accuracy.
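The monotonicity constraint at the heart of the method can be illustrated in isolation: noisy frame-correspondence estimates between two sequences are projected onto a non-decreasing alignment. Isotonic regression is used here as a simple stand-in for the paper's constrained-CCA / non-negative least-squares formulation.

```python
# A small sketch of enforcing the non-decreasing time-warping property.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(6)
n = 50
true_warp = np.cumsum(rng.uniform(0.5, 1.5, size=n))   # ground-truth mapping
noisy = true_warp + rng.normal(scale=2.0, size=n)       # raw, non-monotone

# Project onto monotone alignments: time must never run backwards.
warp = IsotonicRegression(increasing=True).fit_transform(np.arange(n), noisy)

assert np.all(np.diff(warp) >= 0)
print("worst backward step before projection:", np.min(np.diff(noisy)).round(2))
```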
Citations: 37
Robust unsupervised motion pattern inference from video and applications
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126308
Xuemei Zhao, G. Medioni
We propose an unsupervised learning framework to infer motion patterns in videos and, in turn, use them to improve the tracking of moving objects in sequences from static cameras. Based on tracklets, we use the manifold learning method Tensor Voting to infer local geometric structures in (x, y) space, and embed tracklet points into (x, y, θ) space, where θ represents the motion direction. In this space, points automatically form intrinsic manifold structures, each of which corresponds to a motion pattern. To define each group, a novel robust manifold grouping algorithm is proposed. Tensor Voting is performed to provide multiple geometric cues, which formulate multiple similarity kernels between any pair of points, and a spectral clustering technique is used in this multiple-kernel setting. The grouping algorithm achieves better performance than state-of-the-art methods in our applications. Extracted motion patterns can then be used as a prior to improve the performance of any object tracker; they are especially useful for reducing false alarms and ID switches. Experiments are performed on challenging real-world sequences, and a quantitative analysis of the results shows that the framework effectively improves a state-of-the-art tracker.
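The grouping stage can be sketched as spectral clustering on a product of two similarity kernels, one spatial and one directional, over points embedded in (x, y, θ). The kernel bandwidths and the synthetic tracklets are illustrative assumptions, and angle wrap-around is ignored for brevity.

```python
# A compact sketch of multiple-kernel spectral grouping in (x, y, theta).
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(7)

# Two synthetic motion patterns: right-moving and up-moving tracklet points.
pts_a = np.column_stack([rng.uniform(0, 10, 100), rng.uniform(0, 10, 100),
                         rng.normal(0.0, 0.1, 100)])         # theta ~ 0
pts_b = np.column_stack([rng.uniform(0, 10, 100), rng.uniform(0, 10, 100),
                         rng.normal(np.pi / 2, 0.1, 100)])   # theta ~ pi/2
P = np.vstack([pts_a, pts_b])

xy, theta = P[:, :2], P[:, 2]
d_xy = np.linalg.norm(xy[:, None] - xy[None], axis=-1)
d_th = np.abs(theta[:, None] - theta[None])

# Multiple-kernel affinity: spatial and directional similarities combined.
K = np.exp(-d_xy**2 / 8.0) * np.exp(-d_th**2 / 0.5)

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(K)
print("pattern sizes:", np.bincount(labels))
```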
Citations: 31
Annotator rationales for visual recognition
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126394
Jeff Donahue, K. Grauman
Traditional supervised visual learning simply asks annotators "what" label an image should have. We propose an approach for image classification problems requiring subjective judgment that also asks "why", and uses that information to enrich the learned model. We develop two forms of visual annotator rationales: in the first, the annotator highlights the spatial region of interest he found most influential for the selected label, and in the second, he comments on the visual attributes that were most important. In either case, we show how to map the response to synthetic contrast examples, and then exploit an existing large-margin learning technique to refine the decision boundary accordingly. Results on multiple scene categorization and human attractiveness tasks show the promise of our approach, which can more accurately learn complex categories given the explanations behind the label choices.
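The contrast-example mapping can be sketched in the spirit of Zaidan-style rationale SVMs: for each positive example, a copy with the rationale features masked out should score lower, which is encoded as auxiliary difference-vector training examples. The features, masked dimensions and weights below are assumptions, not the paper's exact construction.

```python
# A small sketch of rationale-driven contrast examples for a linear SVM.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(8)
d = 32
X_pos = rng.normal(1.0, 1.0, size=(50, d))
X_neg = rng.normal(-1.0, 1.0, size=(50, d))

# Contrast examples: zero out the (assumed) rationale dimensions.
rationale_dims = np.arange(8)
X_contrast = X_pos.copy()
X_contrast[:, rationale_dims] = 0.0

# w.x > w.x_contrast becomes w.(x - x_contrast) > 0, so the differences
# are added as auxiliary positive examples with a softer weight.
X_train = np.vstack([X_pos, X_neg, X_pos - X_contrast])
y_train = np.concatenate([np.ones(50), -np.ones(50), np.ones(50)])
weights = np.concatenate([np.ones(100), 0.5 * np.ones(50)])

clf = LinearSVC(max_iter=10000).fit(X_train, y_train, sample_weight=weights)
print("weight mass on rationale dims:",
      np.abs(clf.coef_.ravel()[rationale_dims]).sum().round(2))
```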
Citations: 76
Fisher Discrimination Dictionary Learning for sparse representation
Pub Date: 2011-11-06 | DOI: 10.1109/ICCV.2011.6126286
Meng Yang, Lei Zhang, Xiangchu Feng, D. Zhang
Sparse representation based classification has led to interesting image recognition results, and the dictionary used for sparse coding plays a key role in it. This paper presents a novel dictionary learning (DL) method to improve pattern classification performance. Based on the Fisher discrimination criterion, a structured dictionary, whose atoms correspond to the class labels, is learned so that the reconstruction error after sparse coding can be used for pattern classification. Meanwhile, the Fisher discrimination criterion is imposed on the coding coefficients so that they have small within-class scatter but large between-class scatter. A new classification scheme associated with the proposed Fisher discrimination DL (FDDL) method is then presented, using both the discriminative information in the reconstruction error and the sparse coding coefficients. The proposed FDDL is extensively evaluated on benchmark image databases in comparison with existing sparse representation and DL based classification methods.
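The flavour of the objective can be sketched without an optimiser: the class-wise fidelity term is simplified here to a global reconstruction error, plus an l1 sparsity term and a Fisher term that rewards small within-class and large between-class scatter of the coding coefficients. Shapes and weights are illustrative assumptions.

```python
# A sketch of an FDDL-style objective (evaluation only, no learning).
import numpy as np

def fddl_objective(X, A, D, labels, lam1=1.0, lam2=1.0):
    """X: (d, n) signals, A: (k, n) codes, D: (d, k) structured dictionary."""
    recon = np.linalg.norm(X - D @ A) ** 2           # simplified fidelity term
    sparse = lam1 * np.abs(A).sum()                   # l1 sparsity
    mean_all = A.mean(axis=1, keepdims=True)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        Ac = A[:, labels == c]
        mc = Ac.mean(axis=1, keepdims=True)
        within += np.linalg.norm(Ac - mc) ** 2        # within-class scatter
        between += Ac.shape[1] * np.linalg.norm(mc - mean_all) ** 2
    fisher = lam2 * (within - between)                # discrimination term
    return recon + sparse + fisher

rng = np.random.default_rng(9)
X = rng.normal(size=(20, 30))
D = rng.normal(size=(20, 40))
A = rng.normal(size=(40, 30))
labels = np.repeat([0, 1, 2], 10)
print(fddl_objective(X, A, D, labels))
```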
Citations: 972