SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition

Haotong Zhang, A. Berg, M. Maire, Jitendra Malik
{"title":"SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition","authors":"Haotong Zhang, A. Berg, M. Maire, Jitendra Malik","doi":"10.1109/CVPR.2006.301","DOIUrl":null,"url":null,"abstract":"We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech- 101). On Caltech-101 we achieved a correct classification rate of 59.05%(±0.56%) at 15 training images per class, and 66.23%(±0.48%) at 30 training images.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1334","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2006.301","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1334

Abstract

We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05% (±0.56%) at 15 training images per class, and 66.23% (±0.48%) at 30 training images.
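The basic idea stated in the abstract (find the query's close neighbors under a chosen distance function, then train a local SVM on the pairwise distances among those neighbors) can be summarized in a short sketch. The following is a minimal illustration assuming scikit-learn and NumPy; the function name svm_knn_predict, the parameters k and gamma, and the exponential distance-to-kernel conversion are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch of the local SVM-on-neighbors idea described in the abstract.
# Assumes scikit-learn is available; the distance-to-kernel step is illustrative.
import numpy as np
from sklearn.svm import SVC

def svm_knn_predict(query, train_X, train_y, distance, k=10, gamma=1.0):
    """Classify `query` with a local SVM trained on its k nearest neighbors."""
    train_y = np.asarray(train_y)

    # 1. Distances from the query to every training sample under the chosen metric.
    d_query = np.array([distance(query, x) for x in train_X])
    nn_idx = np.argsort(d_query)[:k]
    labels = train_y[nn_idx]

    # Shortcut: if all k neighbors share one label, no SVM is needed.
    if len(np.unique(labels)) == 1:
        return labels[0]

    # 2. Pairwise distance matrix restricted to the neighbors.
    neighbors = [train_X[i] for i in nn_idx]
    D = np.array([[distance(a, b) for b in neighbors] for a in neighbors])

    # 3. Convert distances into a kernel (an RBF-style conversion here, as an
    #    illustrative stand-in) and train a local multiclass SVM on it.
    K = np.exp(-gamma * D)
    local_svm = SVC(kernel="precomputed").fit(K, labels)

    # 4. Kernel values between the query and the neighbors, then predict.
    k_query = np.exp(-gamma * d_query[nn_idx]).reshape(1, -1)
    return local_svm.predict(k_query)[0]
```

Any distance callable can be plugged in, which mirrors the abstract's point that a wide variety of distance functions can be used within the same framework.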