Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238349
S. Mahamud, M. Hebert
The optimal distance measure for a given discrimination task under the nearest neighbor framework has been shown to be the likelihood that a pair of measurements have different class labels [S. Mahamud et al., (2002)]. For implementation and efficiency considerations, the optimal distance measure was approximated by combining more elementary distance measures defined on simple feature spaces. We address two important issues that arise in practice for such an approach: (a) What form should the elementary distance measure in each feature space take? We motivate the need to use the optimal distance measure in simple feature spaces as the elementary distance measures; such distance measures have the desirable property that they are invariant to distance-respecting transformations. (b) How do we combine the elementary distance measures? We present the precise statistical assumptions under which a linear logistic model holds exactly. We benchmark our model with three other methods on a challenging face discrimination task and show that our approach is competitive with the state of the art.
Title: Minimum risk distance measure for object recognition. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
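The linear logistic combination referred to in this abstract can be sketched compactly: the probability that a pair of measurements has different class labels is modeled as a sigmoid of a weighted sum of elementary distances. The weights and bias below are illustrative placeholders, not values or names from the paper.

```python
import numpy as np

def combine_distances(elementary, weights, bias):
    """Combine per-feature-space distances with a linear logistic model.

    elementary: (n_pairs, n_spaces) array of elementary distances d_k(x, y).
    Returns P(different labels | distances), usable as a distance measure.
    """
    z = elementary @ weights + bias
    return 1.0 / (1.0 + np.exp(-z))
```

A pair with small elementary distances maps to a low probability of differing labels (small "distance"), and large elementary distances map toward 1.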
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238361
Stella X. Yu, Jianbo Shi
We propose a principled account of multiclass spectral clustering. Given a discrete clustering formulation, we first solve a relaxed continuous optimization problem by eigen-decomposition. We clarify the role of eigenvectors as a generator of all optimal solutions through orthonormal transforms. We then solve an optimal discretization problem, which seeks a discrete solution closest to the continuous optima. The discretization is efficiently computed in an iterative fashion using singular value decomposition and nonmaximum suppression. The resulting discrete solutions are nearly global-optimal. Our method is robust to random initialization and converges faster than other clustering methods. Experiments on real image segmentation are reported.
Title: Multiclass spectral clustering. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
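The discretization step this abstract describes, alternating nonmaximum suppression with an SVD-based rotation, can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, using a deterministic identity initialization rather than the random initialization to which the paper reports robustness.

```python
import numpy as np

def discretize(X, n_iter=20):
    """Find a discrete partition close to the continuous spectral optimum X
    (n rows, k columns of eigenvectors) by alternating nonmaximum suppression
    with an SVD-based orthonormal rotation, as the abstract outlines."""
    n, k = X.shape
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # project rows to the sphere
    R = np.eye(k)                                     # simple deterministic init
    labels = X.argmax(axis=1)
    for _ in range(n_iter):
        Y = X @ R
        labels = Y.argmax(axis=1)                     # nonmaximum suppression
        Xd = np.zeros((n, k))
        Xd[np.arange(n), labels] = 1.0                # discrete indicator matrix
        U, _, Vt = np.linalg.svd(Xd.T @ X)            # orthogonal Procrustes step
        R = (U @ Vt).T                                # rotation aligning X with Xd
    return labels
```

Each iteration rounds the rotated continuous solution to an indicator matrix, then re-solves for the orthonormal transform that best aligns the continuous solution with that indicator matrix.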
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238438
T. Gevers
We aim at using color information to classify the physical nature of edges in video. To achieve physics-based edge classification, we first propose a novel approach to color edge detection by automatic noise-adaptive thresholding derived from sensor noise analysis. Then, we present a taxonomy of color edge types. As a result, a parameter-free edge classifier is obtained by labeling color transitions into one of the following types: (1) shadow-geometry edges, (2) highlight edges, (3) material edges. The proposed method is empirically verified on images showing complex real world scenes.
Title: Reflectance-based classification of color edges. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
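A minimal sketch in the spirit of the noise-adaptive thresholding this abstract mentions: the edge threshold is derived from an assumed sensor noise level by propagating it through the gradient operator, rather than being hand-tuned. The propagation constants and the factor k=3 are our assumptions for a single-channel illustration, not the paper's estimator.

```python
import numpy as np

def noise_adaptive_edges(img, sigma_sensor, k=3.0):
    """Detect edges with a threshold derived from the sensor noise level.

    Central differences have coefficients +-1/2, so iid sensor noise of std
    sigma_sensor yields gradient components with std sigma_sensor/sqrt(2);
    k=3 suppresses the vast majority of pure-noise responses."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    sigma_grad = sigma_sensor / np.sqrt(2.0)  # noise std of each gradient component
    return mag > k * sigma_grad
```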
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238462
M. Ben-Ezra, S. Nayar
The perception of transparent objects from images is known to be a very hard problem in vision. Given a single image, it is difficult to even detect the presence of transparent objects in the scene. In this paper, we explore what can be said about transparent objects by a moving observer. We show how features that are imaged through a transparent object behave differently from those that are rigidly attached to the scene. We present a novel model-based approach to recover the shapes and the poses of transparent objects from known motion. The objects can be complex in that they may be composed of multiple layers with different refractive indices. We have conducted numerous simulations to verify the practical feasibility of our algorithm. We have applied it to real scenes that include transparent objects and recovered the shapes of the objects with high accuracy.
Title: What does motion reveal about transparency? Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238445
Zoran Duric, Fayin Li, H. Wechsler, V. Cherkassky
This paper describes a novel application of statistical learning theory (SLT) to control model complexity in flow estimation. SLT provides analytical generalization bounds suitable for practical model selection from small and noisy data sets of image measurements (normal flow). The method addresses the aperture problem by using the penalized risk (ridge regression). We demonstrate an application of this method on both synthetic and real image sequences and use it for motion interpolation and extrapolation. Our experimental results show that our approach compares favorably against alternative model selection methods such as Akaike's final prediction error, Schwartz's criterion, generalized cross-validation, and Shibata's model selector.
Title: Controlling model complexity in flow estimation. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
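The aperture problem and the penalized-risk (ridge) remedy mentioned in this abstract can be illustrated with a single-patch solve: only the flow component along each image gradient direction is observed, so the normal-flow system becomes ill-conditioned when the gradient directions are nearly parallel, and a ridge penalty stabilizes it. The variable names and the closed-form solve below are ours, not the paper's.

```python
import numpy as np

def ridge_flow(normals, magnitudes, lam=1e-2):
    """Estimate a single 2D flow vector w from normal-flow measurements
    n_i . w ~= b_i by minimizing sum_i (n_i . w - b_i)^2 + lam * ||w||^2.

    normals:    (m, 2) unit gradient directions.
    magnitudes: (m,) measured normal-flow magnitudes.
    """
    N = np.asarray(normals, float)
    b = np.asarray(magnitudes, float)
    A = N.T @ N + lam * np.eye(2)   # ridge term keeps A well-conditioned
    return np.linalg.solve(A, N.T @ b)
```

With gradient directions spread over many orientations the penalty barely matters; with nearly parallel gradients it keeps the solve from blowing up along the unobserved direction.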
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238407
Gyuri Dorkó, C. Schmid
We introduce a novel method for constructing and selecting scale-invariant object parts. Scale-invariant local descriptors are first grouped into basic parts. A classifier is then learned for each of these parts, and feature selection is used to determine the most discriminative ones. This approach allows robust part detection, and it is invariant under scale changes; that is, neither the training images nor the test images have to be normalized. The proposed method is evaluated in car detection tasks with significant variations in viewing conditions, and promising results are demonstrated. Different local regions, classifiers and feature selection methods are quantitatively compared. Our evaluation shows that local invariant descriptors are an appropriate representation for object classes such as cars, and it underlines the importance of feature selection.
Title: Selection of scale-invariant parts for object class recognition. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238466
Zhang Tao, D. Freedman
We present a novel method for tracking objects by combining density matching with shape priors. Density matching is a tracking method which operates by maximizing the Bhattacharyya similarity measure between the photometric distribution from an estimated image region and a model photometric distribution. Such trackers can be expressed as PDE-based curve evolutions, which can be implemented using level sets. Shape priors can be combined with this level-set implementation of density matching by representing the shape priors as a series of level sets; a variational approach allows for a natural, parametrization-independent shape term to be derived. Experimental results on real image sequences are shown.
Title: Tracking objects using density matching and shape priors. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
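For reference, the Bhattacharyya similarity measure that this tracker maximizes between the candidate region's photometric distribution and the model distribution is, for discrete histograms, the sum of elementwise square-rooted products:

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two histograms.

    Both inputs are normalized to sum to 1; the result lies in [0, 1],
    reaching 1 for identical distributions and 0 for disjoint support.
    """
    p = np.asarray(p, float)
    q = np.asarray(q, float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))
```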
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238465
O. Camps, Hwasup Lim, M. C. Mazzaro, M. Sznaier
A requirement common to most dynamic vision applications is the ability to track objects in a sequence of frames. This problem has been extensively studied in the past few years, leading to several techniques, such as unscented particle filter based trackers, that exploit a combination of the (assumed) target dynamics, empirically learned noise distributions and past position observations. While successful in many scenarios, these trackers remain fragile to occlusion and model uncertainty in the target dynamics. As we show in this paper, these difficulties can be addressed by modeling the dynamics of the target as an unknown operator that satisfies certain interpolation conditions. Results from interpolation theory can then be used to find this operator by solving a convex optimization problem. As illustrated with several examples, combining this operator with Kalman and UPF techniques leads to both robustness improvement and computational complexity reduction.
Title: A Caratheodory-Fejer approach to robust multiframe tracking. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238630
Matthew A. Brown, D. Lowe
The problem considered in this paper is the fully automatic construction of panoramas. Fundamentally, this problem requires recognition, as we need to know which parts of the panorama join up. Previous approaches have used human input or restrictions on the image sequence for the matching step. In this work we use object recognition techniques based on invariant local features to select matching images, and a probabilistic model for verification. Because of this our method is insensitive to the ordering, orientation, scale and illumination of the images. It is also insensitive to 'noise' images which are not part of the panorama at all, that is, it recognises panoramas. This suggests a useful application for photographers: the system takes as input the images on an entire flash card or film, recognises images that form part of a panorama, and stitches them with no user input whatsoever.
Title: Recognising panoramas. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.
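The matching step behind this kind of panorama recognition can be sketched as counting nearest-neighbor matches of local invariant descriptors that pass a distance-ratio test; image pairs with enough matches are candidates for belonging to the same panorama. The brute-force search and the 0.8 ratio below are our simplifications (the paper's pipeline also includes probabilistic verification of candidate matches).

```python
import numpy as np

def match_count(desc_a, desc_b, ratio=0.8):
    """Count putative matches between two images' descriptor sets.

    A descriptor in desc_a matches desc_b when its nearest neighbor is
    clearly closer than its second-nearest (distance-ratio test).
    desc_a: (m, d), desc_b: (n, d) with n >= 2.
    """
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :2]             # two nearest neighbors
    best = d[np.arange(len(desc_a)), idx[:, 0]]
    second = d[np.arange(len(desc_a)), idx[:, 1]]
    return int(np.sum(best < ratio * second))
```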
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238363
R. Okada, Y. Taniguchi, K. Furukawa, K. Onoguchi
We present a novel method for detecting vehicles as obstacles in various road scenes using a single onboard camera. Vehicles are detected by testing whether the motion of a set of three horizontal line segments, which are always on the vehicles, satisfies the motion constraint of the ground plane or that of the surface plane of the vehicles. The motion constraint of each plane is derived from the projective invariant combined with the vanishing line of the plane, which is prior knowledge of road scenes. The proposed method is implemented in a newly developed onboard LSI. Experimental results for real road scenes under various conditions show the effectiveness of the proposed method.
Title: Obstacle detection using projective invariant and vanishing lines. Published in: Proceedings Ninth IEEE International Conference on Computer Vision.