
2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06): Latest Publications

Real-Time Semi-Automatic Segmentation Using a Bayesian Network
Eric N. Mortensen, J. Jia
This paper presents a semi-automatic segmentation technique called Bayesian cut that formulates object boundary detection as the most probable explanation (MPE) of a Bayesian network’s joint probability distribution. A two-layer Bayesian network structure is formulated from a planar graph representing a watershed segmentation of an image. The network’s prior probabilities encode the confidence that an edge in the planar graph belongs to an object boundary while the conditional probability tables (CPTs) enforce global contour properties of closure and simplicity (i.e., no self-intersections). Evidence, in the form of one or more connected boundary points, allows the network to compute the MPE with minimal user guidance. The constraints imposed by CPTs also permit a linear-time algorithm to compute the MPE, which in turn allows for interactive segmentation where every mouse movement recomputes the MPE based on the current cursor position and displays the corresponding segmentation.
DOI: https://doi.org/10.1109/CVPR.2006.239 (published 2006-06-17)
Citations: 32
Measure Locally, Reason Globally: Occlusion-sensitive Articulated Pose Estimation
L. Sigal, Michael J. Black
Part-based tree-structured models have been widely used for 2D articulated human pose estimation. These approaches admit efficient inference algorithms while capturing the important kinematic constraints of the human body as a graphical model. These methods often fail, however, when multiple body parts fit the same image region, resulting in global pose estimates that poorly explain the overall image evidence. Attempts to solve this problem have focused on the use of strong prior models that are limited to learned activities such as walking. We argue that the problem actually lies with the image observations and not with the prior. In particular, image evidence for each body part is estimated independently of other parts without regard to self-occlusion. To address this, we introduce occlusion-sensitive local likelihoods that approximate the global image likelihood using per-pixel hidden binary variables that encode the occlusion relationships between parts. This occlusion reasoning introduces interactions between non-adjacent body parts, creating loops in the underlying graphical model. We deal with this using an extension of an approximate belief propagation algorithm (PAMPAS). The algorithm recovers the real-valued 2D pose of the body in the presence of occlusions, does not require strong priors over body pose, and does a quantitatively better job of explaining image evidence than previous methods.
DOI: https://doi.org/10.1109/CVPR.2006.180 (published 2006-06-17)
Citations: 260
Shape Guided Object Segmentation
Eran Borenstein, Jitendra Malik
We construct a Bayesian model that integrates top-down with bottom-up criteria, capitalizing on their relative merits to obtain figure-ground segmentation that is shape-specific and texture invariant. A hierarchy of bottom-up segments in multiple scales is used to construct a prior on all possible figure-ground segmentations of the image. This prior is used by our top-down part to query and detect object parts in the image using stored shape templates. The detected parts are integrated to produce a global approximation for the object’s shape, which is then used by an inference algorithm to produce the final segmentation. Experiments with a large sample of horse and runner images demonstrate strong figure-ground segmentation despite high object and background variability. The segmentations are robust to changes in appearance since the matching component depends on shape criteria alone. The model may be useful for additional visual tasks requiring labeling, such as the segmentation of multiple scene objects.
DOI: https://doi.org/10.1109/CVPR.2006.276 (published 2006-06-17)
Citations: 123
3D Face Recognition Using 3D Alignment for PCA
T. Russ, Chris Boehnen, Tanya Peters
This paper presents a 3D approach for recognizing faces based on Principal Component Analysis (PCA). The approach addresses the issue of proper 3D face alignment required by PCA for maximum data compression and good generalization performance for new untrained faces. This issue has traditionally been addressed by 2D data normalization, a step that eliminates 3D object size information important for the recognition process. We achieve correspondence of facial points by registering a 3D face to a scaled generic 3D reference face and subsequently performing a surface normal search. 3D scaling of the generic reference face is performed to enable better alignment of facial points while preserving important 3D size information in the input face. The benefits of this approach for 3D face recognition and dimensionality reduction have been demonstrated on components of the Face Recognition Grand Challenge (FRGC) database versions 1 and 2.
DOI: https://doi.org/10.1109/CVPR.2006.13 (published 2006-06-17)
Citations: 93
Hierarchical Statistical Learning of Generic Parts of Object Structure
S. Fidler, Gregor Berginc, A. Leonardis
With the growing interest in object categorization, various methods have emerged that perform well in this challenging task, yet are inherently limited to only a moderate number of object classes. In pursuit of a more general categorization system, this paper proposes a way to overcome the computational complexity encompassing the enormous number of different object categories by exploiting the statistical properties of the highly structured visual world. Our approach proposes a hierarchical acquisition of generic parts of object structure, varying from simple to more complex ones, which stem from the favorable statistics of natural images. The parts recovered in the individual layers of the hierarchy can be used in a top-down manner, resulting in a robust statistical engine that could be efficiently used within many of the current categorization systems. The proposed approach has been applied to large image datasets, yielding important statistical insights into the generic parts of object structure.
DOI: https://doi.org/10.1109/CVPR.2006.134 (published 2006-06-17)
Citations: 53
Feature Selection for Evaluating Fluorescence Microscopy Images in Genome-Wide Cell Screens
V. Kovalev, N. Harder, B. Neumann, Michael Held, U. Liebel, H. Erfle, J. Ellenberg, R. Eils, K. Rohr
We investigate different approaches for efficient feature space reduction and compare different methods for cell classification. The application context is the development of automatic methods for analysing fluorescence microscopy images with the goal of identifying those genes that are involved in the mitosis (cell division) of human cells. We distinguish four cell classes comprising interphase cells, mitotic cells, apoptotic cells, and cells with clustered nuclei. Feature space reduction was performed using the Principal Component Analysis and Independent Component Analysis methods. Six classification methods were examined, including unsupervised clustering algorithms such as K-means, Hard Competitive Learning, and Neural Gas as well as Hierarchical Clustering, Support Vector Machines, and Random Forests classifiers. Detailed results on the cell image classification accuracy and computational efficiency achieved using different feature sets and different classification methods are reported.
DOI: https://doi.org/10.1109/CVPR.2006.121 (published 2006-06-17)
Citations: 17
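Of the clustering methods named in the abstract above, K-means is the simplest to illustrate. The following is a minimal pure-Python sketch on made-up 2D feature vectors. The data, the function name `kmeans`, and the choice of the first k points as initial centers are assumptions for illustration only, not the authors' setup.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (a list of equal-length tuples) into k groups.

    Initial centers are simply the first k points (an illustrative choice).
    Returns (centers, labels).
    """
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels


# Hypothetical 2D feature vectors forming two well-separated groups.
features = [(0.0, 0.1), (5.0, 5.2), (0.2, 0.0), (5.1, 4.9), (0.1, 0.2), (4.9, 5.0)]
centers, labels = kmeans(features, k=2)
```

In practice the input would be the reduced feature vectors (for example after PCA), and a less naive initialization would usually be preferred.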
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition
Haotong Zhang, A. Berg, M. Maire, Jitendra Malik
We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines, but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used, and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05% (±0.56%) at 15 training images per class, and 66.23% (±0.48%) at 30 training images.
DOI: https://doi.org/10.1109/CVPR.2006.301 (published 2006-06-17)
Citations: 1334
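The basic idea stated in the SVM-KNN abstract above (find close neighbors of a query, then train a local support vector machine on them) can be sketched roughly as follows. This is a simplified illustration under several assumptions that do not come from the abstract: Euclidean distance, two classes labeled +1 and -1, a linear SVM fitted by plain full-batch subgradient descent, and an early exit when all neighbors agree. The function names, hyperparameters, and data are hypothetical.

```python
import math


def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Fit (w, b) by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels in ys must be +1 or -1."""
    dim = len(xs[0])
    n = len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # sample violates the margin
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_knn_predict(train_x, train_y, query, k=5):
    # Step 1: indices of the k training samples closest to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    idx = order[:k]
    neighbor_labels = {train_y[i] for i in idx}
    # Step 2 (shortcut in this sketch): unanimous neighbors need no SVM.
    if len(neighbor_labels) == 1:
        return neighbor_labels.pop()
    # Step 3: train a local SVM on the neighbors only and apply it to the query.
    w, b = train_linear_svm([train_x[i] for i in idx], [train_y[i] for i in idx])
    score = sum(wi * qi for wi, qi in zip(w, query)) + b
    return 1 if score >= 0 else -1


# Hypothetical 2D samples: class -1 near the origin, class +1 near (5, 5),
# plus one distant sample per class that never enters the neighborhood.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (20, 20), (-20, -20)]
train_y = [-1, -1, -1, 1, 1, 1, 1, -1]

near_origin = svm_knn_predict(train_x, train_y, (0.2, 0.2), k=3)  # neighbors agree
in_between = svm_knn_predict(train_x, train_y, (4.5, 4.5), k=6)   # local SVM is trained
```

The point of the design is that the expensive SVM optimization only ever sees k samples, so its cost does not grow with the size of the full training set.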
Modeling Correspondences for Multi-Camera Tracking Using Nonlinear Manifold Learning and Target Dynamics
Vlad I. Morariu, O. Camps
Multi-camera tracking systems often must maintain consistent identity labels of the targets across views to recover 3D trajectories and take full advantage of the additional information available from the multiple sensors. Previous approaches to the "correspondence across views" problem include matching features, using camera calibration information, and computing homographies between views under the assumption that the world is planar. However, it can be difficult to match features across significantly different views. Furthermore, calibration information is not always available, and the planar-world hypothesis can be too restrictive. In this paper, a new approach is presented for matching correspondences based on the use of nonlinear manifold learning and system dynamics identification. The proposed approach does not require similar views, calibration, or geometric assumptions about the 3D environment, and is robust to noise and occlusion. Experimental results demonstrate the use of this approach to generate and predict views in cases where identity labels become ambiguous.
DOI: https://doi.org/10.1109/CVPR.2006.189 (published 2006-06-17)
Citations: 54
Correlated Label Propagation with Application to Multi-label Learning
Feng Kang, Rong Jin, R. Sukthankar
Many computer vision applications, such as scene analysis and medical image interpretation, are ill-suited for traditional classification where each image can only be associated with a single class. This has stimulated recent work in multi-label learning where a given image can be tagged with multiple class labels. A serious problem with existing approaches is that they are unable to exploit correlations between class labels. This paper presents a novel framework for multi-label learning termed Correlated Label Propagation (CLP) that explicitly models interactions between labels in an efficient manner. As in standard label propagation, labels attached to training data points are propagated to test data points; however, unlike standard algorithms that treat each label independently, CLP simultaneously co-propagates multiple labels. Existing work eschews such an approach since naive algorithms for label co-propagation are intractable. We present an algorithm based on properties of submodular functions that efficiently finds an optimal solution. Our experiments demonstrate that CLP leads to significant gains in precision/recall against standard techniques on two real-world computer vision tasks involving several hundred labels.
DOI: 10.1109/CVPR.2006.90 (https://doi.org/10.1109/CVPR.2006.90). Published 2006-06-17.
Citations: 247
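The Correlated Label Propagation abstract contrasts its method with standard label propagation, which scores every label independently. As a rough illustration of that baseline only (not the authors' CLP algorithm, and not their submodular optimization), the following minimal sketch scores each label for a test point by a similarity-weighted vote over training points. The function name `propagate_labels`, the Gaussian similarity, and the `bandwidth` parameter are assumptions made for this example and do not come from the paper.

```python
import math


def propagate_labels(train_x, train_y, test_x, bandwidth=1.0):
    """Score each label for one test point, treating labels independently.

    train_x: list of feature vectors (lists of floats)
    train_y: list of label sets, one per training point
    test_x:  feature vector of the test point
    Returns a dict mapping each label to a score in [0, 1].
    """
    # Gaussian similarity between the test point and every training point
    # (an assumed choice of kernel for this sketch).
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, test_x))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)

    labels = set().union(*train_y)
    if total == 0.0:
        # All similarities underflowed to zero: no evidence for any label.
        return {label: 0.0 for label in labels}

    # Each label is scored on its own, so correlations between labels
    # (the gap the abstract says CLP addresses) are ignored here.
    scores = {}
    for label in labels:
        mass = sum(w for w, ys in zip(weights, train_y) if label in ys)
        scores[label] = mass / total
    return scores


# Hypothetical toy data: two nearby "beach" images and one distant "city" image.
train_x = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
train_y = [{"beach", "sea"}, {"beach"}, {"city"}]
scores = propagate_labels(train_x, train_y, [0.0, 0.1])
```

Because each label is scored separately, nothing in this baseline can learn that "beach" and "sea" tend to occur together; that independence is the limitation the abstract highlights.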
Depth from Familiar Objects: A Hierarchical Model for 3D Scenes
Erik B. Sudderth, A. Torralba, W. Freeman, A. Willsky
We develop an integrated, probabilistic model for the appearance and three-dimensional geometry of cluttered scenes. Object categories are modeled via distributions over the 3D location and appearance of visual features. Uncertainty in the number of object instances depicted in a particular image is then achieved via a transformed Dirichlet process. In contrast with image-based approaches to object recognition, we model scale variations as the perspective projection of objects in different 3D poses. To calibrate the underlying geometry, we incorporate binocular stereo images into the training process. A robust likelihood model accounts for outliers in matched stereo features, allowing effective learning of 3D object structure from partial 2D segmentations. Applied to a dataset of office scenes, our model detects objects at multiple scales via a coarse reconstruction of the corresponding 3D geometry.
DOI: 10.1109/CVPR.2006.97 (https://doi.org/10.1109/CVPR.2006.97). Published 2006-06-17.
Citations: 77
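The "Depth from Familiar Objects" abstract describes modeling scale variation as the perspective projection of objects at different 3D poses. The geometric relation behind that idea can be sketched with the textbook pinhole camera model, in which image extent shrinks in proportion to depth. This is only an illustration of that relation under an assumed fronto-parallel object; it is not the paper's hierarchical model, and the function names and parameters below are hypothetical.

```python
def projected_size(object_size, depth, focal_length):
    """Image extent (pixels) of an object under a pinhole camera.

    object_size:  physical extent of the object (e.g. metres)
    depth:        distance from the camera along the optical axis (same unit)
    focal_length: focal length expressed in pixels
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return focal_length * object_size / depth


def depth_from_size(object_size, image_size, focal_length):
    """Invert the relation: a known physical size plus an observed image size gives depth."""
    if image_size <= 0:
        raise ValueError("image_size must be positive")
    return focal_length * object_size / image_size


# Hypothetical numbers: a 0.5 m wide monitor seen 2 m away by a camera
# with an 800-pixel focal length.
width_px = projected_size(0.5, 2.0, 800.0)
estimated_depth = depth_from_size(0.5, width_px, 800.0)
```

Under this relation, doubling the depth halves the projected size, which is why an object of familiar physical size constrains its own distance from the camera.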