In this paper we present a generative model and learning procedure for unsupervised video clustering into scenes. The work addresses two important problems: realistic modeling of the sources of variability in the video and fast transformation-invariant frame clustering. We suggest a solution to the problem of computationally intensive learning in this model by combining recursive model estimation, fast inference, and on-line learning, thus achieving real-time frame clustering performance. Novel aspects of this method include an algorithm for the clustering of Gaussian mixtures and the fast computation of the KL divergence between two mixtures of Gaussians. The efficiency and performance of the clustering and KL approximation methods are demonstrated. We also present a novel video browsing tool based on the visualization of the variables in the generative model.
{"title":"Recursive estimation of generative models of video","authors":"Nemanja Petrović, A. Ivanovic, N. Jojic","doi":"10.1109/CVPR.2006.248","DOIUrl":"https://doi.org/10.1109/CVPR.2006.248","url":null,"abstract":"In this paper we present a generative model and learning procedure for unsupervised video clustering into scenes. The work addresses two important problems: realistic modeling of the sources of variability in the video and fast transformation invariant frame clustering. We suggest a solution to the problem of computationally intensive learning in this model by combining the recursive model estimation, fast inference, and on-line learning. Thus, we achieve real time frame clustering performance. Novel aspects of this method include an algorithm for the clustering of Gaussian mixtures, and the fast computation of the KL divergence between two mixtures of Gaussians. The efficiency and the performance of clustering and KL approximation methods are demonstrated. We also present novel video browsing tool based on the visualization of the variables in the generative model.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124121065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The wide availability of GPS sensors is changing the landscape in the applications of structure from motion techniques for localization. In this paper, we study the problem of estimating camera orientations from multiple views, given the positions of the viewpoints in a world coordinate system and a set of point correspondences across the views. Given three or more views, the above problem has a finite number of solutions for three or more point correspondences. Given six or more views, the problem has a finite number of solutions for just two or more points. In the three-view case, we show the necessary and sufficient conditions for the three essential matrices to be consistent with a set of known baselines. We also introduce a method to recover the absolute orientations of three views in world coordinates from their essential matrices. To refine these estimates we perform a least-squares minimization on the product group SO(3) × SO(3) × SO(3). We report experiments on synthetic data and on data from the ICCV 2005 Computer Vision Contest.
{"title":"Structure from Motion with Known Camera Positions","authors":"R. Carceroni, Ankita Kumar, Kostas Daniilidis","doi":"10.1109/CVPR.2006.296","DOIUrl":"https://doi.org/10.1109/CVPR.2006.296","url":null,"abstract":"The wide availability of GPS sensors is changing the landscape in the applications of structure from motion techniques for localization. In this paper, we study the problem of estimating camera orientations from multiple views, given the positions of the viewpoints in a world coordinate system and a set of point correspondences across the views. Given three or more views, the above problem has a finite number of solutions for three or more point correspondences. Given six or more views, the problem has a finite number of solutions for just two or more points. In the three-view case, we show the necessary and sufficient conditions for the three essential matrices to be consistent with a set of known baselines. We also introduce a method to recover the absolute orientations of three views in world coordinates from their essential matrices. To refine these estimates we perform a least-squares minimization on the group cross product SO(3) × SO(3) × SO(3). We report experiments on synthetic data and on data from the ICCV2005 Computer Vision Contest.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127917884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fan Zhang, Y. Yoo, Yongmin Kim, Lichen Zhang, L. M. Koh
A new noise reduction and edge enhancement method, the Laplacian pyramid-based nonlinear diffusion and shock filter (LPNDSF), is proposed for medical ultrasound imaging. In the proposed LPNDSF, a coupled nonlinear diffusion and shock filter process is applied in the Laplacian pyramid domain of an image to remove speckle and enhance edges simultaneously. The performance of the proposed method was evaluated on a phantom and a real ultrasound image. In the phantom study, we obtained an average gain of 0.55 and 1.11 in contrast-to-noise ratio compared to speckle reducing anisotropic diffusion (SRAD) and nonlinear coherent diffusion (NCD), respectively. The proposed LPNDSF also showed clearer boundaries on both the phantom and the real ultrasound image. These preliminary results indicate that the proposed LPNDSF can effectively reduce speckle noise while enhancing image edges and retaining subtle features.
{"title":"Multiscale Nonlinear Diffusion and Shock Filter for Ultrasound Image Enhancement","authors":"Fan Zhang, Y. Yoo, Yongmin Kim, Lichen Zhang, L. M. Koh","doi":"10.1109/CVPR.2006.203","DOIUrl":"https://doi.org/10.1109/CVPR.2006.203","url":null,"abstract":"A new noise reduction and edge enhancement method, i.e., Laplacian pyramid-based nonlinear diffusion and shock filter (LPNDSF), is proposed for medical ultrasound imaging. In the proposed LPNDSF, a coupled nonlinear diffusion and shock filter process is applied in Laplacian pyramid domain of an image, to remove speckle and enhance edges simultaneously. The performance of the proposed method was evaluated on a phantom and a real ultrasound image. In the phantom study, we obtained an average gain of 0.55 and 1.11 in contrast-to-noise ratio compared to the speckle reducing anisotropic diffusion (SRAD) and nonlinear coherent diffusion (NCD), respectively. Also, the proposed LPNDSF showed clearer boundaries on the phantom and the real ultrasound image. These preliminary results indicate that the proposed LPNDSF can effectively reduce speckle noise while enhancing image edges for retaining subtle features.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128414186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shrinkage is a well-known and appealing denoising technique. Shrinkage is known to be optimal for Gaussian white noise, provided that sparsity of the signal’s representation is enforced using a unitary transform. Still, shrinkage is also practiced successfully with non-unitary, and even redundant, representations. In this paper we shed some light on this behavior. We show that simple shrinkage can be interpreted as the first iteration of an algorithm that solves the basis pursuit denoising (BPDN) problem. This observation leads to a novel iterative shrinkage algorithm that can be considered an effective pursuit method. We demonstrate this algorithm both on synthetic data and on the image denoising problem, where we learn the image prior parameters directly from the given image. The results in both cases are superior to several popular alternatives.
{"title":"Image Denoising with Shrinkage and Redundant Representations","authors":"Michael Elad, Boaz Matalon, M. Zibulevsky","doi":"10.1109/CVPR.2006.143","DOIUrl":"https://doi.org/10.1109/CVPR.2006.143","url":null,"abstract":"Shrinkage is a well known and appealing denoising technique. The use of shrinkage is known to be optimal for Gaussian white noise, provided that the sparsity on the signal’s representation is enforced using a unitary transform. Still, shrinkage is also practiced successfully with nonunitary, and even redundant representations. In this paper we shed some light on this behavior. We show that simple shrinkage could be interpreted as the first iteration of an algorithm that solves the basis pursuit denoising (BPDN) problem. Thus, this work leads to a novel iterative shrinkage algorithm that can be considered as an effective pursuit method. We demonstrate this algorithm, both on synthetic data, and for the image denoising problem, where we learn the image prior parameters directly from the given image. The results in both cases are superior to several popular alternatives.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128898530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A face recognition system based on a single classifier that considers only restricted information cannot guarantee general, superior performance in real situations. To address this problem, we propose hybrid Fourier features extracted from different frequency bands, together with multiple face models. The hybrid Fourier feature draws on three different Fourier domains: the merged real and imaginary components, the Fourier spectrum, and the phase angle. When deriving Fourier features from the three domains, we define three different frequency bandwidths so that additional complementary features can be obtained. The features are then individually classified by Linear Discriminant Analysis. This approach makes it possible to analyze a face image from various viewpoints in order to recognize identities. Moreover, we propose multiple face models based on different eye positions within the same image size, which further increases the performance of the proposed system. We evaluated the proposed system using the Face Recognition Grand Challenge (FRGC) experimental protocols, known as the largest data sets available. Experimental results on the FRGC version 2.0 data sets show that the proposed method achieves better verification rates than the FRGC baseline on 2D frontal face images under various conditions, such as illumination changes, expression changes, and time lapse.
{"title":"Multiple Face Model of Hybrid Fourier Feature for Large Face Image Set","authors":"Wonjun Hwang, Gyu-tae Park, Jongha Lee, S. Kee","doi":"10.1109/CVPR.2006.201","DOIUrl":"https://doi.org/10.1109/CVPR.2006.201","url":null,"abstract":"The face recognition system based on the only single classifier considering the restricted information can not guarantee the generality and superiority of performances in a real situation. To challenge such problems, we propose the hybrid Fourier features extracted from different frequency bands and multiple face models. The hybrid Fourier feature comprises three different Fourier domains; merged real and imaginary components, Fourier spectrum and phase angle. When deriving Fourier features from three Fourier domains, we define three different frequency bandwidths, so that additional complementary features can be obtained. After this, they are individually classified by Linear Discriminant Analysis. This approach makes possible analyzing a face image from the various viewpoints to recognize identities. Moreover, we propose multiple face models based on different eye positions with a same image size, and it contributes to increasing the performance of the proposed system. We evaluated this proposed system using the Face Recognition Grand Challenge (FRGC) experimental protocols known as the largest data sets available. 
Experimental results on FRGC version 2.0 data sets has proven that the proposed method shows better verification rates than the baseline of FRGC on 2D frontal face images under various situations such as illumination changes, expression changes, and time elapses.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130891878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
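The three Fourier domains named in the abstract (real/imaginary components, spectrum, phase angle) can all be read off a standard 2-D FFT. The band selection below is an illustrative choice for the sketch, not the paper's actual bandwidth definitions:

```python
import numpy as np

def fourier_features(img, band):
    """Concatenate real parts, imaginary parts, magnitudes, and phase
    angles of the 2-D DFT over a (2*band x 2*band) low-frequency block
    around the centre of the shifted spectrum (illustrative band choice)."""
    F = np.fft.fftshift(np.fft.fft2(img))
    cy, cx = np.array(F.shape) // 2
    B = F[cy - band:cy + band, cx - band:cx + band].ravel()
    return np.concatenate([B.real, B.imag, np.abs(B), np.angle(B)])
```

In the paper's pipeline, feature vectors like this (computed over three bandwidths and multiple face crops) would each be fed to their own LDA classifier.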
J. E. Solem, N. C. Overgaard, Markus Persson, A. Heyden
In this paper we consider region-based variational segmentation of two- and three-dimensional images by the minimization of functionals whose fidelity term is the quotient of two integrals. Users often refrain from quotient functionals, even when they seem to be the most natural choice, probably because the corresponding gradient descent PDEs are nonlocal and hence require the computation of global properties. Here it is shown how this problem may be overcome by employing the structure of the Euler-Lagrange equation of the fidelity term to construct a good initialization for the gradient descent PDE, which will then converge rapidly to the desired (local) minimum. The initializer is found by making a one-dimensional search among the level sets of a function related to the fidelity term, picking the level set which minimizes the segmentation functional. This partial extremal initialization is tested on a medical segmentation problem with velocity and intensity data from MR images. In this particular application, the partial extremal initialization speeds up the segmentation by two orders of magnitude compared to straightforward gradient descent.
{"title":"Fast Variational Segmentation using Partial Extremal Initialization","authors":"J. E. Solem, N. C. Overgaard, Markus Persson, A. Heyden","doi":"10.1109/CVPR.2006.120","DOIUrl":"https://doi.org/10.1109/CVPR.2006.120","url":null,"abstract":"In this paper we consider region-based variational segmentation of two- and three-dimensional images by the minimization of functionals whose fidelity term is the quotient of two integrals. Users often refrain from quotient functionals, even when they seem to be the most natural choice, probably because the corresponding gradient descent PDEs are nonlocal and hence require the computation of global properties. Here it is shown how this problem may be overcome by employing the structure of the Euler-Lagrange equation of the fidelity term to construct a good initialization for the gradient descent PDE, which will then converge rapidly to the desired (local) minimum. The initializer is found by making a one-dimensional search among the level sets of a function related to the fidelity term, picking the level set which minimizes the segmentation functional. This partial extremal initialization is tested on a medical segmentation problem with velocity- and intensity data from MR images. 
In this particular application, the partial extremal initialization speeds up the segmentation by two orders of magnitude compared to straight forward gradient descent.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131021735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
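The one-dimensional search among level sets can be sketched generically: sample thresholds of the function, evaluate the segmentation energy on each candidate region, and keep the minimizer. The energy callback here is a placeholder where the paper's quotient fidelity term would go:

```python
import numpy as np

def best_level_set(phi, energy, n_levels=64):
    """1-D search over the level sets {phi > t}: evaluate the segmentation
    energy on each candidate region and return the minimising threshold."""
    levels = np.linspace(phi.min(), phi.max(), n_levels + 2)[1:-1]
    best_t, best_e = None, np.inf
    for t in levels:
        region = phi > t
        if region.any() and not region.all():   # skip degenerate regions
            e = energy(region)
            if e < best_e:
                best_t, best_e = t, e
    return best_t, best_e

# Toy usage: a bright square on a dark background, with a simple
# "maximise mean intensity inside" energy (illustrative, not the paper's).
img = np.zeros((32, 32))
img[8:16, 8:16] = 1.0
t, e = best_level_set(img, lambda r: -img[r].mean())
```

The point of the paper is that this cheap search lands close enough to a minimum that the subsequent nonlocal gradient descent converges in very few steps.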
In this paper we propose diffusion distance, a new dissimilarity measure between histogram-based descriptors. We define the difference between two histograms to be a temperature field. We then study the relationship between histogram similarity and a diffusion process, showing how diffusion handles deformation as well as quantization effects. As a result, the diffusion distance is derived as the sum of dissimilarities over scales. Being a cross-bin histogram distance, the diffusion distance is robust to deformation, lighting change, and noise in histogram-based local descriptors. In addition, it enjoys linear computational complexity, which significantly improves on previously proposed cross-bin distances of quadratic or higher complexity. We tested the proposed approach on both shape recognition and interest point matching tasks using several multi-dimensional histogram-based descriptors including shape context, SIFT, and spin images. In all experiments, the diffusion distance performs excellently in both accuracy and efficiency in comparison with other state-of-the-art distance measures. In particular, it performs as accurately as the Earth Mover’s Distance with much greater efficiency.
{"title":"Diffusion Distance for Histogram Comparison","authors":"Haibin Ling, K. Okada","doi":"10.1109/CVPR.2006.99","DOIUrl":"https://doi.org/10.1109/CVPR.2006.99","url":null,"abstract":"In this paper we propose diffusion distance, a new dissimilarity measure between histogram-based descriptors. We define the difference between two histograms to be a temperature field. We then study the relationship between histogram similarity and a diffusion process, showing how diffusion handles deformation as well as quantization effects. As a result, the diffusion distance is derived as the sum of dissimilarities over scales. Being a cross-bin histogram distance, the diffusion distance is robust to deformation, lighting change and noise in histogram-based local descriptors. In addition, it enjoys linear computational complexity which significantly improves previously proposed cross-bin distances with quadratic complexity or higher. We tested the proposed approach on both shape recognition and interest point matching tasks using several multi-dimensional histogram-based descriptors including shape context, SIFT, and spin images. In all experiments, the diffusion distance performs excellently in both accuracy and efficiency in comparison with other state-of-the-art distance measures. 
In particular, it performs as accurately as the Earth Mover’s Distance with much greater efficiency.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129278327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
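The "sum of dissimilarities over scales" idea is easy to sketch for 1-D histograms: diffuse the difference histogram over a Gaussian pyramid, downsampling at each level, and accumulate L1 norms. The smoothing kernel and pyramid depth below are illustrative choices, not the paper's exact parameters:

```python
import numpy as np

def diffusion_distance(h1, h2, kernel=(0.25, 0.5, 0.25), max_levels=8):
    """Cross-bin distance: L1 norm of the difference histogram summed
    over a smooth-and-downsample pyramid (one diffusion step per scale)."""
    d = np.asarray(h1, float) - np.asarray(h2, float)
    k = np.asarray(kernel)
    total = np.abs(d).sum()
    for _ in range(max_levels):
        if d.size < 3:
            break
        d = np.convolve(d, k, mode="same")[::2]   # diffuse, then subsample
        total += np.abs(d).sum()
    return total
```

Because nearby mass cancels under diffusion, a histogram shifted by one bin scores much closer than one shifted by many bins, which is the cross-bin behaviour a plain bin-to-bin L1 distance lacks.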
This paper describes a general methodology for automated recognition of complex human activities. The methodology uses a context-free grammar (CFG) based representation scheme to represent composite actions and interactions. The CFG-based representation enables us to formally define complex human activities based on simple actions or movements. Human activities are classified into three categories: atomic action, composite action, and interaction. Our system is not only able to represent complex human activities formally, but also able to recognize represented actions and interactions with high accuracy. Image sequences are processed to extract poses and gestures. Based on gestures, the system detects actions and interactions occurring in a sequence of image frames. Our results show that the system is able to represent composite actions and interactions naturally. The system was tested to represent and recognize eight types of interactions: approach, depart, point, shake-hands, hug, punch, kick, and push. The experiments show that the system can recognize sequences of represented composite actions and interactions with a high recognition rate.
{"title":"Recognition of Composite Human Activities through Context-Free Grammar Based Representation","authors":"M. Ryoo, J. Aggarwal","doi":"10.1109/CVPR.2006.242","DOIUrl":"https://doi.org/10.1109/CVPR.2006.242","url":null,"abstract":"This paper describes a general methodology for automated recognition of complex human activities. The methodology uses a context-free grammar (CFG) based representation scheme to represent composite actions and interactions. The CFG-based representation enables us to formally define complex human activities based on simple actions or movements. Human activities are classified into three categories: atomic action, composite action, and interaction. Our system is not only able to represent complex human activities formally, but also able to recognize represented actions and interactions with high accuracy. Image sequences are processed to extract poses and gestures. Based on gestures, the system detects actions and interactions occurring in a sequence of image frames. Our results show that the system is able to represent composite actions and interactions naturally. The system was tested to represent and recognize eight types of interactions: approach, depart, point, shake-hands, hug, punch, kick, and push. 
The experiments show that the system can recognize sequences of represented composite actions and interactions with a high recognition rate.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124417063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
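The CFG idea can be mimicked with a toy grammar in which composite actions are productions over atomic actions and recognition is recursive derivation matching. The grammar and action names below are invented for illustration; they are not the paper's actual productions:

```python
# Composite actions map to lists of alternative productions; any symbol
# not in the grammar is treated as an atomic (terminal) action.
GRAMMAR = {
    "shake-hands": [["approach", "stretch-arm", "hold", "withdraw-arm"]],
    "greet":       [["approach", "shake-hands"], ["approach", "wave"]],
}

def matches(symbol, seq):
    """Return the set of prefix lengths of seq that symbol can derive."""
    if symbol not in GRAMMAR:                      # terminal: match one token
        return {1} if seq and seq[0] == symbol else set()
    ends = set()
    for production in GRAMMAR[symbol]:
        starts = {0}                               # positions reached so far
        for sub in production:
            starts = {s + n for s in starts for n in matches(sub, seq[s:])}
        ends |= starts
    return ends

def recognize(symbol, seq):
    """True if the whole observed atomic-action sequence derives symbol."""
    return len(seq) in matches(symbol, seq)
```

In the full system the terminals would come from the pose and gesture detectors rather than being given symbolically.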
We present a tunable representation for tracking that simultaneously encodes appearance and geometry in a manner that enables the use of mean-shift iterations for tracking. The classic formulation of the tracking problem using mean-shift iterations encodes spatial information very loosely (i.e. using radially symmetric kernels). A problem with such a formulation is that it becomes easy for the tracker to get confused with other objects having the same feature distribution but different spatial configurations of features. Subsequent approaches have addressed this issue but not to the degree of generality required for tracking specific classes of objects and motions (e.g. humans walking). In this paper, we formulate the tracking problem in a manner that encodes the spatial configuration of features along with their density and yet retains robustness to spatial deformations and feature density variations. The encoding of spatial configuration is done using a set of kernels whose parameters can be optimized for a given class of objects and motions, off-line. The formulation enables the use of mean-shift iterations and runs in real-time. We demonstrate better tracking results on synthetic and real image sequences as compared to the original mean-shift tracker.
{"title":"Tunable Kernels for Tracking","authors":"Vasu Parameswaran, Visvanathan Ramesh, Imad Zoghlami","doi":"10.1109/CVPR.2006.317","DOIUrl":"https://doi.org/10.1109/CVPR.2006.317","url":null,"abstract":"We present a tunable representation for tracking that simultaneously encodes appearance and geometry in a manner that enables the use of mean-shift iterations for tracking. The classic formulation of the tracking problem using mean-shift iterations encodes spatial information very loosely (i.e. using radially symmetric kernels). A problem with such a formulation is that it becomes easy for the tracker to get confused with other objects having the same feature distribution but different spatial configurations of features. Subsequent approaches have addressed this issue but not to the degree of generality required for tracking specific classes of objects and motions (e.g. humans walking). In this paper, we formulate the tracking problem in a manner that encodes the spatial configuration of features along with their density and yet retains robustness to spatial deformations and feature density variations. The encoding of spatial configuration is done using a set of kernels whose parameters can be optimized for a given class of objects and motions, off-line. The formulation enables the use of meanshift iterations and runs in real-time. 
We demonstrate better tracking results on synthetic and real image sequences as compared to the original mean-shift tracker.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128866321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
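At the heart of the tracker is the classic mean-shift iteration: repeatedly move the estimate toward the kernel-weighted mean of the samples until it settles on a density mode. A plain Gaussian-kernel version (without the paper's tunable spatial kernels) looks like:

```python
import numpy as np

def mean_shift(points, x0, bandwidth=1.0, n_iter=50):
    """Gaussian-kernel mean-shift mode seeking from a starting point x0."""
    x = np.asarray(x0, float)
    pts = np.asarray(points, float)
    for _ in range(n_iter):
        w = np.exp(-np.sum((pts - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * pts).sum(0) / w.sum()
        if np.linalg.norm(x_new - x) < 1e-8:     # converged to a mode
            break
        x = x_new
    return x
```

The radially symmetric Gaussian here is exactly the "loose" spatial encoding the paper criticises; its contribution is to replace it with kernels tuned off-line to an object and motion class.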
Edge detection is one of the most studied problems in computer vision, yet it remains a very challenging task. It is difficult because the decision for an edge often cannot be made purely on low-level cues such as gradient; instead, we need to engage all levels of information (low, middle, and high) in order to decide where to put edges. In this paper we propose a novel supervised learning algorithm for edge and object boundary detection, which we refer to as Boosted Edge Learning, or BEL for short. A decision of an edge point is made independently at each location in the image; a very large aperture is used, providing significant context for each decision. In the learning stage, the algorithm selects and combines a large number of features across different scales in order to learn a discriminative model, using an extended version of the Probabilistic Boosting Tree classification algorithm. The learning-based framework is highly adaptive and there are no parameters to tune. We show applications for edge detection in a number of specific image domains as well as on natural images. We test on various datasets, including the Berkeley dataset, and the results obtained are very good.
{"title":"Supervised Learning of Edges and Object Boundaries","authors":"Piotr Dollár, Z. Tu, Serge J. Belongie","doi":"10.1109/CVPR.2006.298","DOIUrl":"https://doi.org/10.1109/CVPR.2006.298","url":null,"abstract":"Edge detection is one of the most studied problems in computer vision, yet it remains a very challenging task. It is difficult since often the decision for an edge cannot be made purely based on low level cues such as gradient, instead we need to engage all levels of information, low, middle, and high, in order to decide where to put edges. In this paper we propose a novel supervised learning algorithm for edge and object boundary detection which we refer to as Boosted Edge Learning or BEL for short. A decision of an edge point is made independently at each location in the image; a very large aperture is used providing significant context for each decision. In the learning stage, the algorithm selects and combines a large number of features across different scales in order to learn a discriminative model using an extended version of the Probabilistic Boosting Tree classification algorithm. The learning based framework is highly adaptive and there are no parameters to tune. We show applications for edge detection in a number of specific image domains as well as on natural images. 
We test on various datasets including the Berkeley dataset and the results obtained are very good.","PeriodicalId":421737,"journal":{"name":"2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127558247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
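BEL's learning stage boosts weak learners over a large pool of context features. The boosting idea itself can be illustrated with plain AdaBoost over threshold stumps (a deliberately minimal stand-in; the paper uses a probabilistic boosting tree over far richer multi-scale features):

```python
import numpy as np

def train_stumps(X, y, n_rounds=10):
    """AdaBoost with threshold stumps on individual features.
    X: (n, d) features; y: labels in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                       # sample weights
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                        # exhaustive stump search
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] > t, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        err = max(err, 1e-12)                     # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)
        pred = s * np.where(X[:, j] > t, 1, -1)
        w *= np.exp(-alpha * y * pred)            # re-weight misclassified points
        w /= w.sum()
        model.append((alpha, j, t, s))
    return model

def predict(model, X):
    score = sum(a * s * np.where(X[:, j] > t, 1, -1) for a, j, t, s in model)
    return np.where(score > 0, 1, -1)
```

In BEL the role of the per-pixel feature vector X would be played by responses computed over the large aperture around each candidate edge location.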