
2007 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Patch-based Image Correlation with Rapid Filtering
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383373
G. Guo, C. Dyer
This paper describes a patch-based approach for rapid image correlation or template matching. By representing a template image with an ensemble of patches, the method is robust with respect to variations such as local appearance variation, partial occlusion, and scale changes. Rectangle filters are applied to each image patch for fast filtering based on the integral image representation. A new method is developed for feature dimension reduction by detecting the "salient" image structures given a single image. Experiments on a variety of images show the success of the method in dealing with different variations in the test images. In terms of computation time, the approach is faster than traditional methods by up to two orders of magnitude and is at least three times faster than a fast implementation of normalized cross correlation.
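The fast-filtering step above rests on the standard summed-area (integral image) trick: after one pass over the image, the sum inside any axis-aligned rectangle costs at most four table lookups. A minimal NumPy sketch of that primitive (function names are ours, for illustration only):

```python
import numpy as np

def integral_image(img):
    # Summed-area table: ii[r, c] = sum of img[:r+1, :c+1].
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1, c0:c1] in O(1) via inclusion-exclusion
    # on four integral-image entries.
    s = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        s -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        s -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        s += ii[r0 - 1, c0 - 1]
    return s

img = np.arange(30.0).reshape(5, 6)
ii = integral_image(img)
```

Any rectangle filter response over a patch then reduces to a handful of `rect_sum` calls, which is what makes the per-patch filtering rapid.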
Citations: 38
Sensor noise modeling using the Skellam distribution: Application to the color edge detection
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383004
Youngbae Hwang, Jun-Sik Kim, In-So Kweon
In this paper, we introduce the Skellam distribution as a sensor noise model for CCD or CMOS cameras. This is derived from the Poisson distribution of photons that determine the sensor response. We show that the Skellam distribution can be used to measure the intensity difference of pixels in the spatial domain, as well as in the temporal domain. In addition, we show that Skellam parameters are linearly related to the intensity of the pixels. This property means that the brighter pixels tolerate greater variation of intensity than the darker pixels. This enables us to decide automatically whether two pixels have different colors. We apply this modeling to detect the edges in color images. The resulting algorithm requires only a confidence interval for a hypothesis test, because it uses the distribution of image noise directly. More importantly, we demonstrate that without conventional Gaussian smoothing the noise model-based approach can automatically extract the fine details of image structures, such as edges and corners, independent of camera setting.
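The key fact behind this model is that the difference of two independent Poisson counts follows a Skellam distribution, with mean mu1 - mu2 and variance mu1 + mu2; this is what ties the noise parameters linearly to pixel intensity. A quick simulation of that pixel-difference model (the parameter values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2 = 12.0, 9.0   # Poisson rates of the two photon counts
n = 200_000

# Each observation is a difference of two independent Poisson counts,
# i.e. a Skellam(mu1, mu2) sample.
diff = rng.poisson(mu1, n) - rng.poisson(mu2, n)

# Skellam(mu1, mu2): E[d] = mu1 - mu2, Var[d] = mu1 + mu2.
mean_est, var_est = diff.mean(), diff.var()
```

The empirical mean and variance should land near 3 and 21 respectively, matching the closed-form moments.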
Citations: 53
Pose and Illumination Invariant Face Recognition in Video
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383376
Yilei Xu, A. Roy-Chowdhury, Keyur Patel
Face recognition from video sequences has been studied considerably less than image-based approaches. In this paper, we present a framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions. Our method is based on a recently obtained theoretical result that can integrate the effects of motion, lighting and shape in generating an image using a perspective camera. This result can be used to estimate the pose and illumination conditions for each frame of the probe sequence. Then, using a 3D face model, we synthesize images corresponding to the pose and illumination conditions estimated in the probe sequences. Similarity between the synthesized images and the probe video is computed by integrating over the entire sequence. The method can handle situations where the pose and lighting conditions in the training and testing data are completely disjoint.
Citations: 20
Discriminative Cluster Refinement: Improving Object Category Recognition Given Limited Training Data
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383270
Liu Yang, Rong Jin, C. Pantofaru, R. Sukthankar
A popular approach to problems in image classification is to represent the image as a bag of visual words and then employ a classifier to categorize the image. Unfortunately, a significant shortcoming of this approach is that the clustering and classification are disconnected. Since the clustering into visual words is unsupervised, the representation does not necessarily capture the aspects of the data that are most useful for classification. More seriously, the semantic relationship between clusters is lost, causing the overall classification performance to suffer. We introduce "discriminative cluster refinement" (DCR), a method that explicitly models the pairwise relationships between different visual words by exploiting their co-occurrence information. The assigned class labels are used to identify the co-occurrence patterns that are most informative for object classification. DCR employs a maximum-margin approach to generate an optimal kernel matrix for classification. One important benefit of DCR is that it integrates smoothly into existing bag-of-words information retrieval systems by employing the set of visual words generated by any clustering method. While DCR could improve a broad class of information retrieval systems, this paper focuses on object category recognition. We present a direct comparison with a state-of-the-art method on the PASCAL 2006 database and show that cluster refinement results in a significant improvement in classification accuracy given a small number of training examples.
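The co-occurrence statistics that DCR builds on can be gathered directly from the hard word assignments produced by any clusterer; here is a hedged sketch of just that counting step (the max-margin kernel refinement itself is beyond a few lines, and the function name is ours):

```python
import numpy as np

def word_cooccurrence(word_ids_per_image, vocab_size):
    """Count, for each pair of visual words, how many images contain both.
    word_ids_per_image: list of integer arrays of per-feature word labels."""
    C = np.zeros((vocab_size, vocab_size), dtype=int)
    for ids in word_ids_per_image:
        present = np.unique(ids)          # words occurring in this image
        C[np.ix_(present, present)] += 1  # increment every co-occurring pair
    return C

# Two toy "images": word 1 co-occurs with word 0 once and with word 2 once.
images = [np.array([0, 1, 1]), np.array([1, 2])]
C = word_cooccurrence(images, 3)
```

The diagonal of `C` is simply each word's document frequency, and the off-diagonal entries are the pairwise statistics a refinement step could reweight.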
Citations: 18
On the Performance Prediction and Validation for Multisensor Fusion
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383112
Rong Wang, B. Bhanu
Multiple sensors are commonly fused to improve the detection and recognition performance of computer vision and pattern recognition systems. The traditional approach to determine the optimal sensor combination is to try all possible sensor combinations by performing exhaustive experiments. In this paper, we present a theoretical approach that predicts the performance of sensor fusion that allows us to select the optimal combination. We start with the characteristics of each sensor by computing the match score and non-match score distributions of objects to be recognized. These distributions are modeled as a mixture of Gaussians. Then, we use an explicit Phi transformation that maps a receiver operating characteristic (ROC) curve to a straight line in 2-D space whose axes are related to the false alarm rate (FAR) and the Hit rate (Hit). Finally, using this representation, we derive a set of metrics to evaluate the sensor fusion performance and find the optimal sensor combination. We verify our prediction approach on the publicly available XM2VTS database as well as other databases.
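The Phi transformation mentioned above is the probit (inverse normal CDF) mapping: for Gaussian score models, plotting Phi^-1(Hit) against Phi^-1(FAR) turns the ROC curve into a straight line whose slope and intercept encode the score distributions. A small sketch with single-Gaussian match/non-match models (parameters are illustrative):

```python
import numpy as np
from scipy.stats import norm

# Illustrative Gaussian score models (one component each, not the paper's fits).
mu_m, s_m = 2.0, 1.0   # match scores
mu_n, s_n = 0.0, 1.0   # non-match scores

taus = np.linspace(-2.0, 4.0, 40)   # decision thresholds
far = norm.sf(taus, mu_n, s_n)      # false alarm rate at each threshold
hit = norm.sf(taus, mu_m, s_m)      # hit rate at each threshold

# Probit transform straightens the ROC:
#   norm.ppf(hit) = (s_n / s_m) * norm.ppf(far) + (mu_m - mu_n) / s_m
x, y = norm.ppf(far), norm.ppf(hit)
slope, intercept = np.polyfit(x, y, 1)
```

With equal variances the line has slope 1 and intercept (mu_m - mu_n)/s_m = 2, so fusion metrics can be derived from line parameters instead of full ROC curves.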
Citations: 1
Adaptive Distance Metric Learning for Clustering
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383103
Jieping Ye, Zheng Zhao, Huan Liu
A good distance metric is crucial for unsupervised learning from high-dimensional data. To learn a metric without any constraint or class label information, most unsupervised metric learning algorithms appeal to projecting observed data onto a low-dimensional manifold, where geometric relationships such as local or global pairwise distances are preserved. However, the projection may not necessarily improve the separability of the data, which is the desirable outcome of clustering. In this paper, we propose a novel unsupervised adaptive metric learning algorithm, called AML, which performs clustering and distance metric learning simultaneously. AML projects the data onto a low-dimensional manifold, where the separability of the data is maximized. We show that the joint clustering and distance metric learning can be formulated as a trace maximization problem, which can be solved via an iterative procedure in the EM framework. Experimental results on a collection of benchmark data sets demonstrated the effectiveness of the proposed algorithm.
Citations: 119
Human Detection via Classification on Riemannian Manifolds
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383197
Oncel Tuzel, F. Porikli, P. Meer
We present a new algorithm to detect humans in still images utilizing covariance matrices as object descriptors. Since these descriptors do not lie on a vector space, well-known machine learning techniques are not adequate to learn the classifiers. The space of d-dimensional nonsingular covariance matrices can be represented as a connected Riemannian manifold. We present a novel approach for classifying points lying on a Riemannian manifold by incorporating the a priori information about the geometry of the space. The algorithm is tested on the INRIA human database, where superior detection rates are observed compared with previous approaches.
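Two ingredients can be sketched compactly: a region covariance descriptor over per-pixel features, and the affine-invariant Riemannian distance between covariance matrices, d(A, B) = sqrt(sum_i log^2 lambda_i) with lambda_i the generalized eigenvalues of (A, B). The feature choice below, (x, y, I, |Ix|, |Iy|), is one common option used here purely for illustration:

```python
import numpy as np
from scipy.linalg import eigh

def region_covariance(patch):
    """5x5 covariance descriptor of a grayscale patch; per-pixel features
    are (x, y, I, |Ix|, |Iy|) -- an illustrative, commonly used choice."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    Iy, Ix = np.gradient(patch.astype(float))
    F = np.stack([xs.ravel(), ys.ravel(), patch.astype(float).ravel(),
                  np.abs(Ix).ravel(), np.abs(Iy).ravel()])
    return np.cov(F)

def riemannian_distance(A, B):
    # Affine-invariant metric on SPD matrices:
    # d(A, B) = sqrt(sum_i log^2 lambda_i(A, B)), generalized eigenvalues.
    lam = eigh(A, B, eigvals_only=True)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

For example, scaling a covariance matrix by 2 moves it a distance of sqrt(d) * log 2 from the original in a d-dimensional SPD space, independent of the matrix itself, which is the kind of invariance a manifold-aware classifier exploits.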
Citations: 543
On Constant Focal Length Self-Calibration From Multiple Views
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383066
B. Bocquillon, A. Bartoli, Pierre Gurdjos, Alain Crouzil
We investigate the problem of finding the metric structure of a general 3D scene viewed by a moving camera with square pixels and constant unknown focal length. While the problem has a concise and well-understood formulation in the stratified framework thanks to the absolute dual quadric, two open issues remain. The first issue concerns the generic Critical Motion Sequences, i.e. camera motions for which self-calibration is ambiguous. Most of the previous work focuses on the varying focal length case. We provide a thorough study of the constant focal length case. The second issue is to solve the nonlinear set of equations in four unknowns arising from the dual quadric formulation. Most of the previous work either does local nonlinear optimization, thereby requiring an initial solution, or linearizes the problem, which introduces artificial degeneracies, most of which are likely to arise in practice. We use interval analysis to solve this problem. The resulting algorithm is guaranteed to find the solution and is not subject to artificial degeneracies. Directly using interval analysis usually results in computationally expensive algorithms. We propose a carefully chosen set of inclusion functions, making it possible to find the solution within a few seconds. Comparisons of the proposed algorithm with existing ones are reported for simulated and real data.
Citations: 13
Learning a Spatially Smooth Subspace for Face Recognition
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383054
Deng Cai, Xiaofei He, Yuxiao Hu, Jiawei Han, Thomas S. Huang
Subspace learning based face recognition methods have attracted considerable interest in recent years, including principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projection (LPP), neighborhood preserving embedding (NPE), marginal fisher analysis (MFA) and local discriminant embedding (LDE). These methods consider an n1×n2 image as a vector in R^(n1×n2), treating the pixels of each image as independent. An image represented in the plane, however, is intrinsically a matrix, and pixels that are spatially close to each other may be correlated. Even though there are n1×n2 pixels per image, this spatial correlation suggests the real number of degrees of freedom is far smaller. In this paper, we introduce a regularized subspace learning model using a Laplacian penalty to constrain the coefficients to be spatially smooth. All these existing subspace learning algorithms can fit into this model and produce a spatially smooth subspace, which is better for image representation than their original versions. Recognition, clustering and retrieval can then be performed in the image subspace. Experimental results on face recognition demonstrate the effectiveness of our method.
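A Laplacian smoothness penalty of this kind can be written as ||Delta a||^2, where a is a coefficient vector laid out on the n1 x n2 image grid and Delta a discrete 2-D Laplacian with a Kronecker-sum structure; adding lambda * ||Delta a||^2 to a subspace objective biases the learned basis toward spatially smooth filters. A small dense sketch of the penalty (helper names are ours, and the boundary handling is one simple choice among several):

```python
import numpy as np

def second_difference(n):
    # 1-D discrete Laplacian (second-difference) matrix.
    return np.eye(n, k=-1) - 2.0 * np.eye(n) + np.eye(n, k=1)

def laplacian_penalty(coeffs):
    """||Delta a||^2 for coefficients on an n1 x n2 grid, using the
    Kronecker-sum form of the 2-D Laplacian: L = D1 (+) D2."""
    n1, n2 = coeffs.shape
    L = (np.kron(second_difference(n1), np.eye(n2)) +
         np.kron(np.eye(n1), second_difference(n2)))
    a = coeffs.ravel()
    return float(a @ L.T @ L @ a)   # ||L a||^2
```

A rough (noisy) coefficient image pays a much larger penalty than a flat one, which is exactly the pressure that makes the learned basis vectors look like smooth spatial filters.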
Citations: 359
Unsupervised Learning of Image Transformations
Pub Date : 2007-06-17 DOI: 10.1109/CVPR.2007.383036
R. Memisevic, Geoffrey E. Hinton
We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.
{"title":"Unsupervised Learning of Image Transformations","authors":"R. Memisevic, Geoffrey E. Hinton","doi":"10.1109/CVPR.2007.383036","DOIUrl":"https://doi.org/10.1109/CVPR.2007.383036","url":null,"abstract":"We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.","PeriodicalId":351008,"journal":{"name":"2007 IEEE Conference on Computer Vision and Pattern Recognition","volume":"48 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132810568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 217
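The gated model in the abstract above conditions latent variables on a pair of images, so each hidden "mapping" unit responds to evidence for a particular transformation between them. A minimal inference sketch, with an assumed three-way weight tensor and names that are not the paper's notation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_mapping_units(x, y, W, b):
    """Gated inference sketch: for an image pair (x, y), mapping unit k
    pools the three-way products x_i * y_j * W[i, j, k], so it turns on
    when the pair exhibits the input-output correlation it encodes.
    The factorial latent layer means the units are independent given
    (x, y), so each probability is computed with a single sigmoid."""
    pre = np.einsum('i,j,ijk->k', x, y, W) + b
    return sigmoid(pre)              # Bernoulli on-probabilities, one per unit

# toy dimensions: 16-pixel input/output patches, 4 mapping units
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((16, 16, 4))
b = np.zeros(4)
h = infer_mapping_units(rng.standard_normal(16), rng.standard_normal(16), W, b)
```

Training (predicting transformed outputs, learning motion features from video) would fit `W` and `b` by approximate maximum likelihood; only the exact, efficient inference step claimed in the abstract is sketched here.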