
Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149): Latest Publications

Planar catadioptric stereo: geometry and calibration
J. Gluckman, S. Nayar
By using mirror reflections of a scene, stereo images can be captured with a single camera (catadioptric stereo). Single camera stereo provides both geometric and radiometric advantages over traditional two camera stereo. In this paper we discuss the geometry and calibration of catadioptric stereo with two planar mirrors and show how the relative orientation, the epipolar geometry and the estimation of the focal length are constrained by planar motion. In addition, we have implemented a real-time system which demonstrates the viability of stereo with mirrors as an alternative to traditional two camera stereo.
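To make the planar-motion constraint concrete, here is a minimal numpy sketch (not the authors' code): a planar mirror turns the real camera into a virtual camera obtained by reflecting it across the mirror plane, and composing the two reflections gives the relative pose between the two virtual cameras. The mirror normals and offsets below are made-up values; the checks at the end show that the composed motion is a proper rigid motion and that its translation is orthogonal to its rotation axis, i.e. a planar motion, which is the constraint exploited for calibration.

```python
import numpy as np

def reflection(n, d):
    """4x4 homogeneous reflection across the plane {x : n.x = d} (n is normalized)."""
    n = np.asarray(n, float)
    n = n / np.linalg.norm(n)
    D = np.eye(4)
    D[:3, :3] -= 2.0 * np.outer(n, n)
    D[:3, 3] = 2.0 * d * n
    return D

# two hypothetical mirror planes expressed in the real camera's frame
D1 = reflection([1.0, 0.0, 0.3], 0.5)
D2 = reflection([-1.0, 0.1, 0.4], 0.6)

# pose of one virtual camera relative to the other: a composition of two reflections
M = D2 @ D1
R, t = M[:3, :3], M[:3, 3]

# rotation axis = eigenvector of R with eigenvalue 1
w, v = np.linalg.eig(R)
axis = np.real(v[:, np.argmin(np.abs(w - 1.0))])

print("det(R) =", np.linalg.det(R))   # +1: a proper rigid motion
print("t . axis =", np.dot(t, axis))  # ~0: translation orthogonal to the axis, i.e. planar motion
```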
DOI: 10.1109/CVPR.1999.786912 (published 1999-06-23)
Citations: 116
A framework for learning query concepts in image classification
A. L. Ratan, O. Maron, W. Grimson, Tomas Lozano-Perez
In this paper, we adapt the Multiple Instance Learning paradigm using the Diverse Density algorithm as a way of modeling the ambiguity in images in order to learn "visual concepts" that can be used to classify new images. In this framework, a user labels an image as positive if the image contains the concept. Each example image is a bag of instances (sub-images) where only the bag is labeled, not the individual instances (sub-images). From a small collection of positive and negative examples, the system learns the concept and uses it to retrieve images that contain the concept from a large database. The learned "concepts" are simple templates that capture the color, texture and spatial properties of the class of images. We introduced this method earlier in the domain of natural scene classification using simple, low resolution sub-images as instances. In this paper, we extend the bag generator (the mechanism which takes an image and generates a set of instances) to generate more complex instances using multiple cues on segmented high resolution images. We show that this method can be used to learn certain object class concepts (e.g. cars) in addition to natural scenes.
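For readers unfamiliar with the Diverse Density criterion (due to Maron and Lozano-Pérez) used here, a small numpy sketch of the noisy-or objective follows. The bags, feature dimensionality and `scale` parameter are illustrative, not taken from the paper; in the paper the instances are sub-images described by color and texture features, and the concept point maximizing this score is typically found by gradient ascent started from instances of positive bags.

```python
import numpy as np

def diverse_density(t, positive_bags, negative_bags, scale=1.0):
    """Noisy-or Diverse Density of a candidate concept point t.
    Each bag is an (n_instances, n_features) array."""
    def p_inst(bag):
        # probability that each instance matches the concept, via a Gaussian-like kernel
        return np.exp(-scale * np.sum((bag - t) ** 2, axis=1))

    dd = 1.0
    for bag in positive_bags:
        dd *= 1.0 - np.prod(1.0 - p_inst(bag))   # at least one instance matches
    for bag in negative_bags:
        dd *= np.prod(1.0 - p_inst(bag))          # no instance matches
    return dd

# toy example: 2-D "features", true concept near (1, 1)
pos = [np.array([[1.1, 0.9], [5.0, 5.0]]), np.array([[0.9, 1.0], [3.0, -2.0]])]
neg = [np.array([[5.0, 5.0], [3.0, -2.0]])]
print(diverse_density(np.array([1.0, 1.0]), pos, neg))   # high
print(diverse_density(np.array([5.0, 5.0]), pos, neg))   # near zero
```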
DOI: 10.1109/CVPR.1999.786973 (published 1999-06-23)
Citations: 83
Visual tracking and control using Lie algebras
T. Drummond, R. Cipolla
A novel approach to visual servoing is presented, which takes advantage of the structure of the Lie algebra of affine transformations. The aim of this project is to use feedback from a visual sensor to guide a robot arm to a target position. The sensor is placed in the end effector of the robot, the 'camera-in-hand' approach, and thus provides direct feedback of the robot motion relative to the target scene via observed transformations of the scene. These scene transformations are obtained by measuring the affine deformations of a target planar contour, captured by use of an active contour, or snake. Deformations of the snake are constrained using the Lie groups of affine and projective transformations. Properties of the Lie algebra of affine transformations are exploited to integrate observed deformations to the target contour which can be compensated with appropriate robot motion using a non-linear control structure. These techniques have been implemented using a video camera to control a 5 DoF robot arm. Experiments with this implementation are presented, together with a discussion of the results.
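A minimal sketch, assumed rather than taken from the authors' implementation, of the Lie-algebra machinery the abstract refers to: six generator matrices span the algebra of 2-D affine transformations of homogeneous image coordinates, and the matrix exponential maps measured deformation coefficients back to a finite transformation of the contour. The coefficient values below are invented.

```python
import numpy as np
from scipy.linalg import expm

# generators of the Lie algebra of 2-D affine transformations
# (acting on homogeneous image coordinates [x, y, 1])
G = [
    np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]], float),   # translation in x
    np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], float),   # translation in y
    np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float),  # rotation
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]], float),   # isotropic scale
    np.array([[1, 0, 0], [0, -1, 0], [0, 0, 0]], float),  # aspect (stretch)
    np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], float),   # shear
]

def affine_from_coords(alpha):
    """Map Lie-algebra coordinates alpha (6-vector) to a group element via expm."""
    A = sum(a * Gi for a, Gi in zip(alpha, G))
    return expm(A)

# integrate small observed deformation velocities into a finite transformation
alpha = np.array([0.02, -0.01, 0.05, 0.01, 0.0, 0.003])
print(affine_from_coords(alpha))
```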
DOI: 10.1109/CVPR.1999.784996 (published 1999-06-23)
Citations: 67
Harmonic maps and their applications in surface matching
D. Zhang, M. Hebert
The surface-matching problem is investigated in this paper using a mathematical tool called harmonic maps. The theory of harmonic maps studies the mapping between different metric manifolds from the energy-minimization point of view. With the application of harmonic maps, a surface representation called harmonic shape images is generated to represent and match 3D freeform surfaces. The basic idea of harmonic shape images is to map a 3D surface patch with disc topology to a 2D domain and encode the shape information of the surface patch into the 2D image. This simplifies the surface-matching problem to a 2D image-matching problem. Due to the application of harmonic maps in generating harmonic shape images, harmonic shape images have the following advantages: they have a sound mathematical background; they preserve both the shape and continuity of the underlying surfaces; and they are robust to occlusion and independent of any specific surface sampling scheme. The performance of surface matching using harmonic maps is evaluated using real data. Preliminary results are presented in the paper.
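To make the flattening step concrete, here is a minimal sketch of a discrete harmonic-style map of a disc-topology patch onto the unit disc. It uses uniform edge weights for brevity (cotangent weights are the usual discretization of the harmonic energy) and a tiny invented mesh; it is not the authors' harmonic shape images pipeline.

```python
import numpy as np

def harmonic_param(n_vertices, edges, boundary):
    """Flatten a disc-topology patch to the unit disc by solving the discrete
    Laplace equation with the boundary pinned to a circle.  edges: iterable of
    (i, j) pairs; boundary: ordered list of boundary vertex indices."""
    uv = np.zeros((n_vertices, 2))
    angles = 2 * np.pi * np.arange(len(boundary)) / len(boundary)
    uv[boundary] = np.c_[np.cos(angles), np.sin(angles)]

    interior = [i for i in range(n_vertices) if i not in set(boundary)]
    idx = {v: k for k, v in enumerate(interior)}
    L = np.zeros((len(interior), len(interior)))
    b = np.zeros((len(interior), 2))
    for i, j in edges:
        for p, q in ((i, j), (j, i)):
            if p in idx:                      # one Laplace equation per interior vertex
                L[idx[p], idx[p]] += 1.0
                if q in idx:
                    L[idx[p], idx[q]] -= 1.0
                else:
                    b[idx[p]] += uv[q]        # known boundary neighbour goes to the RHS
    uv[interior] = np.linalg.solve(L, b)
    return uv

# toy patch: 4 boundary vertices around 1 interior vertex
uv = harmonic_param(5, edges=[(4, 0), (4, 1), (4, 2), (4, 3),
                              (0, 1), (1, 2), (2, 3), (3, 0)],
                    boundary=[0, 1, 2, 3])
print(uv)
```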
DOI: 10.1109/CVPR.1999.784731 (published 1999-06-23)
Citations: 208
High-level and generic models for visual search: When does high level knowledge help?
A. Yuille, J. Coughlan
We analyze the problem of detecting a road target in background clutter and investigate the amount of prior (i.e. target specific) knowledge needed to perform this search task. The problem is formulated in terms of Bayesian inference and we define a Bayesian ensemble of problem instances. This formulation implies that the performance measures of different models depend on order parameters which characterize the problem. This demonstrates that if there is little clutter then only weak knowledge about the target is required in order to detect the target. However at a critical value of the order parameters there is a phase transition and it becomes effectively impossible to detect the target unless high-level target specific knowledge is used. These phase transitions determine different regimes within which different search strategies will be effective. These results have implications for bottom-up and top-down theories of vision.
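A toy simulation, not the authors' analysis, of the kind of Bayesian reward that drives such a search: each candidate path scores the sum of log-likelihood ratios of its edge responses under on-target versus off-target models, and the phase transition appears when the number of competing clutter paths grows fast enough (exponentially in path length) that the best background path outscores the true one. The response distributions and sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def path_reward(responses, p_on, p_off):
    """Sum of log-likelihood ratios of edge responses along one candidate path."""
    return np.sum(np.log(p_on(responses)) - np.log(p_off(responses)))

# assumed response models: on-target responses are biased towards larger values
p_on = lambda y: np.exp(-(y - 1.0) ** 2)   # unnormalized, for illustration only
p_off = lambda y: np.exp(-y ** 2)

length = 50
true_path = rng.normal(1.0, 1.0, length)               # responses on the real road
clutter = rng.normal(0.0, 1.0, (10_000, length))       # responses on background paths

r_true = path_reward(true_path, p_on, p_off)
r_clutter = np.array([path_reward(c, p_on, p_off) for c in clutter])
print(r_true, r_clutter.max())
# When the number of competing paths grows exponentially with path length,
# the best clutter path eventually beats the true one: the phase transition.
```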
DOI: 10.1109/CVPR.1999.784990 (published 1999-06-23)
Citations: 14
Dynamic occluding contours: a new external-energy term for snakes
M. Covell, Trevor Darrell
Dynamic contours, or snakes, provide an effective method for tracking complex moving objects for segmentation and recognition tasks, but have difficulty tracking occluding boundaries on cluttered backgrounds. To compensate for this shortcoming, dynamic contours often rely on detailed object-shape or motion models to distinguish between the boundary of the tracked object and other boundaries in the background. In this paper we present a complementary approach to detailed object models: We improve the discriminative power of the local image measurements that drive the tracking process. We describe a new, robust external-energy term for dynamic contours that can track occluding boundaries without detailed object models. We show how our image model improves tracking in cluttered scenes, and describe how a fine-grained image-segmentation mask is created directly from the local image measurements used for tracking.
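For context, a generic snake update in the style of Kass et al., not this paper's model, showing where an external-energy term enters: the contour descends the sum of internal elasticity and bending forces plus an external force field, which is the slot the proposed robust external-energy term would occupy. The `circle_force` toy field is invented for illustration.

```python
import numpy as np

def evolve_snake(pts, ext_force, alpha=0.1, beta=0.05, step=0.5, iters=200):
    """Gradient-descent evolution of a closed snake.  pts: (N, 2) control points;
    ext_force(pts) -> (N, 2) external force sampled at the control points;
    alpha and beta weight the elasticity and bending internal energies."""
    pts = np.asarray(pts, float).copy()
    for _ in range(iters):
        prev, nxt = np.roll(pts, 1, axis=0), np.roll(pts, -1, axis=0)
        elastic = prev + nxt - 2.0 * pts                                  # ~ x''
        prev2, nxt2 = np.roll(pts, 2, axis=0), np.roll(pts, -2, axis=0)
        bending = -(prev2 - 4.0 * prev + 6.0 * pts - 4.0 * nxt + nxt2)    # ~ -x''''
        pts += step * (alpha * elastic + beta * bending + ext_force(pts))
    return pts

def circle_force(p):
    """Toy external force pulling the contour onto the unit circle."""
    r = np.linalg.norm(p, axis=1, keepdims=True)
    return (1.0 - r) * p / r

theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
snake = evolve_snake(np.c_[2.0 * np.cos(theta), 2.0 * np.sin(theta)], circle_force)
```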
DOI: 10.1109/CVPR.1999.784635 (published 1999-06-23)
Citations: 7
Motion segmentation: a synergistic approach
C. Fermüller, T. Brodský, Y. Aloimonos
Since estimation of camera motion requires knowledge of independent motion, and moving object detection and localization requires knowledge about the camera motion, the two problems of motion estimation and segmentation need to be solved together in a synergistic manner. This paper provides an approach to treating both these problems simultaneously. The technique introduced here is based on a novel concept, "scene ruggedness" which parameterizes the variation in estimated scene depth with the error in the underlying three-dimensional (3D) motion. The idea is that incorrect 3D motion estimates cause distortions in the estimated depth map, and as a result smooth scene patches are computed as rugged surfaces. The correct 3D motion can be distinguished, as it does not cause any distortion and thus gives rise to the background patches with the least depth variation between depth discontinuities, with the locations corresponding to independent motion being rugged. The algorithm presented employs a binocular observer whose nature is exploited in the extraction of depth discontinuities, a step that facilitates the overall procedure, but the technique can be extended to a monocular observer in a variety of ways.
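A schematic sketch of the "scene ruggedness" idea; the scoring below is an assumption for illustration, not the authors' exact estimator. Given the depth map recovered under a candidate 3D motion, it sums squared depth differences between neighbouring pixels away from detected discontinuities; the candidate motion minimizing this score over the background would be kept, and regions that remain rugged under it indicate independent motion.

```python
import numpy as np

def ruggedness(depth, disc_mask=None):
    """Sum of squared depth differences between horizontal/vertical neighbours,
    ignoring pixel pairs that straddle a detected depth discontinuity.  A wrong
    3D motion estimate distorts the recovered depth, so smooth background
    patches come out rugged and this score rises."""
    dz_x = np.diff(depth, axis=1) ** 2
    dz_y = np.diff(depth, axis=0) ** 2
    if disc_mask is not None:
        dz_x = dz_x * ~(disc_mask[:, 1:] | disc_mask[:, :-1])
        dz_y = dz_y * ~(disc_mask[1:, :] | disc_mask[:-1, :])
    return dz_x.sum() + dz_y.sum()
```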
DOI: 10.1109/CVPR.1999.784633 (published 1999-06-23)
Citations: 8
Estimating mixture models of images and inferring spatial transformations using the EM algorithm
B. Frey, N. Jojic
Mixture modeling and clustering algorithms are effective, simple ways to represent images using a set of data centers. However, in situations where the images include background clutter and transformations such as translation, rotation, shearing and warping, these methods extract data centers that include clutter and represent different transformations of essentially the same data. Taking face images as an example, it would be more useful for the different clusters to represent different poses and expressions, instead of cluttered versions of different translations, scales and rotations. By including clutter and transformation as unobserved, latent variables in a mixture model, we obtain a new "transformed mixture of Gaussians", which is invariant to a specified set of transformations. We show how a linear-time EM algorithm can be used to fit this model by jointly estimating a mixture model for the data and inferring the transformation for each image. We show that this algorithm can jointly align images of a human head and learn different poses. We also find that the algorithm performs better than k-nearest neighbors and mixtures of Gaussians on handwritten digit recognition.
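A toy version of such a transformed mixture, restricted to integer image translations with a fixed noise variance (the paper also handles rotation, shearing, warping and clutter); it shows the joint E-step over (cluster, transformation) pairs and the M-step that averages images after undoing the inferred transformation. Function name and parameters are invented for this sketch.

```python
import numpy as np

def tmg_em(images, n_clusters, shifts, n_iter=20, var=0.1, seed=0):
    """EM for a toy 'transformed mixture of Gaussians': each image is assumed to
    be one of n_clusters templates, translated by one of the integer shifts and
    corrupted by isotropic noise of fixed variance.  images: (N, H, W) floats;
    shifts: list of (dy, dx)."""
    rng = np.random.default_rng(seed)
    N, H, W = images.shape
    mu = images[rng.choice(N, n_clusters)] + 0.01 * rng.standard_normal((n_clusters, H, W))
    for _ in range(n_iter):
        # E-step: responsibility of every (cluster, shift) pair for every image
        logp = np.zeros((N, n_clusters, len(shifts)))
        for c in range(n_clusters):
            for t, (dy, dx) in enumerate(shifts):
                pred = np.roll(mu[c], (dy, dx), axis=(0, 1))
                logp[:, c, t] = -np.sum((images - pred) ** 2, axis=(1, 2)) / (2.0 * var)
        r = np.exp(logp - logp.max(axis=(1, 2), keepdims=True))
        r /= r.sum(axis=(1, 2), keepdims=True)
        # M-step: re-estimate each template from images with the inferred shift undone
        for c in range(n_clusters):
            num, den = np.zeros((H, W)), 1e-12
            for t, (dy, dx) in enumerate(shifts):
                num += np.einsum('n,nij->ij', r[:, c, t], np.roll(images, (-dy, -dx), axis=(1, 2)))
                den += r[:, c, t].sum()
            mu[c] = num / den
    return mu
```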
DOI: 10.1109/CVPR.1999.786972 (published 1999-06-23)
Citations: 94
A biprism-stereo camera system
D. Lee, In-So Kweon, R. Cipolla
In this paper we propose a novel and practical stereo camera system that uses only one camera and a biprism placed in front of the camera. The equivalent of a stereo pair of images is formed as the left and right halves of a single CCD image using a biprism. The system is therefore cheap and extremely easy to calibrate since it requires only one CCD camera. An additional advantage of the geometrical set-up is that corresponding features lie on the same scanline automatically. The single camera and biprism have led to a simple stereo system for which correspondence is very easy and which is accurate for nearby objects in a small field of view. Since we use only a single lens, calibration of the system is greatly simplified. This is due to the fact that we need to estimate only one focal length and one center of projection. Given the parameters in the biprism-stereo camera system, we can recover the depth of the object using only the disparity between the corresponding points.
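Once the two half-images are treated as a rectified pair, depth recovery reduces to the standard triangulation relation Z = f * b / d, with the baseline taken as the separation of the two virtual viewpoints created by the biprism. The numbers in this sketch are hypothetical, not calibration values from the paper.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth for a rectified stereo pair: Z = f * b / d.  For the biprism setup,
    'baseline' is the separation of the two virtual viewpoints created by the prism."""
    d = np.asarray(disparity_px, dtype=float)
    return focal_px * baseline / d

# hypothetical numbers: 800 px focal length, 40 mm effective virtual baseline
print(depth_from_disparity([16.0, 8.0, 4.0], focal_px=800.0, baseline=0.04))
# -> [2. 4. 8.] metres
```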
DOI: 10.1109/CVPR.1999.786921 (published 1999-06-23)
Citations: 55
Measurement of surface orientations of transparent objects using polarization in highlight
Megumi Saito, Yoichi Sato, K. Ikeuchi, H. Kashiwagi
This paper proposes a method for obtaining surface orientations of transparent objects using polarization in highlight. Since the highlight, the specular component of the light reflected from objects, is observed only near the specular direction, it appears only on limited parts of an object surface. In order to obtain orientations of a whole object surface, we employ a spherical extended light source. This paper reports its experimental apparatus, a shape recovery algorithm, and its performance evaluation.
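The underlying physics is the Fresnel relation between the degree of polarization of specular reflection and the incidence (zenith) angle. The sketch below, with an assumed refractive index and an invented measured value, computes that relation and inverts it by lookup; note that the curve peaks at the Brewster angle, so the inversion is two-valued and this lookup simply returns one branch.

```python
import numpy as np

def specular_dop(theta_i, n=1.5):
    """Degree of polarization of specularly reflected light at incidence angle
    theta_i (radians), from the Fresnel reflectances for refractive index n."""
    theta_t = np.arcsin(np.sin(theta_i) / n)          # Snell's law
    rs = (np.cos(theta_i) - n * np.cos(theta_t)) / (np.cos(theta_i) + n * np.cos(theta_t))
    rp = (n * np.cos(theta_i) - np.cos(theta_t)) / (n * np.cos(theta_i) + np.cos(theta_t))
    Rs, Rp = rs ** 2, rp ** 2
    return (Rs - Rp) / (Rs + Rp)

# invert by lookup: find the zenith angle whose predicted DOP matches a measurement
angles = np.linspace(1e-3, np.pi / 2 - 1e-3, 2000)
measured_dop = 0.6                                    # hypothetical measurement
zenith = angles[np.argmin(np.abs(specular_dop(angles) - measured_dop))]
print(np.degrees(zenith))
```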
DOI: 10.1109/CVPR.1999.786967 (published 1999-06-23)
Citations: 102