
Latest publications: 2011 IEEE Workshop on Applications of Computer Vision (WACV)

Aligning surfaces without aligning surfaces
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711500
Geoffrey Oxholm, K. Nishino
We introduce a novel method for matching and aligning 3D surfaces that do not have any overlapping surface information. When two matching surfaces do not overlap, all that remains in common between them is a thin strip along their borders. Aligning such fragments is challenging but crucial for various applications, such as reassembly of thin-shell ceramics from their broken pieces. Past work approaches this problem by relying heavily on simplistic assumptions about the shape of the object or its texture. Our method makes no such assumptions; instead, we leverage the geometric and photometric similarity of the matching surfaces along the break-line. We first encode the shape and color of the boundary contour of each fragment at various scales in a novel 2D representation. Reformulating contour matching as 2D image registration based on these scale-space images enables efficient and accurate break-line matching. We then align the fragments by estimating the rotation around the break-line, maximizing the geometric continuity across it with a least-squares minimization. We evaluate our method on real-world colonial artifacts recently excavated in Philadelphia, Pennsylvania. Our system dramatically increases the ease and efficiency with which users reassemble artifacts, as we demonstrate on three different vessels.
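The scale-space boundary encoding can be illustrated with a short sketch: a 1D signal sampled along a fragment's boundary contour (e.g. curvature or color) is smoothed with circular Gaussians of increasing width, and the smoothed rows are stacked into a 2D scale-space image like the one the abstract describes. This is a minimal illustration under assumed inputs, not the authors' implementation; the signal and scale values are hypothetical.

```python
import numpy as np

def scale_space_image(signal, sigmas):
    """Build a 2D scale-space representation of a 1D boundary signal.
    Row i is the signal smoothed with a circular Gaussian of width sigmas[i]
    (circular, because a fragment's boundary contour is a closed curve)."""
    n = len(signal)
    rows = []
    for sigma in sigmas:
        # normalized Gaussian kernel centered at index n // 2
        x = np.arange(n) - n // 2
        kernel = np.exp(-0.5 * (x / sigma) ** 2)
        kernel /= kernel.sum()
        # circular convolution via FFT; ifftshift moves the peak to index 0
        # so the smoothing introduces no phase shift along the contour
        smoothed = np.real(np.fft.ifft(
            np.fft.fft(signal) * np.fft.fft(np.fft.ifftshift(kernel))))
        rows.append(smoothed)
    return np.vstack(rows)
```

Because each kernel sums to one, smoothing preserves the total mass of the signal while spreading sharp boundary features across neighboring samples at coarser scales.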
Citations: 4
On the reliability of eye color as a soft biometric trait
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711507
A. Dantcheva, N. Erdogmus, J. Dugelay
This work studies eye color as a soft biometric trait and provides novel insight into the influence of pertinent factors in this context, such as color space, illumination, and the presence of glasses. A motivation for the paper is the fact that human iris color is an essential facial trait for Caucasians, which can be employed in iris pattern recognition systems for pruning the search, or in soft biometrics systems for person re-identification. Towards studying iris color as a soft biometric trait, we consider a system for automatic detection of eye color based on standard facial images. The system entails automatic iris localization, followed by classification based on Gaussian Mixture Models with Expectation Maximization. We finally provide related detection results on the UBIRIS2 database, employable in a real-time eye color detection system.
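The classification step rests on fitting a Gaussian mixture with Expectation Maximization. A minimal 1D EM sketch is given below as an illustrative stand-in for the paper's classifier; the deterministic quantile initialization, iteration count, and variance floor are all assumptions, not details from the paper.

```python
import numpy as np

def fit_gmm_1d(x, k=2, iters=50):
    """Minimal EM for a 1D Gaussian mixture model (k components).
    Returns (means, variances, mixing weights)."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))   # deterministic init
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        d = x[:, None] - mu[None, :]
        p = pi * np.exp(-0.5 * d ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)                 # guard against collapse
    return mu, var, pi
```

A real eye-color classifier would fit one such mixture per color class over multi-channel pixel values and assign the class with the highest likelihood; the 1D case keeps the EM mechanics visible.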
Citations: 28
Robust multi-view camera calibration for wide-baseline camera networks
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711521
Jens Puwein, R. Ziegler, Julia Vogel, M. Pollefeys
Real-world camera networks are often characterized by very wide baselines covering a wide range of viewpoints. We describe a method that not only calibrates each camera sequence added to the system automatically, but also takes advantage of multi-view correspondences to make the entire calibration framework more robust. Novel camera sequences can be seamlessly integrated into the system at any time, adding to the robustness of future computations. One of the challenges consists in establishing correspondences between cameras. Initializing a bag of features from a calibrated frame, we establish correspondences between cameras in a two-step procedure. First, affine-invariant features of camera sequences are warped into a common coordinate frame, and a coarse matching is obtained between the collected features and the incrementally built and updated bag of features. This allows us to warp images to a common view. Second, scale-invariant features are extracted from the warped images. This leads to both more numerous and more accurate correspondences. Finally, the parameters are optimized in a bundle adjustment. Adding the feature descriptors and the optimized 3D positions to the bag of features, we obtain a feature-based scene abstraction, allowing for the calibration of novel sequences and the correction of drift in single-view calibration tracking. We demonstrate that our approach can deal with wide baselines. Novel sequences can be seamlessly integrated into the calibration framework.
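The idea of warping features into a common coordinate frame can be sketched, in much-simplified form, as fitting a least-squares similarity transform between matched 2D keypoints (Umeyama's closed-form solution). This is only a stand-in for the paper's full calibration pipeline with bundle adjustment; the function and its inputs are illustrative.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    with dst ≈ s * R @ src + t, via the Umeyama closed-form solution.
    src and dst are (n, 2) arrays of matched 2D points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    cov = B.T @ A / len(src)                  # cross-covariance (dst, src)
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    D = np.diag([1.0, d])                     # reflection guard
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / A.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Once such a transform is known, features from a new sequence can be mapped into the reference frame before matching against the accumulated bag of features.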
Citations: 20
Fast and scalable keypoint recognition and image retrieval using binary codes
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711573
Jonathan Ventura, Tobias Höllerer
In this paper we report an evaluation of keypoint descriptor compression using as few as 16 bits to describe a single keypoint. We use spectral hashing to compress keypoint descriptors and match them using the Hamming distance. By indexing the keypoints in a binary tree, we can quickly recognize keypoints with a very small database and efficiently insert new keypoints. Our tests on image datasets with perspective distortion show that the method enables fast keypoint recognition and image retrieval with a small code size, and point toward potential applications for scalable visual SLAM on mobile phones.
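Matching packed binary codes under the Hamming distance reduces to an XOR followed by a popcount, which is why such codes are cheap to compare on mobile hardware. A minimal sketch (not the paper's spectral-hashing pipeline; the codes here are hand-built):

```python
import numpy as np

def pack_codes(bits):
    """Pack an (n, d) array of 0/1 keypoint codes into uint8 bytes."""
    return np.packbits(bits.astype(np.uint8), axis=1)

def hamming_nn(query, db):
    """Nearest neighbor under Hamming distance between packed binary codes:
    XOR each database row against the query, then count the set bits.
    Returns (best index, best distance)."""
    x = np.bitwise_xor(db, query)             # broadcast over db rows
    dist = np.unpackbits(x, axis=1).sum(axis=1)
    return int(dist.argmin()), int(dist.min())
```

With 16-bit codes each keypoint occupies two bytes, so even a database of millions of keypoints fits comfortably in memory.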
Citations: 5
Augmented distinctive features for efficient image matching
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711478
Quan Wang, Wei Guan, Suya You
Finding corresponding image points is a challenging computer vision problem, especially for confusing scenes with surfaces of low texture or repeated patterns. Despite the well-known challenges of extracting conceptually meaningful high-level matching primitives, many recent works describe high-level image features such as edge groups, lines, and regions, which are more distinctive than traditional local appearance based features, to tackle such difficult scenes. In this paper, we propose a different and more general approach, which treats the image matching problem as a recognition problem over spatially related image patch sets. We construct augmented semi-global descriptors (ordinal codes) based on subsets of scale- and orientation-invariant local keypoint descriptors. The tied-ranking problem of ordinal codes is handled by increasing keypoint sampling around image patch sets. Finally, similarities of augmented features are measured using the Spearman correlation coefficient. Our proposed method is compatible with a wide range of existing local image descriptors. Experimental results based on standard benchmark datasets and SURF descriptors demonstrate its distinctiveness and effectiveness.
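The two core operations — turning a real-valued descriptor into an ordinal code (a rank vector) and comparing two codes with the Spearman correlation coefficient — can be sketched as follows. This is a minimal illustration assuming no ties; the paper handles ties by denser keypoint sampling.

```python
import numpy as np

def ordinal_code(desc):
    """Convert a real-valued descriptor into an ordinal code: the rank of
    each entry within the descriptor (double argsort gives ranks)."""
    return np.argsort(np.argsort(desc))

def spearman(a, b):
    """Spearman correlation between two equal-length rank vectors with no
    ties, via the classic sum-of-squared-rank-differences formula:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(a)
    d2 = ((a - b) ** 2).sum()
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

Identical descriptors yield rho = 1, perfectly reversed orderings yield rho = -1, which is the similarity scale used to compare augmented features.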
Citations: 1
Generalized autofocus
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711547
D. Vaquero, Natasha Gelfand, M. Tico, K. Pulli, M. Turk
All-in-focus imaging is a computational photography technique that produces images free of defocus blur by capturing a stack of images focused at different distances and merging them into a single sharp result. Current approaches assume that images have been captured offline, and that a reasonably powerful computer is available to process them. In contrast, we focus on the problem of how to capture such input stacks in an efficient and scene-adaptive fashion. Inspired by passive autofocus techniques, which select a single best plane of focus in the scene, we propose a method to automatically select a minimal set of images, focused at different depths, such that all objects in a given scene are in focus in at least one image. We aim to minimize both the amount of time spent metering the scene and capturing the images, and the total amount of high-resolution data that is captured. The algorithm first analyzes a set of low-resolution sharpness measurements of the scene while continuously varying the focus distance of the lens. From these measurements, we estimate the final lens positions required to capture all objects in the scene in acceptable focus. We demonstrate the use of our technique in a mobile computational photography scenario, where it is essential to minimize image capture time (as the camera is typically handheld) and processing time (as the computation and energy resources are limited).
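Selecting a minimal set of lens positions such that every object is in acceptable focus in at least one image is an instance of set cover, for which a greedy heuristic is the standard sketch. The data layout, threshold, and greedy strategy below are illustrative assumptions, not the authors' exact algorithm.

```python
def minimal_focus_set(sharpness, threshold):
    """Greedy set cover over lens positions: keep picking the position that
    brings the most still-uncovered objects into acceptable focus.
    `sharpness` maps lens_position -> list of per-object sharpness scores;
    object o is 'covered' by position p when sharpness[p][o] >= threshold."""
    n_objects = len(next(iter(sharpness.values())))
    uncovered = set(range(n_objects))
    chosen = []
    while uncovered:
        best = max(sharpness, key=lambda p: sum(
            1 for o in uncovered if sharpness[p][o] >= threshold))
        newly = {o for o in uncovered if sharpness[best][o] >= threshold}
        if not newly:
            raise ValueError("some object is never in acceptable focus")
        chosen.append(best)
        uncovered -= newly
    return chosen
```

In the paper's setting, the per-object sharpness scores would come from the low-resolution measurements taken while sweeping the lens, and the chosen positions are the ones captured at full resolution.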
Citations: 30
Object matching using feature aggregation over a frame sequence
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711489
Mahmoud Bassiouny, M. El-Saban
Object instance matching is a cornerstone component of many computer vision applications such as image search, augmented reality, and unsupervised tagging. The common flow in these applications is to take an input image and match it against a database of previously enrolled images of objects of interest. This is usually difficult, as one needs to capture an image corresponding to an object view already present in the database, especially in the case of 3D objects with high curvature, where light reflection, viewpoint change, and partial occlusion can significantly alter the appearance of the captured image. Rather than relying on having numerous views of each object in the database, we propose an alternative method: capturing a short video sequence scanning a certain object and utilizing information from multiple frames to improve the chance of a successful match in the database. The matching step combines local features from a number of frames and incrementally forms a point cloud describing the object. We conduct experiments on a database of different object types, showing promising matching results both on a privately collected set of videos and on those freely available on the Web, such as on YouTube. An increase in accuracy of up to 20% over the baseline of single-frame matching is shown to be possible.
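The benefit of aggregating evidence across frames can be shown with a much-simplified voting sketch: each frame nominates the database object its features matched best, and the final decision requires agreement across frames. The paper accumulates a point cloud of features rather than votes, so this is only a hedged simplification; the vote threshold is an assumption.

```python
from collections import Counter

def aggregate_matches(per_frame_matches, min_votes=2):
    """Aggregate per-frame match results from a scanned video sequence.
    Each entry is the database object id a frame matched (or None for no
    match); the object with the most votes wins if it clears min_votes."""
    votes = Counter(m for m in per_frame_matches if m is not None)
    if not votes:
        return None
    obj, count = votes.most_common(1)[0]
    return obj if count >= min_votes else None
```

A single spurious frame-level match is rejected by the vote threshold, which is the intuition behind the reported robustness gain over single-frame matching.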
Citations: 4
View context based 2D sketch-3D model alignment
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711482
Bo Li, H. Johan
2D sketch-3D model alignment is important for many applications such as sketch-based 3D model retrieval, sketch-based 3D modeling, as well as model-based vision and recognition. In this paper, we propose a 2D sketch-3D model alignment algorithm using view context and shape context matching. A sketch consists of a set of curves; a 3D model is typically a 3D triangle mesh. The algorithm includes two main steps: precomputation and actual alignment. In the precomputation, we extract the view context features of a set of sample views for the 3D model to be aligned. To speed up the precomputation, two computationally efficient and rotation-invariant features, Zernike moments and Fourier descriptors, are used to represent a view. In the actual alignment, we very quickly prune most sample views that are dissimilar to the sketch based on their view context similarities. Finally, to find an approximate pose, we compare the sketch with only a very small portion (e.g. 5% in our experiments) of the sample views based on shape context matching. Experiments on two types of datasets show that the algorithm can approximately align 2D sketches with 3D models.
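One of the two rotation-invariant view features named above, the Fourier descriptor, can be sketched compactly: treat the closed contour as a complex signal, take its FFT, drop the DC term (translation), keep coefficient magnitudes (rotation and starting point), and normalize by the first magnitude (scale). The coefficient count is an illustrative assumption.

```python
import numpy as np

def fourier_descriptor(contour, k=8):
    """Rotation-, translation-, and scale-invariant Fourier descriptor of a
    closed 2D contour given as an (n, 2) array of points. The contour is
    encoded as complex z = x + iy; rotating the contour multiplies z by a
    unit complex number, which leaves coefficient magnitudes unchanged."""
    z = contour[:, 0] + 1j * contour[:, 1]
    F = np.fft.fft(z)
    mag = np.abs(F[1:k + 1])   # skip F[0] (DC term encodes translation)
    return mag / mag[0]        # normalize out scale
```

Comparing such descriptors with a simple distance gives the fast coarse pruning of sample views; Zernike moments play an analogous role for region-based view images.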
Citations: 3
Active stereo vision for improving long range hearing using a Laser Doppler Vibrometer
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711554
Tao Wang, Rui Li, Zhigang Zhu, Yufu Qu
Laser Doppler Vibrometers (LDVs) have been widely applied for detecting vibrations in applications such as mechanics, bridge inspection, and biometrics, as well as long-range surveillance in which acoustic signatures can be obtained at a large distance. However, in both industrial and scientific applications, LDVs are manually controlled in surface selection, laser focusing, and acoustic acquisition. In this paper, we propose an active stereo vision approach to facilitate fast and automated laser pointing and tracking for long-range LDV hearing. The system contains: 1) a mirror on a Pan-Tilt-Unit (PTU) to reflect the laser beam to any location freely and quickly, and 2) two Pan-Tilt-Zoom (PTZ) cameras, one of which is mounted on the PTU and aligned with the laser beam synchronously. Distance measurement using the stereo vision system, together with triangulation between the camera and the LDV laser beam, allows us to quickly focus the laser beam on selected surfaces and to obtain acoustic signals in real time at distances of up to 200 meters. We present promising results with the collaborative visual and LDV measurements for laser pointing and focusing in order to achieve long-range audio detection.
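The stereo distance measurement used to focus the laser rests on the standard rectified-stereo depth relation Z = f·B/d. A tiny sketch with illustrative numbers (the focal length, baseline, and disparity below are hypothetical, not the paper's hardware parameters):

```python
def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth of a point from a rectified stereo pair: Z = f * B / d, with
    the focal length f in pixels, baseline B in meters, and disparity d in
    pixels. In this system the estimated depth tells the LDV how far away
    the selected surface is, so the laser can be focused automatically."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

At long range the disparity becomes small, so a wide baseline and a long focal length (high zoom) are needed to keep the depth estimate usable, which matches the system's use of PTZ cameras.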
Cited by: 7
GPU accelerated one-pass algorithm for computing minimal rectangles of connected components
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711542
L. Ríha, M. Manohar
Connected component labeling is an essential task for detecting and tracking moving objects in video surveillance applications. Since tracking algorithms are designed for real-time applications, the efficiency of the underlying algorithms becomes critical. In this paper we present a new one-pass algorithm for computing minimal bounding rectangles of all the connected components of background-foreground segmented video frames (binary data) using a GPU accelerator. The given image frame is scanned once in raster-scan mode and the background-foreground transition information is stored in a directed graph in which each transition is represented by a node. This data structure contains the locations of object edges in every row, and it is used to detect connected components in the image and extract their main features, e.g. bounding box size and location, location of the centroid, real size, etc. Further, we use GPU acceleration to speed up feature extraction from the image to a directed graph from which minimal bounding rectangles are subsequently computed. We also compare the performance of GPU acceleration (using a Tesla C2050 accelerator card) with that of a multi-core (up to 24 cores) general-purpose CPU implementation of the algorithm.
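The abstract's row-wise transition idea can be illustrated with a generic run-based labeling that scans the image once, links overlapping runs between consecutive rows with union-find, and accumulates bounding boxes on the fly. This is a plain-CPU sketch of the general technique under 4-connectivity, not the authors' GPU directed-graph implementation:

```python
# Single raster-scan bounding boxes of connected components (4-connectivity)
# via row runs + union-find. A generic CPU sketch, not the paper's GPU version.

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def bounding_boxes(image):
    """image: list of rows of 0/1. Returns sorted (top, left, bottom, right) boxes."""
    parent, boxes = [], []          # boxes[i] = [top, left, bottom, right] per label
    prev_runs = []                  # runs of the previous row: (start, end, label)
    for y, row in enumerate(image):
        runs, x, w = [], 0, len(row)
        while x < w:                # extract foreground runs of this row
            if row[x]:
                start = x
                while x < w and row[x]:
                    x += 1
                runs.append([start, x - 1, -1])
            else:
                x += 1
        for run in runs:            # link to column-overlapping runs of previous row
            for ps, pe, pl in prev_runs:
                if run[0] <= pe and ps <= run[1]:
                    root = find(parent, pl)
                    if run[2] == -1:
                        run[2] = root
                    else:           # run bridges two labels: merge them
                        parent[find(parent, run[2])] = root
            if run[2] == -1:        # no neighbor above: new component
                run[2] = len(parent)
                parent.append(run[2])
                boxes.append([y, run[0], y, run[1]])
            b = boxes[find(parent, run[2])]   # grow the root's bounding box
            b[0] = min(b[0], y); b[1] = min(b[1], run[0])
            b[2] = max(b[2], y); b[3] = max(b[3], run[1])
        prev_runs = [(s, e, l) for s, e, l in runs]
    final = {}                      # fold boxes of merged labels into their roots
    for i in range(len(parent)):
        r = find(parent, i)
        t, l, b, rgt = boxes[i]
        if r in final:
            ft, fl, fb, fr = final[r]
            final[r] = (min(ft, t), min(fl, l), max(fb, b), max(fr, rgt))
        else:
            final[r] = (t, l, b, rgt)
    return sorted(final.values())
```

Because each row only needs the runs of the row above it, the per-row work is independent enough to map onto a GPU, which is the direction the paper takes with its directed-graph representation.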
{"title":"GPU accelerated one-pass algorithm for computing minimal rectangles of connected components","authors":"L. Ríha, M. Manohar","doi":"10.1109/WACV.2011.5711542","DOIUrl":"https://doi.org/10.1109/WACV.2011.5711542","url":null,"abstract":"The connected component labeling is an essential task for detecting moving objects and tracking them in video surveillance application. Since tracking algorithms are designed for real-time applications, efficiencies of the underlying algorithms become critical. In this paper we present a new one-pass algorithm for computing minimal binding rectangles of all the connected components of background foreground segmented video frames (binary data) using GPU accelerator. The given image frame is scanned once in raster scan mode and the background foreground transition information is stored in a directed-graph where each transition is represented by a node. This data structure contains the locations of object edges in every row, and it is used to detect connected components in the image and extract its main features, e.g. bounding box size and location, location of the centroid, real size, etc. Further we use GPU acceleration to speed up feature extraction from the image to a directed graph from which minimal bounding rectangles will be computed subsequently. Also we compare the performance of GPU acceleration (using Tesla C2050 accelerator card) with the performance of multi-core (up 24 cores) general purpose CPU implementation of the algorithm.","PeriodicalId":424724,"journal":{"name":"2011 IEEE Workshop on Applications of Computer Vision (WACV)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132043779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
Journal
2011 IEEE Workshop on Applications of Computer Vision (WACV)