
2011 International Conference on Computer Vision: Latest Papers

ORB: An efficient alternative to SIFT or SURF
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126544
Ethan Rublee, V. Rabaud, K. Konolige, G. Bradski
Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smart phone.
Pages : 2564-2571
Citations: 8700
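The speed claim in the ORB abstract rests largely on the descriptor being binary: two binary descriptors are usually compared by Hamming distance, which is just a bit count over an XOR. The sketch below illustrates only that matching step under stated assumptions. It is not ORB itself (there is no keypoint detection or orientation step), and the function names `hamming` and `match` and the threshold `max_distance` are hypothetical choices made for this illustration.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match(query, train, max_distance=64):
    """Brute-force nearest neighbour under Hamming distance.

    Returns a list of (query_index, train_index, distance) tuples, keeping
    only matches whose distance does not exceed max_distance (a hypothetical
    threshold chosen for illustration).
    """
    if not train:
        return []
    matches = []
    for qi, q in enumerate(query):
        # Closest train descriptor for this query descriptor.
        ti, d = min(((i, hamming(q, t)) for i, t in enumerate(train)),
                    key=lambda pair: pair[1])
        if d <= max_distance:
            matches.append((qi, ti, d))
    return matches


# Toy 16-bit descriptors (real binary descriptors are much longer).
query = [bytes([0xF0, 0x00]), bytes([0xFF, 0xFF])]
train = [bytes([0xFF, 0xFE]), bytes([0xF1, 0x00])]
result = match(query, train)
```

With the toy data above, each query descriptor is paired with the train descriptor that differs from it in a single bit. A production matcher would normally use a vectorised popcount rather than Python loops.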
Understanding scenes on many levels
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126260
Joseph Tighe, S. Lazebnik
This paper presents a framework for image parsing with multiple label sets. For example, we may want to simultaneously label every image region according to its basic-level object category (car, building, road, tree, etc.), superordinate category (animal, vehicle, manmade object, natural object, etc.), geometric orientation (horizontal, vertical, etc.), and material (metal, glass, wood, etc.). Some object regions may also be given part names (a car can have wheels, doors, windshield, etc.). We compute co-occurrence statistics between different label types of the same region to capture relationships such as “roads are horizontal,” “cars are made of metal,” “cars have wheels” but “horses have legs,” and so on. By incorporating these constraints into a Markov Random Field inference framework and jointly solving for all the label sets, we are able to improve the classification accuracy for all the label sets at once, achieving a richer form of image understanding.
Pages : 335-342
Citations: 40
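The co-occurrence statistics mentioned in the abstract above can be pictured as simple pair counts over regions that carry more than one label type. The following is a minimal sketch under assumptions: the region records, the label values, and the helper names `cooccurrence` and `conditional` are all hypothetical, and the Markov Random Field inference that the paper builds on top of these statistics is not shown.

```python
from collections import Counter


def cooccurrence(regions):
    """Count how often each (object, material) label pair labels the same region."""
    return Counter((r["object"], r["material"]) for r in regions)


def conditional(counts, obj):
    """Estimate P(material | object) from the pair counts."""
    total = sum(n for (o, _), n in counts.items() if o == obj)
    if total == 0:
        return {}
    return {m: n / total for (o, m), n in counts.items() if o == obj}


# Hypothetical toy annotations: one dict per image region.
regions = [
    {"object": "car", "material": "metal"},
    {"object": "car", "material": "metal"},
    {"object": "car", "material": "glass"},
    {"object": "road", "material": "asphalt"},
]
counts = cooccurrence(regions)
car_materials = conditional(counts, "car")
```

Here `car_materials` assigns most of the probability mass to "metal", which is the kind of relationship ("cars are made of metal") the abstract describes capturing.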
BiCoS: A Bi-level co-segmentation method for image classification
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126546
Yuning Chai, V. Lempitsky, Andrew Zisserman
The objective of this paper is the unsupervised segmentation of image training sets into foreground and background in order to improve image classification performance. To this end we introduce a new scalable, alternation-based algorithm for co-segmentation, BiCoS, which is simpler than many of its predecessors, and yet has superior performance on standard benchmark image datasets.
Pages : 2579-2586
Citations: 201
Discovering object instances from scenes of Daily Living
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126314
Hongwen Kang, M. Hebert, T. Kanade
We propose an approach to identify and segment objects from scenes that a person (or robot) encounters in Activities of Daily Living (ADL). Images collected in those cluttered scenes contain multiple objects. Each image provides only a partial, possibly very different view of each object. An object instance discovery program must be able to link pieces of visual information from multiple images and extract the consistent patterns.
Pages : 762-769
Citations: 61
Graph mode-based contextual kernels for robust SVM tracking
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126364
Xi Li, A. Dick, Hanzi Wang, Chunhua Shen, A. Hengel
Visual tracking has typically been solved as a binary classification problem. Most existing trackers consider only the pairwise interactions between samples and thereby ignore higher-order contextual interactions, which may make them sensitive to complicating factors such as noise, outliers, and background clutter. In this paper, we propose a visual tracker based on support vector machines (SVMs), for which a novel graph mode-based contextual kernel is designed to effectively capture the higher-order contextual information from samples. To do so, we first create a visual graph whose similarity matrix is determined by a baseline visual kernel. Second, a set of high-order contexts is discovered in the visual graph by seeking modes of the graph; each graph mode corresponds to a vertex community, termed a high-order context. Third, we construct a contextual kernel that effectively captures the interaction information between the high-order contexts. Finally, this contextual kernel is embedded into SVMs for robust tracking. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
Pages : 1156-1163
Citations: 42
Distributed cosegmentation via submodular optimization on anisotropic diffusion
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126239
Gunhee Kim, E. Xing, Li Fei-Fei, T. Kanade
The saliency of regions or objects in an image can be significantly boosted if they recur in multiple images. Leveraging this idea, cosegmentation jointly segments common regions from multiple images. In this paper, we propose CoSand, a distributed cosegmentation approach for a highly variable large-scale image collection. The segmentation task is modeled as temperature maximization on anisotropic heat diffusion, in which temperature maximization with a finite number K of heat sources corresponds to a K-way segmentation that maximizes the segmentation confidence of every pixel in an image. We show that our method takes advantage of a strong theoretical property: the temperature under linear anisotropic diffusion is a submodular function; therefore, a greedy algorithm guarantees at least a constant-factor approximation to the optimal solution for temperature maximization. Our theoretical result is successfully applied to scalable cosegmentation as well as diversity ranking and single-image segmentation. We evaluate CoSand on the MSRC and ImageNet datasets and show that it performs competitively with previous work while scaling much better.
Pages : 169-176
Citations: 311
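The CoSand abstract relies on a general fact about submodular functions: picking elements greedily by marginal gain gives a constant-factor approximation (for monotone submodular functions under a cardinality limit, the classical bound is 1 - 1/e). The sketch below shows that greedy scheme on set coverage, a standard monotone submodular function. This is a stand-in chosen for illustration: it does not implement the paper's temperature function or heat diffusion, and the name `greedy_max_coverage` and the toy candidate sets are hypothetical.

```python
def greedy_max_coverage(candidates, k):
    """Greedily pick up to k named sets, each time taking the largest marginal gain.

    candidates maps a name to a set of covered items. Coverage is used here
    only as a simple example of a monotone submodular objective.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(k):
        # Candidate whose set adds the most not-yet-covered items.
        best = max(remaining,
                   key=lambda name: len(remaining[name] - covered),
                   default=None)
        if best is None or not (remaining[best] - covered):
            break  # nothing left adds new coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


candidates = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
chosen, covered = greedy_max_coverage(candidates, k=2)
```

With these toy sets the first pick is "c" (gain 4) and the second is "a" (gain 3), which together cover all seven items. In the paper's setting the K chosen elements would play the role of heat sources, but that mapping is not modeled here.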
Locally rigid globally non-rigid surface registration
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126411
Kent Fujiwara, K. Nishino, J. Takamatsu, Bo Zheng, K. Ikeuchi
We present a novel non-rigid surface registration method that achieves high accuracy and matches characteristic features without manual intervention. The key insight is to consider the entire shape as a collection of local structures that individually undergo rigid transformations to collectively deform the global structure. We realize this locally rigid but globally non-rigid surface registration with a newly derived dual-grid Free-form Deformation (FFD) framework. We first represent the source and target shapes with their signed distance fields (SDF). We then superimpose a sampling grid onto a conventional FFD grid that is dual to the control points. Each control point is then iteratively translated by a rigid transformation that minimizes the difference between two SDFs within the corresponding sampling region. The translated control points then interpolate the embedding space within the FFD grid and determine the overall deformation. The experimental results clearly demonstrate that our method is capable of overcoming the difficulty of preserving and matching local features.
Pages : 1527-1534
Citations: 39
Dense one-shot 3D reconstruction by detecting continuous regions with parallel line projection
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126460
R. Sagawa, Hiroshi Kawasaki, S. Kiyota, Furukawa Ryo
3D scanning of moving objects has many applications, for example marker-less motion capture, fluid dynamics analysis, and object explosions. One approach to acquiring accurate shape is a projector-camera system; in particular, methods that reconstruct a shape from a single image with a static pattern are suitable for capturing fast-moving objects. In this paper, we propose a method that uses a grid pattern consisting of sets of parallel lines. The pattern is spatially encoded by a periodic color pattern. While information is sparse in the camera image, the proposed method extracts dense (pixel-wise) phase information from the sparse pattern. As a result, continuous regions in the camera images can be extracted by analyzing the phase. Since one degree of freedom (DOF) remains for each region, we propose a linear solution that eliminates the DOF by using geometric information about the devices, i.e. the epipolar constraint. In addition, because the projected pattern consists of parallel lines at equal intervals, the solution space is finite and the linear equation can be solved efficiently by an integer least-squares method. Formulations for both single and multiple projectors are presented. We evaluate the accuracy of the correspondences and compare results with respect to the number of projectors by simulation. Finally, dense 3D reconstructions of moving objects are presented in the experiments.
Pages : 1911-1918
Citations: 52
Recognizing jumbled images: The role of local and global information in image classification
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126283
Devi Parikh
The performance of current state-of-the-art computer vision algorithms at image classification falls significantly short of human abilities. To reduce this gap, it is important for the community to know what problems to solve, and not just how to solve them. Towards this goal, via the use of jumbled images, we strip apart two widely investigated aspects, local and global information in images, and identify the performance bottleneck. Interestingly, humans have been shown to reliably recognize jumbled images. The goal of our paper is to determine a functional model that mimics how humans recognize jumbled images, i.e. one that exploits local information alone, and further to evaluate whether existing implementations of this computational model suffice to match human performance. Surprisingly, in our series of human studies and machine experiments, we find that a simple bag-of-words based majority-vote-like strategy is an accurate functional model of how humans recognize jumbled images. Moreover, a straightforward machine implementation of this model achieves accuracies similar to human subjects at classifying jumbled images. This indicates that perhaps existing machine vision techniques already leverage local information from images effectively, and future research efforts should be focused on more advanced modeling of global information.
Pages : 519-526
Citations: 45
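The "bag-of-words based majority-vote-like strategy" named in the abstract above can be pictured roughly as follows: each local block of the image is mapped to a visual word, each word votes for a class, and the image takes the class with the most votes. The sketch is a loose illustration under assumptions, not the paper's model: the name `classify_jumbled`, the integer word ids, and the `word_to_class` lookup table are hypothetical.

```python
from collections import Counter


def classify_jumbled(patch_words, word_to_class):
    """Label an image by majority vote over its local visual words.

    patch_words: visual-word id of each local block, in any order.
    word_to_class: hypothetical lookup from word id to the class it votes for.
    Words missing from the lookup cast no vote.
    """
    votes = Counter(word_to_class[w] for w in patch_words if w in word_to_class)
    if not votes:
        raise ValueError("no patch produced a vote")
    return votes.most_common(1)[0][0]


word_to_class = {0: "beach", 1: "beach", 2: "forest"}
patch_words = [0, 2, 1, 0, 7]          # word 7 is unknown and is ignored
label = classify_jumbled(patch_words, word_to_class)
jumbled_label = classify_jumbled(list(reversed(patch_words)), word_to_class)
```

Because the vote depends only on which words occur and how often, reordering the blocks leaves the result unchanged, which is why a purely local model of this kind is unaffected by jumbling.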
Globally optimal solution to multi-object tracking with merged measurements
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126532
João F. Henriques, Rui Caseiro, Jorge P. Batista
Multiple object tracking has been formulated recently as a global optimization problem, and solved efficiently with optimal methods such as the Hungarian Algorithm. A severe limitation is the inability to model multiple objects that are merged into a single measurement, and track them as a group, while retaining optimality. This work presents a new graph structure that encodes these multiple-match events as standard one-to-one matches, allowing computation of the solution in polynomial time. Since identities are lost when objects merge, an efficient method to identify groups is also presented, as a flow circulation problem. The problem of tracking individual objects across groups is then posed as a standard optimal assignment. Experiments show increased performance on the PETS 2006 and 2009 datasets compared to state-of-the-art algorithms.
Pages : 2470-2477
Citations: 145
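The "standard optimal assignment" that the abstract above refers to is the problem of pairing each track with exactly one detection so that the total cost is minimal. The sketch below solves a tiny instance by exhaustive search purely to make the objective concrete. It is an assumption-laden illustration: it is not the Hungarian Algorithm (which solves the same problem in polynomial time), it does not model the paper's graph structure for merged measurements, and the name `optimal_assignment` and the cost values are hypothetical.

```python
from itertools import permutations


def optimal_assignment(cost):
    """Minimum-cost one-to-one assignment by brute force.

    cost[i][j] is the cost of assigning track i to detection j; the matrix
    must be square. Returns (total_cost, assignment) where assignment[i] is
    the detection given to track i. Runtime is factorial in the matrix size,
    so this is suitable only for very small examples.
    """
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost matrix must be square")

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return total(best), list(best)


# Hypothetical costs, e.g. distances between predicted and observed positions.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total_cost, assignment = optimal_assignment(cost)
```

For this matrix the best pairing assigns track 0 to detection 1, track 1 to detection 0, and track 2 to detection 2, for a total cost of 5. Note that greedily taking the cheapest entry first (the 0 at row 1, column 1) would not reach this optimum, which is why a global method is needed.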