
Latest publications from the 2011 International Conference on Computer Vision

The medial feature detector: Stable regions from image boundaries
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126436
Yannis Avrithis, Konstantinos Rapantzikos
We present a local feature detector that detects regions of arbitrary scale and shape without scale-space construction. We compute a weighted distance map on the image gradient using our exact linear-time algorithm, a variant of group marching for Euclidean space. We find the weighted medial axis by extending residues, typically used in Voronoi skeletons. We decompose the medial axis into a graph representing image structure in terms of peaks and saddle points. A duality property enables reconstruction of regions using the same marching method. We greedily group regions, taking both contrast and shape into account. Along the way, we select regions according to our shape fragmentation factor, favoring those well enclosed by boundaries, even incomplete ones. We achieve state-of-the-art performance in matching and retrieval experiments with reduced memory and computational requirements.
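The weighted distance map this pipeline starts from can be sketched as follows. The two-pass chamfer-style sweep and the `w` weighting below are our simplified stand-ins for illustration, not the authors' exact linear-time group-marching algorithm: strong-gradient pixels act as sources, and the per-pixel step cost grows with gradient magnitude so distance accumulates slowly across boundaries.

```python
import numpy as np

def weighted_distance_map(gradient, w=5.0):
    """Crude weighted distance map: boundary (high-gradient) pixels are
    sources; stepping onto a pixel costs more where the gradient is strong.
    A chamfer-style approximation, not the paper's group-marching variant."""
    h, wd = gradient.shape
    cost = 1.0 + w * gradient            # cost of stepping onto each pixel
    dist = np.full((h, wd), np.inf)
    dist[gradient > 0.5] = 0.0           # boundary pixels are sources
    for _ in range(2):                   # forward sweep, then backward sweep
        for y in range(h):
            for x in range(wd):
                for dy, dx in ((-1, 0), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < wd:
                        dist[y, x] = min(dist[y, x], dist[ny, nx] + cost[y, x])
        # flip both arrays so the second pass sweeps in the opposite order
        dist, cost = dist[::-1, ::-1], cost[::-1, ::-1]
    return dist
```

On a toy 5x5 image with a horizontal boundary in the middle row, the map is zero on the boundary and grows by one per pixel away from it.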
Pages: 1724-1731
Citations: 25
Graph mode-based contextual kernels for robust SVM tracking
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126364
Xi Li, A. Dick, Hanzi Wang, Chunhua Shen, A. Hengel
Visual tracking has typically been solved as a binary classification problem. Most existing trackers consider only the pairwise interactions between samples and ignore higher-order contextual interactions, which can make them sensitive to complicating factors such as noise, outliers, and background clutter. In this paper, we propose a visual tracker based on support vector machines (SVMs), for which a novel graph mode-based contextual kernel is designed to effectively capture higher-order contextual information from samples. To do so, we first create a visual graph whose similarity matrix is determined by a baseline visual kernel. Second, a set of high-order contexts is discovered in the visual graph; the problem of discovering them is solved by seeking modes of the graph, each mode corresponding to a vertex community termed a high-order context. Third, we construct a contextual kernel that effectively captures the interaction information between the high-order contexts. Finally, this contextual kernel is embedded into SVMs for robust tracking. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
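The kernel construction can be illustrated with a minimal sketch. Everything below is our simplification: an RBF base kernel, connected components of the thresholded similarity graph standing in for mode seeking, and a convex combination standing in for the paper's kernel design.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Baseline visual kernel: Gaussian RBF on raw feature vectors."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def context_labels(K, thresh=0.5):
    """Toy stand-in for graph-mode seeking: connected components of the
    thresholded similarity graph define the 'high-order contexts'."""
    n = K.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        labels[i] = cur
        while stack:
            j = stack.pop()
            for k in range(n):
                if labels[k] < 0 and K[j, k] > thresh:
                    labels[k] = cur
                    stack.append(k)
        cur += 1
    return labels

def contextual_kernel(X, gamma=1.0, alpha=0.5):
    """Blend the base kernel with a context co-membership kernel. The Schur
    product of two PSD matrices and their convex combination stay PSD, so
    the result is a valid kernel."""
    K = rbf_kernel(X, gamma)
    c = context_labels(K)
    C = (c[:, None] == c[None, :]).astype(float)  # co-membership kernel
    return (1 - alpha) * K + alpha * K * C
```

A precomputed Gram matrix like this can then be handed to any kernel SVM implementation for the tracking classifier.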
Pages: 1156-1163
Citations: 42
Multiclass transfer learning from unconstrained priors
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126454
Jie Luo, T. Tommasi, B. Caputo
The vast majority of transfer learning methods proposed in the visual recognition domain over the last years address the problem of object category detection, assuming strong control over the priors from which transfer is done. This is a strict condition, as it concretely limits the use of this type of approach in several settings: for instance, it generally does not allow the use of off-the-shelf models as priors. Moreover, the lack of a multiclass formulation in most existing transfer learning algorithms prevents their use for object categorization problems, where they might be beneficial, especially when the number of categories grows and it becomes harder to obtain enough annotated data for training standard learning methods. This paper presents a multiclass transfer learning algorithm that can take advantage of priors built over different features, and with different learning methods, than those used for learning the new task. We use the priors as experts and transfer their outputs to the new incoming samples as additional information. We cast the learning problem within the Multiple Kernel Learning framework. The resulting formulation efficiently solves a joint optimization problem that determines from where and how much to transfer, with a principled multiclass formulation. Extensive experiments illustrate the value of this approach.
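The "priors as experts" idea can be sketched in its simplest form: append each prior model's output scores to the raw features of the incoming samples and train any multiclass learner on the augmented representation. This is a deliberate simplification (plain score stacking with a nearest-centroid learner), not the paper's Multiple Kernel Learning machinery; the prior callables stand in for off-the-shelf pretrained models with an assumed `X -> scores` interface.

```python
import numpy as np

def augment_with_priors(X, priors):
    """Append each prior expert's class scores to the raw features.
    `priors` is a list of callables mapping (n, d) arrays to (n, k) scores."""
    return np.hstack([X] + [p(X) for p in priors])

def fit_centroids(Xa, y):
    """Minimal multiclass learner on the augmented features."""
    classes = np.unique(y)
    return classes, np.stack([Xa[y == c].mean(axis=0) for c in classes])

def predict(Xa, classes, cents):
    d = ((Xa[:, None, :] - cents[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]
```

If a prior expert already separates the classes well, the augmented features inherit that separation even when the raw features are weak.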
Pages: 1863-1870
Citations: 111
Dense one-shot 3D reconstruction by detecting continuous regions with parallel line projection
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126460
R. Sagawa, Hiroshi Kawasaki, S. Kiyota, Furukawa Ryo
3D scanning of moving objects has many applications, for example marker-less motion capture, analysis of fluid dynamics, object explosion, and so on. One approach to acquiring accurate shape is a projector-camera system; in particular, methods that reconstruct a shape from a single image with a static pattern are suitable for capturing fast-moving objects. In this paper, we propose a method that uses a grid pattern consisting of sets of parallel lines. The pattern is spatially encoded by a periodic color pattern. Although the information in the camera image is sparse, the proposed method extracts dense (pixel-wise) phase information from the sparse pattern. As a result, continuous regions in the camera images can be extracted by analyzing the phase. Since one DOF remains for each region, we propose a linear solution that eliminates this DOF using geometric information of the devices, i.e., the epipolar constraint. In addition, because the projected pattern consists of parallel lines with equal intervals, the solution space is finite, and the linear equation can be efficiently solved by an integer least-squares method. In this paper, formulations for both single and multiple projectors are presented. We evaluate the accuracy of the correspondences and compare results with respect to the number of projectors in simulation. Finally, dense 3D reconstruction of moving objects is demonstrated in the experiments.
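The pixel-wise phase extraction can be sketched generically: for a stripe signal with a known period, a sliding one-period windowed Fourier analysis recovers a phase estimate at every pixel. This is a textbook phase-extraction sketch under our own assumptions (a 1D signal, known integer period), not the paper's exact decoding of the color-encoded grid.

```python
import numpy as np

def dense_phase(row, period):
    """Per-pixel phase of a stripe signal: correlate a one-period window
    around each pixel with a complex exponential at the stripe frequency
    and take the angle of the result."""
    n = len(row)
    phases = np.empty(n)
    for i in range(n):
        lo = min(max(0, i - period // 2), n - period)   # clamp to a full window
        k = np.arange(lo, lo + period)
        z = (row[k] * np.exp(-2j * np.pi * k / period)).sum()
        phases[i] = np.angle(z)
    return phases
```

For a pure cosine stripe, every pixel recovers the same injected phase offset, which is what makes continuous regions detectable by phase analysis.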
Pages: 1911-1918
Citations: 52
Discovering object instances from scenes of Daily Living
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126314
Hongwen Kang, M. Hebert, T. Kanade
We propose an approach to identify and segment objects from scenes that a person (or robot) encounters in Activities of Daily Living (ADL). Images collected in those cluttered scenes contain multiple objects. Each image provides only a partial, possibly very different view of each object. An object instance discovery program must be able to link pieces of visual information from multiple images and extract the consistent patterns.
Pages: 762-769
Citations: 61
Are spatial and global constraints really necessary for segmentation?
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126219
Aurélien Lucchi, Yunpeng Li, X. Boix, Kevin Smith, P. Fua
Many state-of-the-art segmentation algorithms rely on Markov or Conditional Random Field models designed to enforce spatial and global consistency constraints. This is often accomplished by introducing additional latent variables to the model, which can greatly increase its complexity. As a result, estimating the model parameters or computing the best maximum a posteriori (MAP) assignment becomes a computationally expensive task. In a series of experiments on the PASCAL and the MSRC datasets, we were unable to find evidence of a significant performance increase attributed to the introduction of such constraints. On the contrary, we found that similar levels of performance can be achieved using a much simpler design that essentially ignores these constraints. This simpler approach makes use of the same local and global features to leverage evidence from the image, but instead directly biases the preferences of individual pixels. While our investigation does not prove that spatial and consistency constraints are not useful in principle, it points to the conclusion that they should be validated in a larger context.
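The "directly bias individual pixels" alternative amounts to folding image-level evidence into each pixel's class scores instead of adding CRF pairwise terms. A minimal sketch, with shapes of our own choosing:

```python
import numpy as np

def label_pixels(local_scores, global_scores, beta=1.0):
    """Per-pixel labeling without a CRF: add the image-level (global) class
    scores to every pixel's local scores, then take the per-pixel argmax.
    local_scores: (H, W, C); global_scores: (C,); beta weights the bias."""
    biased = local_scores + beta * global_scores[None, None, :]
    return biased.argmax(axis=-1)
```

A pixel whose local evidence weakly favors one class can be flipped by strong image-level evidence for another, with no pairwise consistency terms and no latent variables.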
Pages: 9-16
Citations: 72
Building a better probabilistic model of images by factorization
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126473
B. J. Culpepper, Jascha Narain Sohl-Dickstein, B. Olshausen
We describe a directed bilinear model that learns higher-order groupings among features of natural images. The model represents images in terms of two sets of latent variables: one set of variables represents which feature groups are active, while the other specifies the relative activity within groups. Such a factorized representation is beneficial because it is stable under small variations in the placement of features while still preserving information about relative spatial relationships. When trained on MNIST digits, the resulting representation provides state-of-the-art classification performance using a simple classifier. When trained on natural images, the model learns to group features according to proximity in position, orientation, and scale. The model achieves high log-likelihood (−94 nats), surpassing the current state of the art for natural images achievable with an mcRBM model.
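The factorized generative step can be written down compactly. The shapes below (a groups x features x pixels basis tensor, a group-activity vector `u`, and a within-group activity matrix `v`) are our illustrative parameterization of a bilinear model, not the paper's exact one:

```python
import numpy as np

def bilinear_image(B, u, v):
    """Bilinear synthesis: image[p] = sum_g sum_f u[g] * v[g, f] * B[g, f, p].
    u says which feature groups are active; v gives the relative activity
    of features within each group."""
    return np.einsum('g,gf,gfp->p', u, v, B)
```

Because the group gate `u` multiplies every feature in its group, small shifts of activity among features within an active group change the image smoothly while the group-level code stays fixed, which is the stability property the abstract appeals to.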
Pages: 2011-2017
Citations: 14
Image based detection of geometric changes in urban environments
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126515
Aparna Taneja, Luca Ballan, M. Pollefeys
In this paper, we propose an efficient technique to detect changes in the geometry of an urban environment from images observing its current state. The proposed method can significantly optimize the process of updating the 3D model of a city that changes over time, by restricting this process to only those areas where changes are detected. With this application in mind, we designed our algorithm to detect only structural changes in the environment, ignoring changes in its appearance as well as all changes that are irrelevant for update purposes, such as cars and people. As a by-product, the algorithm also provides a coarse geometry of the detected changes. The performance of the proposed method was tested on four different kinds of urban environments and compared with two alternative techniques.
Pages: 2336-2343
Citations: 89
Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126238
Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, L. Davis
Subordinate-level categorization typically rests on establishing salient distinctions between part-level characteristics of objects, in contrast to basic-level categorization, where the presence or absence of parts is determinative. We develop an approach for subordinate categorization in vision, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain. We explore a pose-normalized appearance model based on a volumetric poselet scheme. The variation in shape and appearance properties of these parts across a taxonomy provides the cues needed for subordinate categorization. Training pose detectors from scratch requires a relatively large amount of training data per category; using a subordinate-level approach, we exploit a pose classifier trained at the basic level, and extract part appearance and shape information to build subordinate-level models. Our model associates the underlying image pattern parameters used for detection with corresponding volumetric part location, scale, and orientation parameters. These parameters implicitly define a mapping from the image pixels into a pose-normalized appearance space, removing view and pose dependencies and facilitating fine-grained categorization from relatively few training examples.
Pages: 161-168
Citations: 217
Distributed cosegmentation via submodular optimization on anisotropic diffusion
Pub Date : 2011-11-06 DOI: 10.1109/ICCV.2011.6126239
Gunhee Kim, E. Xing, Li Fei-Fei, T. Kanade
The saliency of regions or objects in an image can be significantly boosted if they recur in multiple images. Leveraging this idea, cosegmentation jointly segments common regions from multiple images. In this paper, we propose CoSand, a distributed cosegmentation approach for highly variable, large-scale image collections. The segmentation task is modeled as temperature maximization under anisotropic heat diffusion, where temperature maximization with K finite heat sources corresponds to a K-way segmentation that maximizes the segmentation confidence of every pixel in an image. We show that our method benefits from a strong theoretical property: the temperature under linear anisotropic diffusion is a submodular function, so a greedy algorithm guarantees at least a constant-factor approximation to the optimal solution for temperature maximization. Our theoretical result is successfully applied to scalable cosegmentation as well as diversity ranking and single-image segmentation. We evaluate CoSand on the MSRC and ImageNet datasets and demonstrate both competitive performance relative to previous work and far superior scalability.
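The greedy guarantee invoked here is the classic one for monotone submodular maximization under a cardinality constraint: repeatedly picking the element with the largest marginal gain is within a (1 − 1/e) factor of the optimum. Below is a generic sketch with a toy coverage objective standing in for the diffusion temperature (the `footprints` sets are hypothetical heat-source coverage regions, not anything from the paper):

```python
import numpy as np

def greedy_maximize(candidates, f, k):
    """Greedy maximization of a monotone submodular set function f:
    pick the element with the largest marginal gain, k times."""
    chosen = []
    for _ in range(k):
        pool = [c for c in candidates if c not in chosen]
        gains = [f(chosen + [c]) - f(chosen) for c in pool]
        chosen.append(pool[int(np.argmax(gains))])
    return chosen

# Toy objective: each candidate "heat source" covers a set of pixels,
# and the objective counts the pixels covered (coverage is submodular).
footprints = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2}}

def coverage(S):
    return len(set().union(*(footprints[c] for c in S))) if S else 0
```

Greedy first takes the source covering the most pixels, then whichever source adds the most new pixels, mirroring how K heat sources would be placed to maximize total confidence.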
2011 International Conference on Computer Vision, pp. 169-176.
Citations: 311
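The approximation guarantee the abstract invokes comes from a standard result: greedily adding the element with the largest marginal gain to a monotone submodular objective achieves at least a (1 - 1/e) fraction of the optimum. A minimal sketch of that greedy selection is below; it is not the paper's implementation, and the `coverage` objective is a toy stand-in for CoSand's diffusion-based temperature function:

```python
def greedy_max(candidates, k, objective):
    """Pick k elements greedily by marginal gain. For a monotone
    submodular `objective`, this gives a (1 - 1/e) approximation
    to the optimal k-element subset (Nemhauser et al., 1978)."""
    chosen = []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: objective(chosen + [c]))
        chosen.append(best)
    return chosen

def coverage(sources, radius=1, n=10):
    """Toy monotone submodular objective: number of cells on a
    1-D grid of length n covered by neighborhoods of the sources."""
    covered = set()
    for s in sources:
        covered.update(range(max(0, s - radius), min(n, s + radius + 1)))
    return len(covered)

print(greedy_max(list(range(10)), 3, coverage))  # → [1, 4, 7]
```

In CoSand, the role of `coverage` is played by the total pixel temperature under linear anisotropic diffusion from the chosen K heat sources, whose submodularity is what makes this simple greedy loop come with a constant-factor guarantee.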
Journal: 2011 International Conference on Computer Vision