
2013 IEEE International Conference on Computer Vision: Latest Papers

A Unified Rolling Shutter and Motion Blur Model for 3D Visual Registration
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.252
Maxime Meilland, T. Drummond, Andrew I. Comport
Motion blur and rolling shutter deformations both inhibit visual motion registration, whether it be due to a moving sensor or a moving target. Whilst both deformations exist simultaneously, no models have been proposed to handle them together. Furthermore, neither deformation has been considered previously in the context of monocular full-image 6 degrees of freedom registration or RGB-D structure and motion. As will be shown, rolling shutter deformation is observed when a camera moves faster than a single pixel in parallax between subsequent scan-lines. Blur is a function of the pixel exposure time and the motion vector. In this paper a complete dense 3D registration model will be derived to account for both motion blur and rolling shutter deformations simultaneously. Various approaches will be compared with respect to ground truth and live real-time performance will be demonstrated for complex scenarios where both blur and shutter deformations are dominant.
Citations: 45
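The two relations stated in the rolling shutter abstract above (deformation appears once the image moves by more than one pixel between subsequent scan-lines; blur depends on exposure time and the motion vector) can be illustrated with a small sketch. It assumes a constant image-plane speed during the frame and reads "one pixel in parallax" as image displacement per scan-line interval. The function names and the `line_delay_s` parameter are hypothetical and do not come from the paper, whose dense 3D registration model is not reproduced here.

```python
def scanline_time(frame_start_s: float, row: int, line_delay_s: float) -> float:
    """Start-of-exposure time of one scan-line under a rolling shutter."""
    return frame_start_s + row * line_delay_s


def shift_between_scanlines_px(speed_px_per_s: float, line_delay_s: float) -> float:
    """Image displacement accumulated between two subsequent scan-lines."""
    return abs(speed_px_per_s) * line_delay_s


def rolling_shutter_is_observable(speed_px_per_s: float, line_delay_s: float) -> bool:
    """Threshold as read from the abstract: more than one pixel per scan-line."""
    return shift_between_scanlines_px(speed_px_per_s, line_delay_s) > 1.0


def blur_length_px(speed_px_per_s: float, exposure_s: float) -> float:
    """Blur streak length: constant motion integrated over the exposure time."""
    return abs(speed_px_per_s) * exposure_s
```

Under these assumptions the two effects are governed by different time constants (line delay versus exposure time), which is one way to see why a single model has to carry both.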
Codemaps - Segment, Classify and Search Objects Locally
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.454
Zhenyang Li, E. Gavves, K. V. D. Sande, Cees G. M. Snoek, A. Smeulders
In this paper we aim for segmentation and classification of objects. We propose codemaps, a joint formulation of the classification score and the local neighborhood it belongs to in the image. We obtain the codemap by reordering the encoding, pooling and classification steps over lattice elements. Unlike existing linear decompositions, which emphasize only the efficiency benefits for localized search, we make three novel contributions. As a preliminary, we provide a theoretical generalization of the sufficient mathematical conditions under which image encodings and classification become locally decomposable. As a first novelty we introduce l2 normalization for arbitrarily shaped image regions, which is fast enough for semantic segmentation using our Fisher codemaps. Second, using the same lattice across images, we propose kernel pooling, which embeds nonlinearities into codemaps for object classification by explicit or approximate feature mappings. Results demonstrate that l2 normalized Fisher codemaps improve the state-of-the-art in semantic segmentation for PASCAL VOC. For object classification the addition of nonlinearities brings us on par with the state-of-the-art, but is 3x faster. Because of the codemaps' inherent efficiency, we can reach significant speed-ups for localized search as well. We exploit the efficiency gain for our third novelty: object segment retrieval using a single query image only.
Citations: 24
Tree Shape Priors with Connectivity Constraints Using Convex Relaxation on General Graphs
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.290
Jan Stühmer, P. Schröder, D. Cremers
In this work we propose a novel method to include a connectivity prior into image segmentation that is based on a binary labeling of a directed graph, in this case a geodesic shortest path tree. Specifically we make two contributions: First, we construct a geodesic shortest path tree with a distance measure that is related to the image data and the bending energy of each path in the tree. Second, we include a connectivity prior in our segmentation model, which allows segmenting not only a single elongated structure but a whole connected branching tree. Because both our segmentation model and the connectivity constraint are convex, a globally optimal solution can be found. To this end, we generalize a recent primal-dual algorithm for continuous convex optimization to an arbitrary graph structure. To validate our method we present results on data from medical imaging in angiography and retinal blood vessel segmentation.
Citations: 53
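The shortest path tree mentioned in the tree shape prior abstract above can be sketched with plain Dijkstra on a 4-connected pixel grid. This is a simplified illustration under stated assumptions: the edge cost is only a per-pixel entry cost (the bending energy term of the paper is omitted), and the connectivity check encodes one possible reading of the prior, namely that a foreground pixel requires its tree parent to be foreground. The names `shortest_path_tree` and `respects_tree_connectivity` are hypothetical; the convex relaxation and primal-dual solver are not shown.

```python
import heapq


def shortest_path_tree(cost, root):
    """Dijkstra on a 4-connected grid.

    cost[r][c] is the non-negative price of stepping onto pixel (r, c).
    Returns (dist, parent); parent[root] is None.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, parent


def respects_tree_connectivity(labels, parent):
    """True if no foreground pixel (label 1) hangs below a background parent."""
    return all(p is None or labels[p] >= labels[node]
               for node, p in parent.items())
```

With low costs along a vessel-like structure, the tree paths follow that structure, so a labeling that is monotone along the tree stays connected to the root.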
GrabCut in One Cut
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.222
Meng Tang, Lena Gorelick, O. Veksler, Yuri Boykov
Among image segmentation algorithms there are two major groups: (a) methods assuming known appearance models and (b) methods estimating appearance models jointly with segmentation. Typically, the first group optimizes appearance log-likelihoods in combination with some spatial regularization. This problem is relatively simple and many methods guarantee globally optimal results. The second group treats model parameters as additional variables transforming simple segmentation energies into high-order NP-hard functionals (Zhu-Yuille, Chan-Vese, GrabCut, etc.). It is known that such methods indirectly minimize the appearance overlap between the segments. We propose a new energy term explicitly measuring L1 distance between the object and background appearance models that can be globally maximized in one graph cut. We show that in many applications our simple term makes NP-hard segmentation functionals unnecessary. Our one cut algorithm effectively replaces approximate iterative optimization techniques based on block coordinate descent.
Citations: 214
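The link between the L1 distance and the appearance overlap mentioned in the GrabCut in One Cut abstract above can be made concrete with a small sketch. It assumes the appearance models are unnormalized histograms (bin counts) over quantized colors, which is an assumption of this sketch rather than something the abstract states. The helper names are hypothetical, and the graph cut that optimizes the term is not shown.

```python
from collections import Counter


def bin_counts(pixels, mask, label):
    """Histogram (bin -> count) of the pixels carrying the given label."""
    return Counter(p for p, m in zip(pixels, mask) if m == label)


def l1_appearance_distance(pixels, mask):
    """Sum over bins of |object count - background count|."""
    fg = bin_counts(pixels, mask, 1)
    bg = bin_counts(pixels, mask, 0)
    return sum(abs(fg[k] - bg[k]) for k in set(fg) | set(bg))


def appearance_overlap(pixels, mask):
    """Sum over bins of min(object count, background count)."""
    fg = bin_counts(pixels, mask, 1)
    bg = bin_counts(pixels, mask, 0)
    return sum(min(fg[k], bg[k]) for k in set(fg) | set(bg))
```

For bin counts, `l1_appearance_distance` equals the number of pixels minus twice `appearance_overlap`, so maximizing the L1 distance is the same as minimizing the overlap between the two segments.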
Dynamic Structured Model Selection
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.330
David J. Weiss, Benjamin Sapp, B. Taskar
In many cases, the predictive power of structured models for complex vision tasks is limited by a trade-off between the expressiveness and the computational tractability of the model. However, choosing this trade-off statically a priori is suboptimal, as images and videos in different settings vary tremendously in complexity. On the other hand, choosing the trade-off dynamically requires knowledge about the accuracy of different structured models on any given example. In this work, we propose a novel two-tier architecture that provides dynamic speed/accuracy trade-offs through a simple type of introspection. Our approach, which we call dynamic structured model selection (DMS), leverages typically intractable features in structured learning problems in order to automatically determine which of several models should be used at test-time in order to maximize accuracy under a fixed budgetary constraint. We demonstrate DMS on two sequential modeling vision tasks, and we establish a new state-of-the-art in human pose estimation in video with an implementation that is roughly 23× faster than the previous standard implementation.
Citations: 21
Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.313
Jiongxin Liu, P. Belhumeur
In this paper, we propose a novel approach for bird part localization, targeting fine-grained categories with wide variations in appearance due to different poses (including aspect and orientation) and subcategories. As it is challenging to represent such variations across a large set of diverse samples with tractable parametric models, we turn to individual exemplars. Specifically, we extend the exemplar-based models in [4] by enforcing pose and subcategory consistency at the parts. During training, we build pose-specific detectors scoring part poses across subcategories, and subcategory-specific detectors scoring part appearance across poses. At the testing stage, likely exemplars are matched to the image, suggesting part locations whose pose and subcategory consistency are well-supported by the image cues. From these hypotheses, part configuration can be predicted with very high accuracy. Experimental results demonstrate significant performance gains from our method on an extensive dataset: CUB-200-2011 [30], for both localization and classification tasks.
Citations: 37
Video Motion for Every Visible Point
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.306
Susanna Ricco, Carlo Tomasi
Dense motion of image points over many video frames can provide important information about the world. However, occlusions and drift make it impossible to compute long motion paths by merely concatenating optical flow vectors between consecutive frames. Instead, we solve for entire paths directly, and flag the frames in which each is visible. As in previous work, we anchor each path to a unique pixel which guarantees an even spatial distribution of paths. Unlike earlier methods, we allow paths to be anchored in any frame. By explicitly requiring that at least one visible path passes within a small neighborhood of every pixel, we guarantee complete coverage of all visible points in all frames. We achieve state-of-the-art results on real sequences including both rigid and non-rigid motions with significant occlusions.
Citations: 16
Conservation Tracking
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.364
Martin Schiegg, Philipp Hanslovsky, Bernhard X. Kausler, L. Hufnagel, F. Hamprecht
The quality of any tracking-by-assignment hinges on the accuracy of the foregoing target detection / segmentation step. In many kinds of images, errors in this first stage are unavoidable. These errors then propagate to, and corrupt, the tracking result. Our main contribution is the first probabilistic graphical model that can explicitly account for over- and under-segmentation errors even when the number of tracking targets is unknown and when they may divide, as in cell cultures. The tracking model we present implements global consistency constraints for the number of targets comprised by each detection and is solved to global optimality on reasonably large 2D+t and 3D+t datasets. In addition, we empirically demonstrate the effectiveness of a post-processing step that allows target identity to be established even across occlusion / under-segmentation. The usefulness and efficiency of this new tracking method are demonstrated on three different and challenging 2D+t and 3D+t datasets from developmental biology.
Citations: 63
Image Co-segmentation via Consistent Functional Maps
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.110
F. Wang, Qi-Xing Huang, L. Guibas
Joint segmentation of image sets has great importance for object recognition, image classification, and image retrieval. In this paper, we aim to jointly segment a set of images starting from a small number of labeled images or none at all. To allow the images to share segmentation information with each other, we build a network that contains segmented as well as unsegmented images, and extract functional maps between connected image pairs based on image appearance features. These functional maps act as general property transporters between the images and, in particular, are used to transfer segmentations. We define and operate in a reduced functional space optimized so that the functional maps approximately satisfy cycle-consistency under composition in the network. A joint optimization framework is proposed to simultaneously generate all segmentation functions over the images so that they both align with local segmentation cues in each particular image, and agree with each other under network transportation. This formulation allows us to extract segmentations even with no training data, but can also exploit such data when available. The collective effect of the joint processing using functional maps leads to accurate information sharing among images and yields superior segmentation results, as shown on the iCoseg, MSRC, and PASCAL data sets.
Citations: 126
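The cycle-consistency property named in the co-segmentation abstract above can be illustrated as follows: composing the maps along a closed loop of images should approximately give the identity. The sketch below represents each functional map as a small square matrix and measures the Frobenius distance of the composed map from the identity. The function names are hypothetical, and the reduced functional space and the joint optimization described in the abstract are not reproduced.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def cycle_consistency_error(maps):
    """Frobenius norm of (X_k ... X_2 X_1 - I) for maps applied in list order.

    maps[0] takes functions on image 1 to image 2, maps[1] from image 2 to
    image 3, and so on, with the last map returning to image 1.
    """
    n = len(maps[0])
    composed = identity(n)
    for x in maps:
        composed = matmul(x, composed)  # apply x after everything so far
    eye = identity(n)
    return math.sqrt(sum((composed[i][j] - eye[i][j]) ** 2
                         for i in range(n) for j in range(n)))
```

A loop whose maps undo each other has zero error, while an inconsistent loop yields a positive value that an optimizer could penalize.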
From Large Scale Image Categorization to Entry-Level Categories
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.344
Vicente Ordonez, Jia Deng, Yejin Choi, A. Berg, Tamara L. Berg
Entry-level categories - the labels people will use to name an object - were originally defined and studied by psychologists in the 1980s. In this paper we study entry-level categories at a large scale and learn the first models for predicting entry-level categories for images. Our models combine visual recognition predictions with proxies for word "naturalness" mined from the enormous amounts of text on the web. We demonstrate the usefulness of our models for predicting nouns (entry-level words) associated with images by people. We also learn mappings between concepts predicted by existing visual recognition systems and entry-level concepts that could be useful for improving human-focused applications such as natural language image description or retrieval.
Citations: 113