
Latest publications from the 2013 IEEE International Conference on Computer Vision

SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.458
Jianxiong Xiao, Andrew Owens, A. Torralba
Existing scene understanding datasets contain only a limited set of views of a place, and they lack representations of complete 3D spaces. In this paper, we introduce SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places. The tasks that go into constructing such a dataset are difficult in isolation -- hand-labeling videos is painstaking, and structure from motion (SfM) is unreliable for large spaces. But if we combine them, we make the dataset construction task much easier. First, we introduce an intuitive labeling tool that uses a partial reconstruction to propagate labels from one frame to another. Then we use the object labels to fix errors in the reconstruction. For this, we introduce a generalization of bundle adjustment that incorporates object-to-object correspondences. This algorithm works by constraining points for the same object from different frames to lie inside a fixed-size bounding box, parameterized by its rotation and translation. The SUN3D database, the source code for the generalized bundle adjustment, and the web-based 3D annotation tool are all available at http://sun3d.cs.princeton.edu.
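The box constraint in the generalized bundle adjustment can be made concrete. Below is a minimal numpy sketch of just that constraint term, assuming a known object size and a box pose (R, t); the function names and the hinge penalty form are our illustration, not the released source at sun3d.cs.princeton.edu.

```python
import numpy as np

def box_violation(points, R, t, half_size):
    """Distance by which each 3D point exits an oriented, fixed-size box.

    points:    (N, 3) world-space points sharing one object label
    R, t:      box rotation (3, 3) and center (3,), the BA unknowns
    half_size: (3,) fixed half-extents (the known-object-size assumption)
    """
    local = (points - t) @ R              # world -> box coordinates, R^T (p - t)
    return np.maximum(np.abs(local) - half_size, 0.0)   # zero inside the box

def object_term(points_per_frame, R, t, half_size, weight=1.0):
    """Penalty added to the usual reprojection error: points labeled as the
    same object in different frames should all fit inside one box."""
    return weight * sum(
        float(np.sum(box_violation(p, R, t, half_size) ** 2))
        for p in points_per_frame)

# Toy check: two frames observing one ~1m object, box at the origin.
rng = np.random.default_rng(0)
frames = [rng.normal(0.0, 0.3, (50, 3)), rng.normal(0.1, 0.3, (50, 3))]
print(object_term(frames, np.eye(3), np.zeros(3), np.array([0.5, 0.5, 0.5])))
```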
{"title":"SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels","authors":"Jianxiong Xiao, Andrew Owens, A. Torralba","doi":"10.1109/ICCV.2013.458","DOIUrl":"https://doi.org/10.1109/ICCV.2013.458","url":null,"abstract":"Existing scene understanding datasets contain only a limited set of views of a place, and they lack representations of complete 3D spaces. In this paper, we introduce SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places. The tasks that go into constructing such a dataset are difficult in isolation -- hand-labeling videos is painstaking, and structure from motion (SfM) is unreliable for large spaces. But if we combine them together, we make the dataset construction task much easier. First, we introduce an intuitive labeling tool that uses a partial reconstruction to propagate labels from one frame to another. Then we use the object labels to fix errors in the reconstruction. For this, we introduce a generalization of bundle adjustment that incorporates object-to-object correspondences. This algorithm works by constraining points for the same object from different frames to lie inside a fixed-size bounding box, parameterized by its rotation and translation. The SUN3D database, the source code for the generalized bundle adjustment, and the web-based 3D annotation tool are all available at http://sun3d.cs.princeton.edu.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"162 1","pages":"1625-1632"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73804295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 684
Detecting Curved Symmetric Parts Using a Deformable Disc Model
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.220
T. S. Lee, S. Fidler, Sven J. Dickinson
Symmetry is a powerful shape regularity that's been exploited by perceptual grouping researchers in both human and computer vision to recover part structure from an image without a priori knowledge of scene content. Drawing on the concept of a medial axis, defined as the locus of centers of maximal inscribed discs that sweep out a symmetric part, we model part recovery as the search for a sequence of deformable maximal inscribed disc hypotheses generated from a multiscale superpixel segmentation, a framework proposed by LEV09. However, we learn affinities between adjacent superpixels in a space that's invariant to bending and tapering along the symmetry axis, enabling us to capture a wider class of symmetric parts. Moreover, we introduce a global cost that perceptually integrates the hypothesis space by combining a pairwise and a higher-level smoothing term, which we minimize globally using dynamic programming. The new framework is demonstrated on two datasets, and is shown to significantly outperform the baseline LEV09.
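Since the global minimization is a chain dynamic program, its skeleton is compact. The sketch below is a generic Viterbi-style DP over disc hypotheses with unary and pairwise costs; the learned affinities and the higher-level smoothing term from the paper are replaced by placeholder matrices.

```python
import numpy as np

def best_disc_sequence(unary, pairwise):
    """Viterbi-style DP: choose one disc hypothesis per step so that the
    summed unary (appearance) and pairwise (smoothness) costs are minimal.

    unary:    (T, K) cost of hypothesis k at step t along the part
    pairwise: (K, K) transition cost between consecutive hypotheses
    """
    T, K = unary.shape
    dp, back = unary[0].copy(), np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # total[i, j] = best cost ending in i at t-1, then moving to j at t
        total = dp[:, None] + pairwise + unary[t][None, :]
        back[t] = np.argmin(total, axis=0)
        dp = total[back[t], np.arange(K)]
    path = [int(np.argmin(dp))]
    for t in range(T - 1, 0, -1):          # backtrack the optimal sequence
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(dp.min())

# Toy run: 5 steps along a part, 4 disc hypotheses each, random costs.
rng = np.random.default_rng(1)
print(best_disc_sequence(rng.random((5, 4)), rng.random((4, 4))))
```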
{"title":"Detecting Curved Symmetric Parts Using a Deformable Disc Model","authors":"T. S. Lee, S. Fidler, Sven J. Dickinson","doi":"10.1109/ICCV.2013.220","DOIUrl":"https://doi.org/10.1109/ICCV.2013.220","url":null,"abstract":"Symmetry is a powerful shape regularity that's been exploited by perceptual grouping researchers in both human and computer vision to recover part structure from an image without a priori knowledge of scene content. Drawing on the concept of a medial axis, defined as the locus of centers of maximal inscribed discs that sweep out a symmetric part, we model part recovery as the search for a sequence of deformable maximal inscribed disc hypotheses generated from a multiscale super pixel segmentation, a framework proposed by LEV09. However, we learn affinities between adjacent super pixels in a space that's invariant to bending and tapering along the symmetry axis, enabling us to capture a wider class of symmetric parts. Moreover, we introduce a global cost that perceptually integrates the hypothesis space by combining a pair wise and a higher-level smoothing term, which we minimize globally using dynamic programming. The new framework is demonstrated on two datasets, and is shown to significantly outperform the baseline LEV09.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"98 1","pages":"1753-1760"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74667975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.214
Cai-Zhi Zhu, H. Jégou, S. Satoh
Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the different roles of query and database. This paper first measures the extent of asymmetry on large-scale public datasets reflecting this task. Considering the standard bag-of-words representation, we then propose new asymmetrical dissimilarities accounting for the different inlier ratios associated with query and database images. These asymmetrical measures depend on the query, yet they are compatible with an inverted file structure, without noticeably impacting search efficiency. Our experiments show the benefit of our approach, and show that the visual object retrieval task is better treated asymmetrically, in the spirit of state-of-the-art text retrieval.
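The asymmetry itself is easy to illustrate on bag-of-words histograms: query mass that the database image fails to cover should be penalized, while extra database words should not. The measure below captures only that intuition; it is not the exact query-adaptive dissimilarity proposed in the paper.

```python
import numpy as np

def asym_dissimilarity(q, d):
    """Query visual-word mass left uncovered by the database image.
    Extra words in d cost nothing, reflecting that the query object is
    *included* in the database image rather than equal to it."""
    return float(np.sum(np.maximum(q - d, 0.0)))

q = np.array([3.0, 1.0, 0.0])   # query histogram
d = np.array([3.0, 0.0, 9.0])   # database image: covers word 0, misses word 1
print(asym_dissimilarity(q, d), asym_dissimilarity(d, q))  # 1.0 vs 9.0
```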
{"title":"Query-Adaptive Asymmetrical Dissimilarities for Visual Object Retrieval","authors":"Cai-Zhi Zhu, H. Jégou, S. Satoh","doi":"10.1109/ICCV.2013.214","DOIUrl":"https://doi.org/10.1109/ICCV.2013.214","url":null,"abstract":"Visual object retrieval aims at retrieving, from a collection of images, all those in which a given query object appears. It is inherently asymmetric: the query object is mostly included in the database image, while the converse is not necessarily true. However, existing approaches mostly compare the images with symmetrical measures, without considering the different roles of query and database. This paper first measure the extent of asymmetry on large-scale public datasets reflecting this task. Considering the standard bag-of-words representation, we then propose new asymmetrical dissimilarities accounting for the different inlier ratios associated with query and database images. These asymmetrical measures depend on the query, yet they are compatible with an inverted file structure, without noticeably impacting search efficiency. Our experiments show the benefit of our approach, and show that the visual object retrieval task is better treated asymmetrically, in the spirit of state-of-the-art text retrieval.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"18 1","pages":"1705-1712"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75739663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 59
A General Dense Image Matching Framework Combining Direct and Feature-Based Costs
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.30
Jim Braux-Zin, R. Dupont, A. Bartoli
Dense motion field estimation (typically optical flow, stereo disparity and surface registration) is a key computer vision problem. Many solutions have been proposed to compute small or large displacements, narrow or wide baseline stereo disparity, but a unified methodology is still lacking. We here introduce a general framework that robustly combines direct and feature-based matching. The feature-based cost is built around a novel robust distance function that handles key points and "weak" features such as segments. It allows us to use putative feature matches, which may contain mismatches, to guide dense motion estimation out of local minima. Our framework uses a robust direct data term (AD-Census). It is implemented with a powerful second-order Total Generalized Variation regularization with external and self-occlusion reasoning. Our framework achieves state-of-the-art performance in several cases (standard optical flow benchmarks, wide-baseline stereo and non-rigid surface registration). Our framework has a modular design that can be customized to specific application needs.
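Schematically, such a combined objective can be written as one energy (the notation below is ours, not lifted from the paper): a direct AD-Census data term, a feature term built on the robust distance to putative matches, and second-order TGV regularization.

```latex
E(u) = \int_{\Omega} C_{\mathrm{ADC}}\bigl(x,\, u(x)\bigr)\,\mathrm{d}x
     + \lambda \sum_{i} \phi\bigl(d(u(x_i),\, m_i)\bigr)
     + \mathrm{TGV}^{2}_{\alpha}(u)
```

Here u is the dense motion field, the m_i are putative (possibly wrong) feature matches, d is the robust distance that also handles segment features, and φ is a robust penalty that lets mismatched features be ignored rather than pulling the solution astray.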
{"title":"A General Dense Image Matching Framework Combining Direct and Feature-Based Costs","authors":"Jim Braux-Zin, R. Dupont, A. Bartoli","doi":"10.1109/ICCV.2013.30","DOIUrl":"https://doi.org/10.1109/ICCV.2013.30","url":null,"abstract":"Dense motion field estimation (typically optical flow, stereo disparity and surface registration) is a key computer vision problem. Many solutions have been proposed to compute small or large displacements, narrow or wide baseline stereo disparity, but a unified methodology is still lacking. We here introduce a general framework that robustly combines direct and feature-based matching. The feature-based cost is built around a novel robust distance function that handles key points and ``weak'' features such as segments. It allows us to use putative feature matches which may contain mismatches to guide dense motion estimation out of local minima. Our framework uses a robust direct data term (AD-Census). It is implemented with a powerful second order Total Generalized Variation regularization with external and self-occlusion reasoning. Our framework achieves state of the art performance in several cases (standard optical flow benchmarks, wide-baseline stereo and non-rigid surface registration). Our framework has a modular design that customizes to specific application needs.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"61 1","pages":"185-192"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72585974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 56
Nested Shape Descriptors
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.152
J. Byrne, Jianbo Shi
In this paper, we propose a new family of binary local feature descriptors called nested shape descriptors. These descriptors are constructed by pooling oriented gradients over a large geometric structure called the Hawaiian earring, whose nested correlation structure enables a new robust local distance function called the nesting distance. This distance function is unique to the nested descriptor and derives robustness to outliers from order statistics. In this paper, we define the nested shape descriptor family and introduce a specific member called the seed-of-life descriptor. We perform a trade study to determine optimal descriptor parameters for the task of image matching. Finally, we evaluate performance against state-of-the-art local feature descriptors on the VGG-Affine image matching benchmark, showing significant performance gains. Our descriptor is the first binary descriptor to outperform SIFT on this benchmark.
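The order-statistics robustness can be sketched in a few lines: compare two binary descriptors one nesting level at a time and keep only the best-matching levels. The per-level grouping, the Hamming comparison, and the trimming fraction below are our guesses at the structure, not the paper's exact definition.

```python
import numpy as np

def nesting_distance(a, b, levels, keep=0.5):
    """Trimmed sum of per-level Hamming distances between binary
    descriptors a and b. `levels` lists the bit indices of each nested
    ring; dropping the worst levels gives robustness to outlier rings."""
    per_level = sorted(int(np.count_nonzero(a[ix] != b[ix])) for ix in levels)
    k = max(1, int(keep * len(per_level)))
    return sum(per_level[:k])   # only the k best-matching levels count

rng = np.random.default_rng(2)
a, b = rng.integers(0, 2, 64), rng.integers(0, 2, 64)
rings = [np.arange(i * 8, (i + 1) * 8) for i in range(8)]  # 8 nested levels
print(nesting_distance(a, b, rings))
```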
{"title":"Nested Shape Descriptors","authors":"J. Byrne, Jianbo Shi","doi":"10.1109/ICCV.2013.152","DOIUrl":"https://doi.org/10.1109/ICCV.2013.152","url":null,"abstract":"In this paper, we propose a new family of binary local feature descriptors called nested shape descriptors. These descriptors are constructed by pooling oriented gradients over a large geometric structure called the Hawaiian earring, which is constructed with a nested correlation structure that enables a new robust local distance function called the nesting distance. This distance function is unique to the nested descriptor and provides robustness to outliers from order statistics. In this paper, we define the nested shape descriptor family and introduce a specific member called the seed-of-life descriptor. We perform a trade study to determine optimal descriptor parameters for the task of image matching. Finally, we evaluate performance compared to state-of-the-art local feature descriptors on the VGG-Affine image matching benchmark, showing significant performance gains. Our descriptor is the first binary descriptor to outperform SIFT on this benchmark.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"2 1","pages":"1201-1208"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75018597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.25
Tae-Hyun Oh, Hyeongwoo Kim, Yu-Wing Tai, J. Bazin, In-So Kweon
Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering the underlying low-rank structure of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only is it known that the underlying structure of clean data is low-rank, but the exact rank of clean data is also known. Yet, when applying conventional rank minimization to those problems, the objective function is formulated in a way that does not fully utilize a priori target rank information about the problems. This observation motivates us to investigate whether there is a better alternative solution when using rank minimization. In this paper, instead of minimizing the nuclear norm, we propose to minimize the partial sum of singular values. The proposed objective function implicitly encourages the target rank constraint in rank minimization. Our experimental analyses show that our approach performs better than conventional rank minimization when the number of samples is deficient, while the solutions obtained by the two approaches are almost identical when the number of samples is more than sufficient. We apply our approach to various low-level vision problems, e.g. high dynamic range imaging, photometric stereo and image alignment, and show that our results outperform those obtained by the conventional nuclear norm rank minimization method.
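The key change relative to nuclear-norm RPCA is the proximal step: with a known target rank N, only the tail singular values are shrunk. A numpy sketch of that partial singular value thresholding, assuming the usual ALM/ADMM outer loop around it (loop and variable names are ours):

```python
import numpy as np

def partial_svt(X, target_rank, tau):
    """Partial singular value thresholding: the leading `target_rank`
    singular values pass through untouched, so the penalty acts on
    sum_{i > N} sigma_i rather than on the full nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s[target_rank:] = np.maximum(s[target_rank:] - tau, 0.0)
    return (U * s) @ Vt   # broadcasting scales each left vector by sigma_i

# Rank-2 data plus small noise: with N = 2 the clean structure survives
# shrinkage while the noise tail is zeroed out.
rng = np.random.default_rng(3)
L = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))
X = L + 0.01 * rng.normal(size=(40, 30))
print(np.linalg.matrix_rank(partial_svt(X, target_rank=2, tau=1.0)))
```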
{"title":"Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision","authors":"Tae-Hyun Oh, Hyeongwoo Kim, Yu-Wing Tai, J. Bazin, In-So Kweon","doi":"10.1109/ICCV.2013.25","DOIUrl":"https://doi.org/10.1109/ICCV.2013.25","url":null,"abstract":"Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering underlying low-rank structure of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only it is known that the underlying structure of clean data is low-rank, but the exact rank of clean data is also known. Yet, when applying conventional rank minimization for those problems, the objective function is formulated in a way that does not fully utilize a priori target rank information about the problems. This observation motivates us to investigate whether there is a better alternative solution when using rank minimization. In this paper, instead of minimizing the nuclear norm, we propose to minimize the partial sum of singular values. The proposed objective function implicitly encourages the target rank constraint in rank minimization. Our experimental analyses show that our approach performs better than conventional rank minimization when the number of samples is deficient, while the solutions obtained by the two approaches are almost identical when the number of samples is more than sufficient. We apply our approach to various low-level vision problems, e.g. high dynamic range imaging, photometric stereo and image alignment, and show that our results outperform those obtained by the conventional nuclear norm rank minimization method.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"35 1","pages":"145-152"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75121871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
Multi-view Object Segmentation in Space and Time
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.328
Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, F. Clerc, P. Pérez
In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. To this aim, the segmentation is cast as a single efficient labeling problem over space and time, solved with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multi-view datasets, as well as on videos. With static views, results compete with state-of-the-art methods, but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.
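In schematic form (our notation, not the paper's), the joint labeling couples the usual per-view MRF terms with cross-view links carried by the sparse 3D samples, and the whole energy is minimized with graph cuts:

```latex
E(\ell) = \sum_{v} \sum_{i \in \mathcal{S}_v} U_{v,i}(\ell_i)
        + \sum_{(i,j) \in \mathcal{N}_{\mathrm{space}}} w_{ij}\,[\ell_i \neq \ell_j]
        + \sum_{(i,k) \in \mathcal{N}_{\mathrm{3D/time}}} w_{ik}\,[\ell_i \neq \ell_k]
```

Here ℓ assigns foreground/background to each superpixel i of view v, the second sum enforces spatial coherence within a frame, and the third propagates coherence across views and over time through the sparse 3D samples, which is what removes the need for dense reconstruction.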
{"title":"Multi-view Object Segmentation in Space and Time","authors":"Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, F. Clerc, P. Pérez","doi":"10.1109/ICCV.2013.328","DOIUrl":"https://doi.org/10.1109/ICCV.2013.328","url":null,"abstract":"In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidences in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multi-view datasets, as well as on videos. With static views, results compete with state of the art methods but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"33 1","pages":"2640-2647"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76587043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
PM-Huber: PatchMatch with Huber Regularization for Stereo Matching
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.293
Philipp Heise, S. Klose, B. Jensen, Alois Knoll
Most stereo correspondence algorithms match support windows at integer-valued disparities and assume a constant disparity value within the support window. The recently proposed PatchMatch stereo algorithm by Bleyer et al. overcomes this limitation of previous algorithms by directly estimating planes. This work presents a method that integrates the PatchMatch stereo algorithm into a variational smoothing formulation using quadratic relaxation. The resulting algorithm allows the explicit regularization of the disparity and normal gradients using the estimated plane parameters. Evaluation of our method on the Middlebury benchmark shows that it outperforms the traditional integer-valued disparity strategy as well as the original algorithm and its variants in sub-pixel accurate disparity estimation.
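The quadratic-relaxation coupling can be written schematically (our notation): an auxiliary smooth field v is tied to the PatchMatch estimate u, and the two are minimized alternately, PatchMatch for u with v fixed and a convex solver for v with u fixed.

```latex
E(u, v) = \int_{\Omega} \Bigl( \lambda\, C\bigl(x,\, u(x)\bigr)
        + \tfrac{1}{2\theta} \bigl(u(x) - v(x)\bigr)^{2} \Bigr)\,\mathrm{d}x
        + \int_{\Omega} \bigl|\nabla v(x)\bigr|_{\varepsilon}\,\mathrm{d}x
```

Here |·|_ε denotes the Huber norm, and as θ → 0 the fields u and v are driven together. The paper regularizes disparity and normal gradients jointly via the estimated plane parameters; the first-order form above is kept only for brevity.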
{"title":"PM-Huber: PatchMatch with Huber Regularization for Stereo Matching","authors":"Philipp Heise, S. Klose, B. Jensen, Alois Knoll","doi":"10.1109/ICCV.2013.293","DOIUrl":"https://doi.org/10.1109/ICCV.2013.293","url":null,"abstract":"Most stereo correspondence algorithms match support windows at integer-valued disparities and assume a constant disparity value within the support window. The recently proposed Patch Match stereo algorithm by Bleyer et al. overcomes this limitation of previous algorithms by directly estimating planes. This work presents a method that integrates the Patch Match stereo algorithm into a variational smoothing formulation using quadratic relaxation. The resulting algorithm allows the explicit regularization of the disparity and normal gradients using the estimated plane parameters. Evaluation of our method in the Middlebury benchmark shows that our method outperforms the traditional integer-valued disparity strategy as well as the original algorithm and its variants in sub-pixel accurate disparity estimation.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"45 1","pages":"2360-2367"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77540809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 160
Dynamic Probabilistic Volumetric Models
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.68
Ali O. Ulusoy, O. Biris, J. Mundy
This paper presents a probabilistic volumetric framework for image-based modeling of general dynamic 3-d scenes. The framework is targeted towards high-quality modeling of complex scenes evolving over thousands of frames. Extensive storage and computational resources are required in processing large-scale space-time (4-d) data. Existing methods typically store separate 3-d models at each time step and do not address such limitations. A novel 4-d representation is proposed that adaptively subdivides in space and time to explain the appearance of 3-d dynamic surfaces. This representation is shown to achieve compression of 4-d data and provide efficient spatio-temporal processing. The advantages of the proposed framework are demonstrated on standard datasets using free-viewpoint video and 3-d tracking applications.
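One way to picture the adaptive 4-d subdivision: each spatial cell keeps a binary tree over time that splits an interval only when the stored appearance stops explaining the observations. The toy class below is a hypothetical illustration of that idea, not the paper's actual data structure.

```python
class TimeNode:
    """Time interval [t0, t1) of one spatial cell, holding one appearance
    value. Splitting only where appearance changes compresses long static
    stretches of video into a single node."""

    def __init__(self, t0, t1, value):
        self.t0, self.t1, self.value = t0, t1, value
        self.left = self.right = None

    def insert(self, t, value, tol=0.1, min_len=1.0):
        if self.left is not None:                    # internal node: recurse
            mid = 0.5 * (self.t0 + self.t1)
            (self.left if t < mid else self.right).insert(t, value, tol, min_len)
        elif abs(value - self.value) <= tol:
            pass                                     # stable: keep one leaf
        elif self.t1 - self.t0 <= min_len:
            self.value = value                       # finest scale: overwrite
        else:
            mid = 0.5 * (self.t0 + self.t1)          # appearance changed: split
            self.left = TimeNode(self.t0, mid, self.value)
            self.right = TimeNode(mid, self.t1, self.value)
            self.insert(t, value, tol, min_len)

root = TimeNode(0.0, 1000.0, value=0.2)   # cell static for most of the video
root.insert(700.0, value=0.9)             # something moves through at t = 700
```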
{"title":"Dynamic Probabilistic Volumetric Models","authors":"Ali O. Ulusoy, O. Biris, J. Mundy","doi":"10.1109/ICCV.2013.68","DOIUrl":"https://doi.org/10.1109/ICCV.2013.68","url":null,"abstract":"This paper presents a probabilistic volumetric framework for image based modeling of general dynamic 3-d scenes. The framework is targeted towards high quality modeling of complex scenes evolving over thousands of frames. Extensive storage and computational resources are required in processing large scale space-time (4-d) data. Existing methods typically store separate 3-d models at each time step and do not address such limitations. A novel 4-d representation is proposed that adaptively subdivides in space and time to explain the appearance of 3-d dynamic surfaces. This representation is shown to achieve compression of 4-d data and provide efficient spatio-temporal processing. The advances of the proposed framework is demonstrated on standard datasets using free-viewpoint video and 3-d tracking applications.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"35 1","pages":"505-512"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81798680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
A Generalized Low-Rank Appearance Model for Spatio-temporally Correlated Rain Streaks
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.247
Yi-Lei Chen, Chiou-Ting Hsu
In this paper, we propose a novel low-rank appearance model for removing rain streaks. Different from previous work, our method needs neither a rain pixel detection step nor a time-consuming dictionary learning stage. Instead, as rain streaks usually reveal similar and repeated patterns in the imaged scene, we propose and generalize a low-rank model from matrix to tensor structure in order to capture the spatio-temporally correlated rain streaks. With the appearance model, we thus remove rain streaks from images/videos (and also other high-order image structures) in a unified way. Our experimental results demonstrate competitive (or even better) visual quality and efficient run-time in comparison with the state of the art.
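The matrix version of the low-rank appearance idea fits in a few lines: stack vectorized rain patches as columns and keep their best low-rank approximation as the rain layer. The paper's contribution is generalizing this to a tensor across space and time; the sketch below stays with the simpler matrix case and is our illustration, not the authors' code.

```python
import numpy as np

def rain_layer(patches, rank):
    """Best rank-`rank` approximation of a stack of vectorized patches.
    Rain streaks repeat across patches, so they concentrate in the leading
    singular directions; the residual keeps the non-rain detail.

    patches: (N, h, w) array of image patches
    """
    M = patches.reshape(patches.shape[0], -1).T        # one patch per column
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    low = (U[:, :rank] * s[:rank]) @ Vt[:rank]         # truncated SVD
    return low.T.reshape(patches.shape)

# Toy data: one shared streak pattern at varying intensity plus clutter.
rng = np.random.default_rng(4)
streak = rng.random((8, 8))
stack = streak[None] * rng.random((100, 1, 1)) + 0.05 * rng.random((100, 8, 8))
print(np.abs(stack - rain_layer(stack, rank=1)).mean())  # small residual
```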
{"title":"A Generalized Low-Rank Appearance Model for Spatio-temporally Correlated Rain Streaks","authors":"Yi-Lei Chen, Chiou-Ting Hsu","doi":"10.1109/ICCV.2013.247","DOIUrl":"https://doi.org/10.1109/ICCV.2013.247","url":null,"abstract":"In this paper, we propose a novel low-rank appearance model for removing rain streaks. Different from previous work, our method needs neither rain pixel detection nor time-consuming dictionary learning stage. Instead, as rain streaks usually reveal similar and repeated patterns on imaging scene, we propose and generalize a low-rank model from matrix to tensor structure in order to capture the spatio-temporally correlated rain streaks. With the appearance model, we thus remove rain streaks from image/video (and also other high-order image structure) in a unified way. Our experimental results demonstrate competitive (or even better) visual quality and efficient run-time in comparison with state of the art.","PeriodicalId":6351,"journal":{"name":"2013 IEEE International Conference on Computer Vision","volume":"15 1","pages":"1968-1975"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81825146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 364