Latest Publications: 2015 IEEE International Conference on Computer Vision (ICCV)

RGB-Guided Hyperspectral Image Upsampling
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.43
HyeokHyen Kwon, Yu-Wing Tai
Hyperspectral imaging usually lacks spatial resolution due to hardware design limitations of imaging sensors. In contrast, the latest imaging sensors capture an RGB image with a resolution multiple times larger than that of a hyperspectral image. In this paper, we present an algorithm to enhance and upsample the resolution of hyperspectral images. Our algorithm consists of two stages: a spatial upsampling stage and a spectrum substitution stage. The spatial upsampling stage is guided by a high-resolution RGB image of the same scene, and the spectrum substitution stage utilizes sparse coding to locally refine the upsampled hyperspectral image through dictionary substitution. Experiments show that our algorithm is highly effective and outperforms state-of-the-art matrix-factorization-based approaches.
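The first stage resembles guided (joint bilateral) upsampling, where RGB similarity in the high-resolution guide decides how low-resolution spectra are blended. A minimal numpy sketch of that idea, assuming the scale factor evenly divides the RGB dimensions; the weighting scheme is illustrative, not the authors' exact formulation:

```python
import numpy as np

def rgb_guided_upsample(hs_lr, rgb_hr, scale, sigma=0.1):
    """Upsample a low-res hyperspectral cube guided by a high-res RGB image.

    Each high-res pixel's spectrum is a weighted average of the spectra of
    nearby low-res pixels, with weights from RGB similarity (a joint
    bilateral-style scheme; illustrative, not the paper's exact method).
    """
    h_lr, w_lr, bands = hs_lr.shape
    h_hr, w_hr, _ = rgb_hr.shape
    # Subsampled guide image, aligned with the low-res hyperspectral grid.
    rgb_lr = rgb_hr[::scale, ::scale]
    out = np.zeros((h_hr, w_hr, bands))
    for y in range(h_hr):
        for x in range(w_hr):
            yl, xl = y // scale, x // scale
            # 3x3 neighbourhood in the low-res grid, clipped at the borders.
            ys = slice(max(yl - 1, 0), min(yl + 2, h_lr))
            xs = slice(max(xl - 1, 0), min(xl + 2, w_lr))
            diff = rgb_lr[ys, xs] - rgb_hr[y, x]
            w = np.exp(-np.sum(diff**2, axis=-1) / (2 * sigma**2))
            w /= w.sum()
            # Blend the neighbourhood spectra with the normalised weights.
            out[y, x] = np.einsum('ij,ijb->b', w, hs_lr[ys, xs])
    return out
```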
Pages: 307-315
Citations: 49
Contractive Rectifier Networks for Nonlinear Maximum Margin Classification
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.289
S. An, Munawar Hayat, S. H. Khan, Bennamoun, F. Boussaïd, Ferdous Sohel
To find the optimal nonlinear separating boundary with maximum margin in the input data space, this paper proposes Contractive Rectifier Networks (CRNs), wherein the hidden-layer transformations are restricted to be contraction mappings. The contractive constraints ensure that the achieved separating margin in the input space is larger than or equal to the separating margin in the output layer. The training of the proposed CRNs is formulated as a linear support vector machine (SVM) in the output layer, combined with two or more contractive hidden layers. Effective algorithms have been proposed to address the optimization challenges arising from contraction constraints. Experimental results on MNIST, CIFAR-10, CIFAR-100 and MIT-67 datasets demonstrate that the proposed contractive rectifier networks consistently outperform their conventional unconstrained rectifier network counterparts.
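The contraction property can be checked numerically: a ReLU layer whose weight matrix has spectral norm at most 1 is 1-Lipschitz (ReLU itself is 1-Lipschitz per coordinate), so output-space distances, and hence margins, never exceed input-space ones. A minimal numpy sketch (illustrative only, not the paper's training procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

def contractive_relu_layer(W):
    """Return f(x) = relu(W x). If the spectral norm of W is <= 1, f is a
    contraction (1-Lipschitz), since ReLU is itself 1-Lipschitz."""
    def f(x):
        return np.maximum(W @ x, 0.0)
    return f

# Random weight matrix rescaled to spectral norm exactly 1.
W = rng.standard_normal((8, 8))
W /= np.linalg.norm(W, 2)  # ord=2 on a matrix = largest singular value
f = contractive_relu_layer(W)

# Empirically verify ||f(x) - f(y)|| <= ||x - y|| on random pairs.
for _ in range(100):
    x, y = rng.standard_normal(8), rng.standard_normal(8)
    assert np.linalg.norm(f(x) - f(y)) <= np.linalg.norm(x - y) + 1e-9
```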
Pages: 2515-2523
Citations: 11
Towards Pointless Structure from Motion: 3D Reconstruction and Camera Parameters from General 3D Curves
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.272
Irina Nurutdinova, A. Fitzgibbon
Modern structure from motion (SfM) remains dependent on point features to recover camera positions, meaning that reconstruction is severely hampered in low-texture environments, for example when scanning a plain coffee cup on an uncluttered table. We show how 3D curves can be used to refine camera position estimation in challenging low-texture scenes. In contrast to previous work, we allow the curves to be partially observed in all images, meaning that, for the first time, curve-based SfM can be demonstrated in realistic scenes. The algorithm is based on bundle adjustment, so it needs an initial estimate, but even a poor estimate from a few point correspondences can be substantially improved by including curves, suggesting that this method would benefit many existing systems.
Pages: 2363-2371
Citations: 39
Multiple Hypothesis Tracking Revisited
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.533
Chanho Kim, Fuxin Li, A. Ciptadi, James M. Rehg
This paper revisits the classical multiple hypotheses tracking (MHT) algorithm in a tracking-by-detection framework. The success of MHT largely depends on the ability to maintain a small list of potential hypotheses, which can be facilitated with the accurate object detectors that are currently available. We demonstrate that a classical MHT implementation from the 90's can come surprisingly close to the performance of state-of-the-art methods on standard benchmark datasets. In order to further utilize the strength of MHT in exploiting higher-order information, we introduce a method for training online appearance models for each track hypothesis. We show that appearance models can be learned efficiently via a regularized least squares framework, requiring only a few extra operations for each hypothesis branch. We obtain state-of-the-art results on popular tracking-by-detection datasets such as PETS and the recent MOT challenge.
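The regularized least squares appearance model can be maintained online with a handful of operations per new detection: keep the Gram matrix and the target vector, and solve for the weights on demand. A hedged sketch, assuming simple per-detection feature vectors and a scalar target score (the paper's exact features and update rules may differ):

```python
import numpy as np

class OnlineRLS:
    """Online regularized least squares appearance model (illustrative).

    Maintains A = X^T X + lam * I and b = X^T y, so each new detection's
    feature vector updates the model in O(d^2) and scoring solves a small
    d x d linear system."""

    def __init__(self, dim, lam=1.0):
        self.A = lam * np.eye(dim)   # regularised Gram matrix
        self.b = np.zeros(dim)       # accumulated X^T y

    def update(self, x, y):
        """Fold one (feature, target) observation into the model."""
        self.A += np.outer(x, x)
        self.b += y * x

    def score(self, x):
        """Affinity of a new detection under the current ridge solution."""
        w = np.linalg.solve(self.A, self.b)
        return float(w @ x)
```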
Pages: 4696-4704
Citations: 571
Airborne Three-Dimensional Cloud Tomography
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.386
Aviad Levis, Y. Schechner, Amit Aides, A. Davis
We seek to sense the three-dimensional (3D) volumetric distribution of scatterers in a heterogeneous medium. An important case study for such a medium is the atmosphere. Atmospheric contents and their role in Earth's radiation balance have significant uncertainties with regard to scattering components: aerosols and clouds. Clouds, made of water droplets, also lead to local effects such as precipitation and shadows. Our sensing approach is computational tomography using passive multi-angular imagery. For light-matter interaction that accounts for multiple scattering, we use the 3D radiative transfer equation as a forward model. Volumetric recovery by inverting this model suffers from a computational bottleneck on large scales, which involve many unknowns. The steps we take make this tomography tractable, without approximating the scattering order or angle range.
Pages: 3379-3387
Citations: 69
The HCI Stereo Metrics: Geometry-Aware Performance Analysis of Stereo Algorithms
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.245
Katrin Honauer, L. Maier-Hein, D. Kondermann
Performance characterization of stereo methods is mandatory to decide which algorithm is useful for which application. Prevalent benchmarks mainly use the root mean squared error (RMS) with respect to ground truth disparity maps to quantify algorithm performance. We show that the RMS is of limited expressiveness for algorithm selection and introduce the HCI Stereo Metrics. These metrics assess stereo results by harnessing three semantic cues: depth discontinuities, planar surfaces, and fine geometric structures. For each cue, we extract the relevant set of pixels from existing ground truth. We then apply our evaluation functions to quantify characteristics such as edge fattening and surface smoothness. We demonstrate that our approach supports practitioners in selecting the most suitable algorithm for their application. Using the new Middlebury dataset, we show that rankings based on our metrics reveal specific algorithm strengths and weaknesses which are not quantified by existing metrics. We finally show how stacked bar charts and radar charts visually support multidimensional performance evaluation. An interactive stereo benchmark based on the proposed metrics and visualizations is available at: http://hci.iwr.uni-heidelberg.de/stereometrics.
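The motivation for cue-specific metrics is easy to reproduce: a large but localized error, e.g. edge fattening at a depth discontinuity, is diluted in the overall RMS yet stands out once the error is restricted to the relevant pixel set. A toy illustration (the masks and values are invented for the example, not taken from the benchmark):

```python
import numpy as np

def rms(disp, gt, mask=None):
    """Root-mean-squared disparity error, optionally restricted to a mask
    (e.g. pixels near depth discontinuities)."""
    err = disp - gt
    if mask is not None:
        err = err[mask]
    return float(np.sqrt(np.mean(err**2)))

# Toy scene: a disparity step (depth edge) at column 5.
gt = np.zeros((10, 10))
gt[:, 5:] = 8.0

# Estimate that is perfect except for "edge fattening" around the edge.
est = gt.copy()
est[:, 4:6] += 2.0

edge_mask = np.zeros_like(gt, bool)
edge_mask[:, 4:6] = True

overall = rms(est, gt)              # diluted by the many correct pixels
at_edges = rms(est, gt, edge_mask)  # isolates the discontinuity error
```

Here `overall` is about 0.89 while `at_edges` is 2.0: the same estimate looks acceptable globally but clearly weak at depth discontinuities.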
Pages: 2120-2128
Citations: 21
Query Adaptive Similarity Measure for RGB-D Object Recognition
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.25
Yanhua Cheng, Rui Cai, Chi Zhang, Zhiwei Li, Xin Zhao, Kaiqi Huang, Y. Rui
This paper studies the problem of improving the top-1 accuracy of RGB-D object recognition. Despite the impressive top-5 accuracies achieved by existing methods, their top-1 accuracies are not very satisfactory. The reasons are two-fold: (1) existing similarity measures are sensitive to object pose and scale changes, as well as intra-class variations, and (2) effectively fusing RGB and depth cues is still an open problem. To address these problems, this paper first proposes a new similarity measure based on dense matching, through which objects under comparison are warped and aligned to better tolerate variations. Regarding RGB and depth fusion, we argue that a constant, golden weight doesn't exist. The two modalities make varying contributions when comparing objects from different categories. To capture this dynamic characteristic, a group of matchers equipped with various fusion weights is constructed to explore the responses of dense matching under different fusion configurations. All the response scores are finally merged following a learned combination scheme, which provides quite good generalization ability in practice. The proposed approach achieves the best results on several public benchmarks, e.g., 92.7% top-1 test accuracy on the Washington RGB-D object dataset, a 5.1% improvement over the state of the art.
Pages: 145-153
Citations: 12
Motion Trajectory Segmentation via Minimum Cost Multicuts
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.374
M. Keuper, Bjoern Andres, T. Brox
For the segmentation of moving objects in videos, the analysis of long-term point trajectories has recently become very popular. In this paper, we formulate the segmentation of a video sequence based on point trajectories as a minimum cost multicut problem. Unlike the commonly used spectral clustering formulation, the minimum cost multicut formulation naturally gives rise to optimizing not only the cluster assignment but also the number of clusters, while allowing for varying cluster sizes. In this setup, we provide a method to create a long-term point trajectory graph with attractive and repulsive binary terms, and we outperform state-of-the-art methods based on spectral clustering on the FBMS-59 dataset and on the motion subtask of the VSB100 dataset.
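The multicut objective itself is simple to state: the cost of a decomposition is the sum of the weights of the edges whose endpoints land in different components, so attractive (positive-weight) edges penalize cutting and repulsive (negative-weight) edges reward it. A toy sketch that brute-forces the optimum on a four-node graph; the graph and weights are invented for illustration, and real multicut solvers do not enumerate labelings:

```python
import itertools

def multicut_cost(edges, labels):
    """Cost of a decomposition under the minimum cost multicut objective:
    the sum of weights of edges whose endpoints lie in different
    components. Positive weights are attractive (cutting them costs),
    negative weights are repulsive (cutting them pays off)."""
    return sum(w for (u, v, w) in edges if labels[u] != labels[v])

# Tiny trajectory-affinity graph: two tight pairs joined by a repulsive edge.
edges = [(0, 1, 2.0), (2, 3, 2.0), (1, 2, -1.5)]

# Brute-force all 2-labelings of the 4 nodes; note the number of clusters
# is not fixed in advance, it falls out of the optimal labeling.
best = min(itertools.product([0, 1], repeat=4),
           key=lambda lab: multicut_cost(edges, lab))
```

The optimum keeps each attractive pair together and cuts only the repulsive edge, at cost -1.5.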
Pages: 3271-3279
Citations: 188
Robust Image Segmentation Using Contour-Guided Color Palettes
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.189
Xiang Fu, Chien-Yi Wang, Chen Chen, Changhu Wang, C.-C. Jay Kuo
The contour-guided color palette (CCP) is proposed for robust image segmentation. It efficiently integrates the contour and color cues of an image. To find representative colors of an image, color samples along long contours between regions, similar in spirit to machine learning methodologies that focus on samples near decision boundaries, are collected, followed by the mean-shift (MS) algorithm in the sampled color space to obtain an image-dependent color palette. This color palette provides a preliminary segmentation in the spatial domain, which is further fine-tuned by post-processing techniques such as leakage avoidance, fake boundary removal, and small-region merging. The segmentation performances of CCP and MS are compared and analyzed. While CCP offers an acceptable standalone segmentation result, it can be further integrated into the framework of layered spectral segmentation to produce a more robust segmentation. The superior performance of the CCP-based segmentation algorithm is demonstrated by experiments on the Berkeley Segmentation Dataset.
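The palette-construction step, mean-shift over the sampled colors, can be sketched with a small implementation in which every sample ascends the fixed kernel density estimate; converged points closer than the bandwidth are merged into one palette color. This is an illustrative sketch, not the paper's exact procedure, and the bandwidth value is an assumption:

```python
import numpy as np

def mean_shift_modes(samples, bandwidth=0.1, iters=50):
    """Find modes of the color-sample density via mean-shift; the distinct
    modes form an image-dependent color palette (illustrative sketch)."""
    pts = samples.copy()
    for _ in range(iters):
        for i, p in enumerate(pts):
            # Gaussian-weighted mean of all samples around the current point.
            d2 = np.sum((samples - p) ** 2, axis=1)
            w = np.exp(-d2 / (2 * bandwidth**2))
            pts[i] = (w[:, None] * samples).sum(0) / w.sum()
    # Merge converged points that landed on the same mode.
    palette = []
    for p in pts:
        if not any(np.linalg.norm(p - q) < bandwidth for q in palette):
            palette.append(p)
    return np.array(palette)
```

In the CCP pipeline, `samples` would be the RGB colors collected along long region contours rather than all image pixels.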
Pages: 1618-1625
Citations: 25
Opening the Black Box: Hierarchical Sampling Optimization for Estimating Human Hand Pose
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.380
Danhang Tang, Jonathan Taylor, Pushmeet Kohli, Cem Keskin, Tae-Kyun Kim, J. Shotton
We address the problem of hand pose estimation, formulated as an inverse problem. Typical approaches optimize an energy function over pose parameters using a 'black box' image generation procedure. This procedure knows little about either the relationships between the parameters or the form of the energy function. In this paper, we show that we can significantly improve upon black box optimization by exploiting high-level knowledge of the structure of the parameters and using a local surrogate energy function. Our new framework, hierarchical sampling optimization, consists of a sequence of predictors organized into a kinematic hierarchy. Each predictor is conditioned on its ancestors and generates a set of samples over a subset of the pose parameters. The highly efficient surrogate energy is used to select among samples. Having evaluated the full hierarchy, the partial pose samples are concatenated to generate a full-pose hypothesis. Several hypotheses are generated using the same procedure, and finally the original full energy function selects the best result. Experimental evaluation on three publicly available datasets shows that our method is particularly impressive in low-compute scenarios, where it significantly outperforms all other state-of-the-art methods.
Pages: 3325-3333
Citations: 144