
Latest publications from the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Unconstrained Face Alignment via Cascaded Compositional Learning
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.371
Shizhan Zhu, Cheng Li, Chen Change Loy, Xiaoou Tang
We present a practical approach to address the problem of unconstrained face alignment for a single image. In our unconstrained problem, we need to deal with large shape and appearance variations under extreme head poses and rich shape deformation. To equip cascaded regressors with the capability to handle global shape variation and irregular appearance-shape relation in the unconstrained scenario, we partition the optimisation space into multiple domains of homogeneous descent, and predict a shape as a composition of estimations from multiple domain-specific regressors. With a specially formulated learning objective and a novel tree splitting function, our approach is capable of estimating a robust and meaningful composition. In addition to achieving state-of-the-art accuracy over existing approaches, our framework is also an efficient solution (350 FPS), thanks to the on-the-fly domain exclusion mechanism and the capability of leveraging the fast pixel feature.
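The compositional prediction step can be made concrete with a short sketch. This is not the authors' implementation: the feature vector, the linear regressors, and the domain weights below are hypothetical stand-ins for the domain-specific regressors and the soft domain memberships the abstract describes.

```python
import numpy as np

def compose_shape_update(features, regressors, domain_weights):
    """Predict a shape update as a weighted composition of estimates from
    several domain-specific regressors (one per domain of homogeneous descent).

    features       : 1-D feature vector extracted around the current shape estimate
    regressors     : list of callables, each mapping features -> shape update
    domain_weights : soft domain memberships; a zero weight excludes a domain,
                     loosely mimicking the on-the-fly domain exclusion step
    """
    weights = np.asarray(domain_weights, dtype=float)
    weights = weights / weights.sum()                    # normalise the composition
    update = np.zeros_like(regressors[0](features))
    for w, reg in zip(weights, regressors):
        if w > 0.0:                                      # skip excluded domains
            update = update + w * reg(features)
    return update

# Toy usage: 3 hypothetical linear regressors, 5 landmarks (x, y) = 10 outputs.
rng = np.random.default_rng(0)
regs = [lambda f, W=rng.standard_normal((10, 128)): W @ f for _ in range(3)]
print(compose_shape_update(rng.standard_normal(128), regs, [0.6, 0.4, 0.0]).shape)  # (10,)
```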
Citations: 173
Learned Binary Spectral Shape Descriptor for 3D Shape Correspondence
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.360
J. Xie, M. Wang, Yi Fang
Dense 3D shape correspondence is an important problem in computer vision and computer graphics. Recently, the local shape descriptor based 3D shape correspondence approaches have been widely studied, where the local shape descriptor is a real-valued vector to characterize the geometrical structure of the shape. Different from these real-valued local shape descriptors, in this paper, we propose to learn a novel binary spectral shape descriptor with the deep neural network for 3D shape correspondence. The binary spectral shape descriptor can require less storage space and enable fast matching. First, based on the eigenvectors of the Laplace-Beltrami operator, we construct a neural network to form a nonlinear spectral representation to characterize the shape. Then, for the defined positive and negative points on the shapes, we train the constructed neural network by minimizing the errors between the outputs and their corresponding binary descriptors, minimizing the variations of the outputs of the positive points and maximizing the variations of the outputs of the negative points, simultaneously. Finally, we binarize the output of the neural network to form the binary spectral shape descriptor for shape correspondence. The proposed binary spectral shape descriptor is evaluated on the SCAPE and TOSCA 3D shape datasets for shape correspondence. The experimental results demonstrate the effectiveness of the proposed binary shape descriptor for the shape correspondence task.
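A minimal sketch of the three loss terms listed above and the final binarisation step, assuming generic (N, D) network outputs for the positive and negative point sets; the weights alpha and beta and the 0.5 threshold are illustrative choices, not the paper's.

```python
import numpy as np

def binary_spectral_loss(out_pos, out_neg, target_bits, alpha=1.0, beta=1.0):
    """Illustrative objective with the three terms the abstract lists:
    (1) error between outputs and their target binary descriptors,
    (2) small output variation across positive (corresponding) points,
    (3) large output variation across negative (non-corresponding) points.

    out_pos, out_neg : (N, D) network outputs for positive / negative points
    target_bits      : (N, D) target binary codes in {0, 1} for the positive points
    """
    recon = np.mean((out_pos - target_bits) ** 2)        # term (1), minimised
    pos_var = np.mean(np.var(out_pos, axis=0))           # term (2), minimised
    neg_var = np.mean(np.var(out_neg, axis=0))           # term (3), maximised
    return recon + alpha * pos_var - beta * neg_var

def binarize(outputs, threshold=0.5):
    """Threshold the trained network's outputs into the final binary descriptor."""
    return (outputs >= threshold).astype(np.uint8)

# Toy usage on random (N=6, D=8) outputs.
rng = np.random.default_rng(0)
loss = binary_spectral_loss(rng.random((6, 8)), rng.random((6, 8)),
                            rng.integers(0, 2, (6, 8)))
codes = binarize(rng.random((6, 8)))
```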
Citations: 23
A Deeper Look at Saliency: Feature Contrast, Semantics, and Beyond
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.62
Neil D. B. Bruce, Christopher Catton, Sasa Janjic
In this paper we consider the problem of visual saliency modeling, including both human gaze prediction and salient object segmentation. The overarching goal of the paper is to identify high level considerations relevant to deriving more sophisticated visual saliency models. A deep learning model based on fully convolutional networks (FCNs) is presented, which shows very favorable performance across a wide variety of benchmarks relative to existing proposals. We also demonstrate that the manner in which training data is selected, and ground truth treated is critical to resulting model behaviour. Recent efforts have explored the relationship between human gaze and salient objects, and we also examine this point further in the context of FCNs. Close examination of the proposed and alternative models serves as a vehicle for identifying problems important to developing more comprehensive models going forward.
Citations: 55
Video Segmentation via Object Flow
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.423
Yi-Hsuan Tsai, Ming-Hsuan Yang, Michael J. Black
Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds. Optical flow can be used to propagate an object segmentation over time but, unfortunately, flow is often inaccurate, particularly around object boundaries. Such boundaries are precisely where we want our segmentation to be accurate. To obtain accurate segmentation across time, we propose an efficient algorithm that considers video segmentation and optical flow estimation simultaneously. For video segmentation, we formulate a principled, multiscale, spatio-temporal objective function that uses optical flow to propagate information between frames. For optical flow estimation, particularly at object boundaries, we compute the flow independently in the segmented regions and recompose the results. We call the process object flow and demonstrate the effectiveness of jointly optimizing optical flow and video segmentation using an iterative scheme. Experiments on the SegTrack v2 and Youtube-Objects datasets show that the proposed algorithm performs favorably against the other state-of-the-art methods.
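The per-region flow recomposition and the mask propagation it feeds can be sketched as follows. Here `flow_fn` stands in for an off-the-shelf optical flow solver and the nearest-pixel warp is a deliberately crude placeholder; in the full method these updates would alternate with the multiscale spatio-temporal segmentation objective.

```python
import numpy as np

def recompose_flow(flow_fn, mask, frame0, frame1):
    """Estimate flow separately inside and outside the object mask, then stitch
    the two fields back together (the per-region flow step described above).

    flow_fn : callable (img0, img1, region_mask) -> (H, W, 2) flow; a real system
              would plug in an off-the-shelf optical flow solver here
    mask    : (H, W) boolean object segmentation for frame0
    """
    fg = flow_fn(frame0, frame1, mask)
    bg = flow_fn(frame0, frame1, ~mask)
    return np.where(mask[..., None], fg, bg)

def propagate_mask(mask, flow):
    """Warp the segmentation forward with the estimated flow (nearest pixel)."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    ys2 = np.clip(np.round(ys + flow[ys, xs, 1]).astype(int), 0, h - 1)
    xs2 = np.clip(np.round(xs + flow[ys, xs, 0]).astype(int), 0, w - 1)
    out = np.zeros_like(mask)
    out[ys2, xs2] = True
    return out

# Toy usage: constant (1, 0) flow, 8x8 frames, 2x2 object.
const_flow = lambda a, b, m: np.tile(np.array([1.0, 0.0]), (*a.shape, 1))
m = np.zeros((8, 8), dtype=bool); m[3:5, 3:5] = True
f = np.zeros((8, 8))
print(propagate_mask(m, recompose_flow(const_flow, m, f, f)).sum())  # 4
```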
Citations: 325
One-Shot Learning of Scene Locations via Feature Trajectory Transfer
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.16
R. Kwitt, S. Hegenbart, M. Niethammer
The appearance of (outdoor) scenes changes considerably with the strength of certain transient attributes, such as "rainy", "dark" or "sunny". Obviously, this also affects the representation of an image in feature space, e.g., as activations at a certain CNN layer, and consequently impacts scene recognition performance. In this work, we investigate the variability in these transient attributes as a rich source of information for studying how image representations change as a function of attribute strength. In particular, we leverage a recently introduced dataset with fine-grain annotations to estimate feature trajectories for a collection of transient attributes and then show how these trajectories can be transferred to new image representations. This enables us to synthesize new data along the transferred trajectories with respect to the dimensions of the space spanned by the transient attributes. Applicability of this concept is demonstrated on the problem of one-shot recognition of scene locations. We show that data synthesized via feature trajectory transfer considerably boosts recognition performance, (1) with respect to baselines and (2) in combination with state-of-the-art approaches in one-shot learning.
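The trajectory-transfer idea can be illustrated with a toy sketch: fit per-dimension polynomial trajectories of features against attribute strength, then move a new image's feature vector along the fitted trajectory to synthesize data at a different strength. The polynomial degree and the hypothetical 4-D features are assumptions for illustration only, not the paper's model.

```python
import numpy as np

def fit_trajectories(feats, strengths, degree=2):
    """Fit a per-dimension polynomial trajectory of feature activations as a
    function of a transient attribute's strength (e.g. 'sunny' in [0, 1]).

    feats     : (M, D) features of M images of the same scene
    strengths : (M,) annotated attribute strengths for those images
    """
    return np.polyfit(strengths, feats, degree)           # (degree + 1, D)

def eval_trajectories(coeffs, s):
    """Evaluate all D fitted polynomials at attribute strength s."""
    powers = s ** np.arange(coeffs.shape[0] - 1, -1, -1)  # [s^d, ..., s, 1]
    return powers @ coeffs

def transfer(base_feat, coeffs, s_from, s_to):
    """Move a new image's feature vector along the transferred trajectory,
    synthesizing a sample at a different attribute strength."""
    return base_feat + eval_trajectories(coeffs, s_to) - eval_trajectories(coeffs, s_from)

# Toy usage: hypothetical 4-D features observed at 5 attribute strengths.
rng = np.random.default_rng(1)
strengths = np.linspace(0.0, 1.0, 5)
feats = np.outer(strengths, rng.standard_normal(4)) + 0.01 * rng.standard_normal((5, 4))
coeffs = fit_trajectories(feats, strengths)
synth = transfer(feats[0], coeffs, s_from=0.0, s_to=1.0)
```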
Citations: 45
You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.106
Chuang Gan, Ting Yao, Kuiyuan Yang, Yi Yang, Tao Mei
Video concept learning often requires a large set of training samples. In practice, however, acquiring noise-free training labels with sufficient positive examples is very expensive. A plausible solution for training data collection is by sampling from the vast quantities of images and videos on the Web. Such a solution is motivated by the assumption that the retrieved images or videos are highly correlated with the query. Still, a number of challenges remain. First, Web videos are often untrimmed. Thus, only parts of the videos are relevant to the query. Second, the retrieved Web images are always highly relevant to the issued query. However, thoughtlessly utilizing the images in the video domain may even hurt the performance due to the well-known semantic drift and domain gap problems. As a result, a valid question is how Web images and videos interact for video concept learning. In this paper, we propose a Lead-Exceed Neural Network (LENN), which reinforces the training on Web images and videos in a curriculum manner. Specifically, the training proceeds by inputting frames of Web videos to obtain a network. The Web images are then filtered by the learnt network and the selected images are additionally fed into the network to enhance the architecture and further trim the videos. In addition, Long Short-Term Memory (LSTM) can be applied on the trimmed videos to explore temporal information. Encouraging results are reported on UCF101, TRECVID 2013 and 2014 MEDTest in the context of both action recognition and event detection. Without using human annotated exemplars, our proposed LENN can achieve 74.4% accuracy on the UCF101 dataset.
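A hedged sketch of the curriculum selection loop only (not the LENN architecture, the LSTM, or the video trimming): train on video frames first, let the learnt model score the retrieved web images, and keep only the most confident ones for retraining. `train_fn`, `score_fn`, and `keep_ratio` are placeholders operating on pre-extracted feature arrays.

```python
import numpy as np

def curriculum_round(train_fn, score_fn, frame_feats, frame_labels,
                     web_feats, web_labels, keep_ratio=0.5):
    """One curriculum round: learn from video frames, filter the web images with
    the learnt model, and retrain on frames plus the most confident web images.

    train_fn   : callable (X, y) -> model
    score_fn   : callable (model, X, y) -> per-sample confidence that an image
                 really depicts its queried concept
    keep_ratio : fraction of web images retained in this round
    """
    model = train_fn(frame_feats, frame_labels)
    conf = score_fn(model, web_feats, web_labels)
    keep = np.argsort(conf)[::-1][: int(keep_ratio * len(conf))]
    X = np.concatenate([frame_feats, web_feats[keep]])
    y = np.concatenate([frame_labels, web_labels[keep]])
    return train_fn(X, y), keep
```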
Citations: 108
In Defense of Sparse Tracking: Circulant Sparse Tracker
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.421
Tianzhu Zhang, Adel Bibi, Bernard Ghanem
Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework. However, most sparse representation based trackers have high computational cost, less than promising tracking performance, and limited feature representation. To deal with the above issues, we propose a novel circulant sparse tracker (CST), which exploits circulant target templates. Because of the circulant structure property, CST has the following advantages: (1) It can refine and reduce particles using circular shifts of target templates. (2) The optimization can be efficiently solved entirely in the Fourier domain. (3) High dimensional features can be embedded into CST to significantly improve tracking performance without sacrificing much computation time. Both qualitative and quantitative evaluations on challenging benchmark sequences demonstrate that CST performs better than all other sparse trackers and favorably against state-of-the-art methods.
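The circulant structure the tracker exploits is easy to illustrate: stacking all circular shifts of a 1-D template yields a circulant matrix, and products with that matrix can be computed entirely in the Fourier domain via the circular convolution theorem. This is a toy check of that property, not the tracker itself.

```python
import numpy as np

def circulant_from_shifts(template):
    """Circulant matrix whose columns are all circular shifts of a 1-D target
    template (the dense set of shifted samples the tracker reasons about)."""
    n = template.size
    return np.stack([np.roll(template, k) for k in range(n)], axis=1)

def circulant_matvec(template, x):
    """Multiply by that circulant matrix without ever forming it, entirely in
    the Fourier domain (circular convolution theorem)."""
    return np.real(np.fft.ifft(np.fft.fft(template) * np.fft.fft(x)))

t = np.array([1.0, 2.0, 3.0, 4.0])
x = np.array([0.5, -1.0, 0.0, 2.0])
print(np.allclose(circulant_from_shifts(t) @ x, circulant_matvec(t, x)))  # True
```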
Citations: 135
Robust Visual Place Recognition with Graph Kernels
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.491
E. Stumm, Christopher Mei, S. Lacroix, Juan I. Nieto, M. Hutter, R. Siegwart
A novel method for visual place recognition is introduced and evaluated, demonstrating robustness to perceptual aliasing and observation noise. This is achieved by increasing discrimination through a more structured representation of visual observations. Estimation of observation likelihoods is based on graph kernel formulations, utilizing both the structural and visual information encoded in covisibility graphs. The proposed probabilistic model is able to circumvent the typically difficult and expensive posterior normalization procedure by exploiting the information available in visual observations. Furthermore, the place recognition complexity is independent of the size of the map. Results show improvements over the state of the art on a diverse set of both public datasets and novel experiments, highlighting the benefit of the approach.
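The paper's kernel operates on covisibility graphs and their visual information; as a generic illustration of what a graph kernel computes on labelled graphs, here is a small Weisfeiler-Lehman subtree kernel, which is a different and much simpler formulation than the one proposed. Node labels, adjacency matrices, and the number of refinement iterations below are toy assumptions.

```python
import numpy as np
from collections import Counter

def wl_relabel(labels, adjacency):
    """One Weisfeiler-Lehman refinement: a node's new label combines its own
    label with the sorted labels of its neighbours."""
    new = []
    for i, lab in enumerate(labels):
        neigh = sorted(labels[j] for j in np.nonzero(adjacency[i])[0])
        new.append(str((lab, neigh)))
    return new

def wl_subtree_kernel(labels1, adj1, labels2, adj2, iterations=2):
    """Weisfeiler-Lehman subtree kernel between two labelled graphs: at every
    refinement level, count label matches between the two graphs."""
    l1, l2 = [str(x) for x in labels1], [str(x) for x in labels2]
    k = 0.0
    for _ in range(iterations + 1):
        c1, c2 = Counter(l1), Counter(l2)
        k += sum(c1[lab] * c2[lab] for lab in c1.keys() & c2.keys())
        l1, l2 = wl_relabel(l1, adj1), wl_relabel(l2, adj2)
    return k

# Toy usage: two 3-node graphs with visual-word labels on the nodes.
A1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
print(wl_subtree_kernel(["a", "b", "b"], A1, ["a", "b", "b"], A2))
```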
Citations: 47
Large-Pose Face Alignment via CNN-Based Dense 3D Model Fitting
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.454
Amin Jourabloo, Xiaoming Liu
Large-pose face alignment is a very challenging problem in computer vision, which is used as a prerequisite for many important vision tasks, e.g., face recognition and 3D face reconstruction. Recently, there have been a few attempts to solve this problem, but still more research is needed to achieve highly accurate results. In this paper, we propose a face alignment method for large-pose face images, by combining the powerful cascaded CNN regressor method and 3DMM. We formulate the face alignment as a 3DMM fitting problem, where the camera projection matrix and 3D shape parameters are estimated by a cascade of CNN-based regressors. The dense 3D shape allows us to design pose-invariant appearance features for effective CNN learning. Extensive experiments are conducted on the challenging databases (AFLW and AFW), with comparison to the state of the art.
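The quantities being fitted can be sketched in a few lines: a linear 3D morphable model synthesizes a dense shape from shape parameters, and a camera matrix projects it to 2D; in the paper a cascade of CNN regressors refines these parameters stage by stage. All dimensions below are toy values, not the paper's model.

```python
import numpy as np

def morphable_shape(mean_shape, shape_basis, shape_params):
    """Dense 3D face shape from a linear 3D morphable model:
    S = S_mean + basis @ params, reshaped to (N, 3) vertices."""
    s = mean_shape + shape_basis @ shape_params
    return s.reshape(-1, 3)

def project(vertices, camera):
    """Project 3D vertices to 2D with a 2x4 weak-perspective camera matrix
    applied to homogeneous coordinates."""
    homo = np.hstack([vertices, np.ones((len(vertices), 1))])   # (N, 4)
    return homo @ camera.T                                      # (N, 2)

# Toy usage: 5 vertices, 4 shape components, identity-like camera; a cascade of
# CNN regressors would refine `params` and `camera` stage by stage.
rng = np.random.default_rng(2)
mean = rng.standard_normal(15)
basis = rng.standard_normal((15, 4))
params = np.zeros(4)
camera = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
landmarks2d = project(morphable_shape(mean, basis, params), camera)
print(landmarks2d.shape)   # (5, 2)
```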
Citations: 304
Contour Detection in Unstructured 3D Point Clouds
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.178
Timo Hackel, J. D. Wegner, K. Schindler
We describe a method to automatically detect contours, i.e. lines along which the surface orientation sharply changes, in large-scale outdoor point clouds. Contours are important intermediate features for structuring point clouds and converting them into high-quality surface or solid models, and are extensively used in graphics and mapping applications. Yet, detecting them in unstructured, inhomogeneous point clouds turns out to be surprisingly difficult, and existing line detection algorithms largely fail. We approach contour extraction as a two-stage discriminative learning problem. In the first stage, a contour score for each individual point is predicted with a binary classifier, using a set of features extracted from the point's neighborhood. The contour scores serve as a basis to construct an overcomplete graph of candidate contours. The second stage selects an optimal set of contours from the candidates. This amounts to a further binary classification in a higher-order MRF, whose cliques encode a preference for connected contours and penalize loose ends. The method can handle point clouds > 10^7 points in a couple of minutes, and vastly outperforms a baseline that performs Canny-style edge detection on a range image representation of the point cloud.
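A sketch of the first stage only: covariance-eigenvalue features for a point's neighbourhood and a per-point contour score from a binary classifier. The specific features, the neighbourhood size, and the sklearn-style `predict_proba` classifier are illustrative assumptions, not the paper's exact feature set or learner; the scores would then seed the candidate contour graph described above.

```python
import numpy as np

def neighborhood_features(points, idx, k=16):
    """Covariance-based features for one point, computed from its k nearest
    neighbours: contrasts of the sorted eigenvalues indicate how line-like,
    surface-like, or scattered the local neighbourhood is."""
    d = np.linalg.norm(points - points[idx], axis=1)
    neigh = points[np.argsort(d)[:k]]
    cov = np.cov(neigh.T)
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]       # l1 >= l2 >= l3
    l1, l2, l3 = evals / (evals.sum() + 1e-12)
    return np.array([l1 - l2, l2 - l3, l3])

def contour_scores(points, classifier, k=16):
    """Per-point contour score from a trained binary classifier exposing an
    sklearn-style predict_proba; points is an (N, 3) array."""
    feats = np.stack([neighborhood_features(points, i, k) for i in range(len(points))])
    return classifier.predict_proba(feats)[:, 1]
```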
Citations: 146