
Latest publications from the 2015 IEEE International Conference on Computer Vision (ICCV)

Joint Camera Clustering and Surface Segmentation for Large-Scale Multi-view Stereo
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.241
Runze Zhang, Shiwei Li, Tian Fang, Siyu Zhu, Long Quan
In this paper, we propose an optimal decomposition approach to large-scale multi-view stereo from an initial sparse reconstruction. The success of the approach depends on the introduction of surface-segmentation-based camera clustering rather than sparse-point-based camera clustering, which suffers from the problems of non-uniform reconstruction coverage and high redundancy. In detail, we introduce three criteria for camera clustering and surface segmentation for reconstruction, and then formulate these criteria into a constrained energy minimization problem. To solve this problem, we propose a joint optimization in a hierarchical framework to obtain the final surface segments and corresponding optimal camera clusters. On each level of the hierarchical framework, the camera clustering problem is formulated as a parameter estimation problem of a probability model solved by a General Expectation-Maximization algorithm, and the surface segmentation problem is formulated as a Markov Random Field model based on the probabilities estimated by the preceding camera clustering step. Experiments on several Internet datasets and aerial photo datasets demonstrate that the proposed approach generates more uniform and complete dense reconstructions with less redundancy, resulting in a more efficient multi-view stereo algorithm.
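The General Expectation-Maximization step alternates soft camera-to-cluster assignments with re-estimation of cluster parameters. As a hedged illustration (not the authors' implementation, and using hypothetical 2D camera centres in place of the paper's probability model), a minimal EM for a two-component isotropic Gaussian mixture shows the alternation involved:

```python
# Illustrative EM sketch for camera clustering: 2-component isotropic
# Gaussian mixture over hypothetical 2D camera centres (toy data, not the
# paper's model or criteria).
import numpy as np

def em_cluster(points, n_iter=50):
    """Soft-assign points to 2 isotropic Gaussian clusters via EM."""
    mu = points[[0, -1]].astype(float)            # init means at two extremes
    resp = np.zeros((len(points), 2))
    for _ in range(n_iter):
        # E-step: responsibilities from squared distances (unit variance)
        d2 = ((points[:, None, :] - mu[None]) ** 2).sum(-1)
        resp = np.exp(-0.5 * d2)
        resp /= resp.sum(1, keepdims=True)
        # M-step: re-estimate means as responsibility-weighted averages
        mu = (resp.T @ points) / resp.sum(0)[:, None]
    return resp.argmax(1), mu

# Hypothetical camera centres: two well-separated groups
pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
labels, centers = em_cluster(pts)
```

In the paper the cluster probabilities feed the MRF-based surface segmentation on the next level of the hierarchy; here the hard `argmax` labels merely make the clustering visible.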
Citations: 24
Discrete Tabu Search for Graph Matching
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.21
Kamil Adamczewski, Yumin Suh, Kyoung Mu Lee
Graph matching is a fundamental problem in computer vision. In this paper, we propose a novel graph matching algorithm based on tabu search [13]. The proposed method solves the graph matching problem by casting it into an equivalent weighted maximum clique problem on the corresponding association graph, which we further penalize by introducing negative weights. Subsequent tabu search optimization overcomes the convention of using only positive weights. The method's distinct feature is that it utilizes the history of the search to make more strategic decisions while looking for the optimal solution, thus effectively escaping local optima and in practice achieving superior results. Unlike existing algorithms, the proposed method enables direct optimization in the original discrete space while encouraging, rather than artificially enforcing, hard one-to-one constraints, resulting in better solutions. The experiments demonstrate the robustness of the algorithm in a variety of settings, presenting state-of-the-art results. The code is available at http://cv.snu.ac.kr/research/~DTSGM/.
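Tabu search maintains a short-term memory of recent moves so that the search can accept non-improving moves without immediately cycling back. The following minimal sketch (a toy under stated assumptions, not the paper's algorithm) applies pairwise-swap tabu search to a tiny quadratic-assignment scoring of graph matching:

```python
# Illustrative tabu-search sketch: search over node permutations with
# pairwise swaps; recently used swaps are tabu for `tenure` iterations,
# with an aspiration rule that admits a tabu move if it beats the best.
from itertools import combinations

def tabu_match(score, n, iters=100, tenure=5):
    perm = list(range(n))
    best, best_s = perm[:], score(perm)
    tabu = {}                                    # move -> expiry iteration
    for t in range(iters):
        moves = []
        for i, j in combinations(range(n), 2):
            cand = perm[:]
            cand[i], cand[j] = cand[j], cand[i]
            s = score(cand)
            # allowed if not tabu, or tabu but better than best (aspiration)
            if tabu.get((i, j), -1) < t or s > best_s:
                moves.append((s, (i, j), cand))
        if not moves:
            continue
        s, mv, perm = max(moves)                 # best non-tabu neighbour
        tabu[mv] = t + tenure                    # forbid reversing this swap
        if s > best_s:
            best, best_s = perm[:], s
    return best, best_s

# Toy matching: a 3-node path graph matched to itself; the score counts
# preserved adjacencies (optimum is 4: both edges, both directions).
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
def score(p):
    return sum(A[i][j] * A[p[i]][p[j]] for i in range(3) for j in range(3))

best, best_s = tabu_match(score, 3)
```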
Citations: 49
Learning Large-Scale Automatic Image Colorization
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.72
A. Deshpande, Jason Rock, D. Forsyth
We describe an automated method for image colorization that learns to colorize from examples. Our method exploits a LEARCH framework to train a quadratic objective function in the chromaticity maps, comparable to a Gaussian random field. The coefficients of the objective function are conditioned on image features, using a random forest. The objective function admits correlations on long spatial scales, and can control spatial error in the colorization of the image. Images are then colorized by minimizing this objective function. We demonstrate that our method strongly outperforms a natural baseline on large-scale experiments with images of real scenes using a demanding loss function. We demonstrate that learning a model that is conditioned on scene produces improved results. We show how to incorporate a desired color histogram into the objective function, and that doing so can lead to further improvements in results.
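Because the objective is quadratic in the chromaticity values, its minimiser is the solution of a linear system. A minimal sketch of this Gaussian-random-field-style structure (illustrative only: a 1-D chain stands in for the image graph, and the hypothetical `pred` plays the role of the learned unary predictions):

```python
# Illustrative quadratic-objective sketch: unary terms pull each pixel's
# chromaticity toward a predicted value, a chain-graph Laplacian adds
# pairwise smoothness; the minimiser solves (I + lam*L) x = pred.
import numpy as np

def solve_grf(pred, lam=1.0):
    n = len(pred)
    L = np.zeros((n, n))                         # chain-graph Laplacian
    for i in range(n - 1):
        L[i, i] += 1; L[i + 1, i + 1] += 1
        L[i, i + 1] -= 1; L[i + 1, i] -= 1
    # minimise ||x - pred||^2 + lam * x^T L x  =>  (I + lam*L) x = pred
    return np.linalg.solve(np.eye(n) + lam * L, pred)

# A single spike of predicted chromaticity gets spread to its neighbours
chroma = solve_grf(np.array([0.0, 0.0, 10.0, 0.0, 0.0]))
```

The Laplacian smoothing redistributes the spike while preserving its total mass, which is the long-spatial-scale correlation the abstract refers to; the paper conditions the quadratic coefficients on image features rather than fixing them as here.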
Citations: 206
Class-Specific Image Deblurring
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.64
Saeed Anwar, C. P. Huynh, F. Porikli
In image deblurring, a fundamental problem is that the blur kernel suppresses a number of spatial frequencies that are difficult to recover reliably. In this paper, we explore the potential of a class-specific image prior for recovering spatial frequencies attenuated by the blurring process. Specifically, we devise a prior based on the class-specific subspace of image intensity responses to band-pass filters. We find that the aggregation of these subspaces across all frequency bands serves as a good class-specific prior for the restoration of frequencies that cannot be recovered with generic image priors. In an extensive validation, our method, equipped with the above prior, yields image quality up to 5 dB higher in PSNR than many state-of-the-art methods, across image categories including portraits, cars, cats, pedestrians and household objects.
Citations: 33
Globally Optimal 2D-3D Registration from Points or Lines without Correspondences
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.244
Mark Brown, David Windridge, Jean-Yves Guillemaut
We present a novel approach to 2D-3D registration from points or lines without correspondences. While there exist established solutions in the case where correspondences are known, there are many situations where it is not possible to reliably extract such correspondences across modalities, thus requiring the use of a correspondence-free registration algorithm. Existing correspondence-free methods rely on local search strategies and consequently have no guarantee of finding the optimal solution. In contrast, we present the first globally optimal approach to 2D-3D registration without correspondences, achieved by a Branch-and-Bound algorithm. Furthermore, a deterministic annealing procedure is proposed to speed up the nested branch-and-bound algorithm used. The theoretical and practical advantages this brings are demonstrated on a range of synthetic and real data where it is observed that the proposed approach is significantly more robust to high proportions of outliers compared to existing approaches.
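Branch-and-Bound guarantees global optimality by recursively splitting the search domain and pruning any region whose lower bound cannot beat the incumbent solution. A minimal sketch of the generic scheme (a 1-D Lipschitz toy, not the paper's transformation-space bounds):

```python
# Generic branch-and-bound sketch: globally minimise a 1-D Lipschitz
# function by splitting intervals; an interval's lower bound is its centre
# value minus L * half-width, and intervals that cannot improve on the
# incumbent are pruned.
import heapq

def branch_and_bound(f, lo, hi, lipschitz, tol=1e-4):
    best_x = (lo + hi) / 2
    best_f = f(best_x)
    heap = [(best_f - lipschitz * (hi - lo) / 2, lo, hi)]
    while heap:
        bound, a, b = heapq.heappop(heap)
        if bound > best_f - tol:                 # prune: cannot beat incumbent
            continue
        mid = (a + b) / 2
        for l, r in ((a, mid), (mid, b)):        # branch into two halves
            c = (l + r) / 2
            fc = f(c)
            if fc < best_f:
                best_x, best_f = c, fc
            heapq.heappush(heap, (fc - lipschitz * (r - l) / 2, l, r))
    return best_x, best_f

x_opt, f_opt = branch_and_bound(lambda x: (x - 1) ** 2, -4.0, 4.0, lipschitz=10.0)
```

The same prune-or-branch logic applies in the paper's higher-dimensional pose space, where computing valid bounds on the registration cost is the technical core.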
Citations: 30
FaceDirector: Continuous Control of Facial Performance in Video
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.453
Charles Malleson, J. Bazin, Oliver Wang, D. Bradley, T. Beeler, A. Hilton, A. Sorkine-Hornung
We present a method to continuously blend between multiple facial performances of an actor, which can contain different facial expressions or emotional states. As an example, given sad and angry video takes of a scene, our method empowers the movie director to specify arbitrary weighted combinations and smooth transitions between the two takes in post-production. Our contributions include (1) a robust nonlinear audio-visual synchronization technique that exploits complementary properties of audio and visual cues to automatically determine robust, dense spatiotemporal correspondences between takes, and (2) a seamless facial blending approach that provides the director full control to interpolate timing, facial expression, and local appearance, in order to generate novel performances after filming. In contrast to most previous works, our approach operates entirely in image space, avoiding the need of 3D facial reconstruction. We demonstrate that our method can synthesize visually believable performances with applications in emotion transition, performance correction, and timing control.
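Dense nonlinear temporal alignment of this kind is commonly computed with dynamic programming. As a hedged illustration (classic dynamic time warping on 1-D feature tracks, not the paper's audio-visual cost), the core recurrence is:

```python
# Dynamic time warping sketch: align two 1-D feature sequences by a
# monotone warp; D[i][j] is the minimal cumulative cost of aligning the
# first i frames of `a` with the first j frames of `b`.
def dtw(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend from a match, an insertion, or a deletion
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

A sequence that merely dwells longer on one frame aligns at zero cost, which is why nonlinear alignment tolerates the timing differences between two takes of the same scene.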
Citations: 18
Exploiting High Level Scene Cues in Stereo Reconstruction
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.96
Simon Hadfield, R. Bowden
We present a novel approach to 3D reconstruction which is inspired by the human visual system. This system unifies standard appearance matching and triangulation techniques with higher-level reasoning and scene understanding, in order to resolve ambiguities between different interpretations of the scene. The types of reasoning integrated in the approach include recognising common configurations of surface normals and semantic edges (e.g. convex, concave and occlusion boundaries). We also recognise the coplanar, collinear and symmetric structures which are especially common in man-made environments.
Citations: 7
From Emotions to Action Units with Hidden and Semi-Hidden-Task Learning
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.422
Adria Ruiz, Joost van de Weijer, Xavier Binefa
Limited annotated training data is a challenging problem in Action Unit recognition. In this paper, we investigate how the use of large databases labelled according to the six universal facial expressions can increase the generalization ability of Action Unit classifiers. For this purpose, we propose a novel learning framework: Hidden-Task Learning. HTL aims to learn a set of Hidden-Tasks (Action Units) for which training samples are not available, but which relate to a set of Visible-Tasks (Facial Expressions) whose training data is easier to obtain. To that end, HTL exploits prior knowledge about the relation between Hidden- and Visible-Tasks. In our case, we base this prior knowledge on empirical psychological studies providing statistical correlations between Action Units and universal facial expressions. Additionally, we extend HTL to Semi-Hidden-Task Learning (SHTL), assuming that Action Unit training samples are also provided. Performing exhaustive experiments over four different datasets, we show that HTL and SHTL improve the generalization ability of AU classifiers by training them with additional facial expression data.
Citations: 58
Context-Guided Diffusion for Label Propagation on Graphs
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.318
K. Kim, J. Tompkin, H. Pfister, C. Theobalt
Existing approaches for diffusion on graphs, e.g., for label propagation, mainly focus on isotropic diffusion, which is induced by the commonly used graph Laplacian regularizer. Inspired by the success of diffusivity tensors for anisotropic diffusion in image processing, we present anisotropic diffusion on graphs and the corresponding label propagation algorithm. We develop positive definite diffusivity operators on the vector bundles of Riemannian manifolds, and discretize them to diffusivity operators on graphs. This enables us to easily define new robust diffusivity operators which significantly improve semi-supervised learning performance over existing diffusion algorithms.
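For contrast, the isotropic baseline that the paper generalizes, Laplacian-regularized label propagation, can be sketched in a few lines (illustrative; the degree normalization and seed clamping here follow the common Zhou-style iteration, not the paper's anisotropic operators):

```python
# Isotropic label-propagation sketch: labels diffuse over a weighted graph
# by repeatedly averaging each node's label distribution with its
# neighbours', while re-injecting the seed labels each iteration.
import numpy as np

def propagate(W, labels, n_iter=200, alpha=0.9):
    """W: symmetric affinity matrix; labels: -1 for unlabelled, else class id."""
    n = len(W)
    k = labels.max() + 1
    Y = np.zeros((n, k))
    Y[labels >= 0, labels[labels >= 0]] = 1.0    # one-hot seed labels
    D_inv = 1.0 / W.sum(1)                        # inverse node degrees
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (D_inv[:, None] * (W @ F)) + (1 - alpha) * Y
    return F.argmax(1)

# 5-node chain with a seed of each class at the two ends
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
out = propagate(W, np.array([0, -1, -1, -1, 1]))
```

Because the averaging weight is the same scalar in every direction, the diffusion is isotropic; the paper's contribution is to replace it with learned, direction-dependent diffusivity operators.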
Citations: 13
Dual-Feature Warping-Based Motion Model Estimation
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.487
Shiwei Li, Lu Yuan, Jian Sun, Long Quan
To relax the geometric assumptions of traditional motion models (e.g., homography, affine), warping-based motion models have recently become very popular and are adopted in many recent applications (e.g., image stitching, video stabilization). With high degrees of freedom, the accuracy of the model relies heavily on the data terms (keypoint correspondences). In some low-texture environments (e.g., indoor scenes) where keypoint features are insufficient or unreliable, the warping model is often erroneously estimated. In this paper we propose a simple and effective approach that considers both keypoint and line segment correspondences as data terms. Line segments are prominent features in artificial environments and supply rich geometric and structural information about scenes, which not only helps guide the warp to a correct estimate in low-texture conditions, but also prevents undesired distortion induced by warping. The two feature types complement each other and together benefit a wider range of scenes. Our method is general and can be ported to many existing applications. Experiments demonstrate that using dual features yields more robust and accurate results, especially for low-texture images.
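As a hedged illustration of a purely point-based data term (far simpler than the paper's warping model, which adds line-segment terms), a global 2D affine motion model can be fit to keypoint correspondences by linear least squares:

```python
# Point-based data-term sketch: estimate a global 2x3 affine motion model
# from keypoint correspondences via linear least squares (toy data).
import numpy as np

def fit_affine(src, dst):
    """Solve dst ~ A @ [x, y, 1]^T for the 2x3 affine matrix A."""
    X = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coordinates
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A.T                                     # 2x3 affine

# Synthetic correspondences generated by a known affine transform
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
A_true = np.array([[2, 0, 1], [0, 2, -1]], float)
dst = np.hstack([src, np.ones((4, 1))]) @ A_true.T
A = fit_affine(src, dst)
```

With too few or degenerate keypoints this system becomes ill-conditioned, which is exactly the low-texture failure mode that motivates adding line-segment correspondences as a second data term.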
Cited by: 88
Journal: 2015 IEEE International Conference on Computer Vision (ICCV)