
IEEE Winter Conference on Applications of Computer Vision: latest papers

A fully implicit alternating direction method of multipliers for the minimization of convex problems with an application to motion segmentation
Karin Tichmann, O. Junge
Motivated by a variational formulation of the motion segmentation problem, we propose a fully implicit variant of the (linearized) alternating direction method of multipliers for the minimization of convex functionals over a convex set. The new scheme requires no step size restriction for stability and thus approaches the minimum in considerably fewer iterations. In numerical experiments on standard image sequences, the scheme often significantly outperforms other state-of-the-art methods.
DOI: https://doi.org/10.1109/WACV.2014.6836018 | Published: 2014-03-24 | Pages: 823-830
Citations: 0
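For readers unfamiliar with the baseline that the paper above modifies, here is a minimal sketch of the classical scaled-form ADMM iteration on a toy problem. It is not the fully implicit variant proposed by the authors, whose update rules the abstract does not give. The toy objective (a per-element quadratic plus an L1 penalty), the function names `soft_threshold` and `admm_l1_denoise`, and the parameters `lam`, `rho` and `iters` are all assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_l1_denoise(a, lam=1.0, rho=1.0, iters=200):
    """Minimize 0.5 * (x - a_i)^2 + lam * |z| subject to x = z, per element.

    Classical scaled-form ADMM, shown for illustration only.
    This is NOT the fully implicit scheme from the paper.
    """
    result = []
    for a_i in a:
        x = z = u = 0.0
        for _ in range(iters):
            # x-update: closed-form minimizer of the quadratic term
            # plus the augmented penalty (rho / 2) * (x - z + u)^2.
            x = (a_i + rho * (z - u)) / (1.0 + rho)
            # z-update: proximal step on the L1 term.
            z = soft_threshold(x + u, lam / rho)
            # Scaled dual update accumulates the constraint residual x - z.
            u = u + x - z
        result.append(z)
    return result
```

For this toy objective the minimizer is known in closed form (`soft_threshold(a_i, lam)`), so the loop is useful only as a way to watch the three ADMM updates converge.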
Interactive video segmentation using occlusion boundaries and temporally coherent superpixels
Radu Dondera, Vlad I. Morariu, Yulu Wang, L. Davis
We propose an interactive video segmentation system built on occlusion and long-term spatio-temporal structure cues. User supervision is incorporated in a superpixel graph clustering framework that differs crucially from prior art in that it modifies the graph according to the output of an occlusion boundary detector. Working with long temporal intervals (up to 100 frames) enables our system to significantly reduce annotation effort with respect to state-of-the-art systems. Even though the segmentation results are less than perfect, they are obtained efficiently and can be used in weakly supervised learning from video or for video content description. We do not rely on a discriminative object appearance model and allow extracting multiple foreground objects together, saving user time if more than one object is present. Additional experiments with unsupervised clustering based on occlusion boundaries demonstrate the importance of this cue for video segmentation and thus validate our system design.
DOI: https://doi.org/10.1109/WACV.2014.6836023 | Published: 2014-03-24 | Pages: 784-791
Citations: 15
Real-time video decolorization using bilateral filtering
Yibing Song, Linchao Bao, Qingxiong Yang
This paper presents a real-time decolorization method. Given the human visual system's preference for luminance information, the luminance should be preserved as much as possible during decolorization. Accordingly, the proposed decolorization method measures the amount of color contrast/detail lost when converting color to luminance. The detail loss is estimated by computing the difference between two intermediate images: one obtained by applying a bilateral filter to the original color image, and the other obtained by applying a joint bilateral filter to the original color image with its luminance as the guidance image. The estimated detail loss is then mapped to a grayscale image, called the residual image, by minimizing the difference between the image gradients of the input color image and those of the target grayscale image, which is the sum of the residual image and the luminance. The residual image will be zero everywhere (that is, the two intermediate images will be identical) only when no visual detail is missing from the luminance. Unlike most previous methods, the proposed decolorization method preserves both the contrast in the color image and the luminance. Quantitative evaluation shows that it is the top performer on the standard test suite. It is also very robust and can be used directly to convert videos while maintaining temporal coherence. Specifically, it can convert a high-resolution video (1280 × 720) in real time (about 28 Hz) on a 3.4 GHz i7 CPU.
DOI: https://doi.org/10.1109/WACV.2014.6836106 | Published: 2014-03-24 | Pages: 159-166
Citations: 16
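The detail-loss estimate described in the decolorization abstract above (bilateral filter output minus luminance-guided joint bilateral filter output) can be sketched on a 1D row of RGB pixels. This is a rough illustration under stated assumptions, not the authors' implementation: the Rec. 601 luminance weights, the Gaussian kernels, the parameters `radius`, `sigma_s` and `sigma_r`, and the function names are all hypothetical, and the later step that maps the loss to a residual image by gradient matching is not shown.

```python
import math


def joint_bilateral(signal, guide, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Filter `signal` (list of tuples) with range weights taken from `guide`.

    Passing guide == signal gives the ordinary bilateral filter.
    """
    n = len(signal)
    channels = len(signal[0])
    out = []
    for i in range(n):
        acc = [0.0] * channels
        total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            dist2 = sum((a - b) ** 2 for a, b in zip(guide[i], guide[j]))
            weight = spatial * math.exp(-dist2 / (2 * sigma_r ** 2))
            total += weight
            for c in range(channels):
                acc[c] += weight * signal[j][c]
        out.append(tuple(v / total for v in acc))
    return out


def luminance(rgb):
    """Rec. 601 luma weights (an assumption; the abstract names no formula)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def detail_loss(pixels, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Per-pixel distance between the two intermediate filtered images."""
    guide_lum = [(luminance(p),) for p in pixels]
    plain = joint_bilateral(pixels, pixels, radius, sigma_s, sigma_r)
    joint = joint_bilateral(pixels, guide_lum, radius, sigma_s, sigma_r)
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        for p, q in zip(plain, joint)
    ]
```

On an edge between two colors of equal luminance, the color-guided filter keeps the edge while the luminance-guided filter blurs across it, so the loss is large near the edge and close to zero elsewhere.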
Joint hierarchical learning for efficient multi-class object detection
Hamidreza Odabai Fard, M. Chaouch, Q. Pham, A. Vacavant, T. Chateau
In addition to multi-class classification, the multi-class object detection task further involves classifying a dominating background label. In this work, we present a novel approach where relevant classes are ranked higher and background labels are rejected. To this end, we arrange the classes into a tree structure where the classifiers are trained in a joint framework combining ranking and classification constraints. Our convex problem formulation naturally allows applying a tree traversal algorithm that searches for the best class label and progressively rejects background labels. We evaluate our approach on the PASCAL VOC 2007 dataset and show a considerable speed-up of the detection time with increased detection performance.
DOI: https://doi.org/10.1109/WACV.2014.6836090 | Published: 2014-03-24 | Pages: 261-268
Citations: 2
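The tree traversal mentioned in the hierarchical detection abstract above (search for the best class label while progressively rejecting background) might look roughly like the following. This is a hypothetical sketch: the greedy descent rule, the single rejection `threshold`, the `Node` type and the hand-set linear scorers are assumptions made for illustration, and the paper's joint training of the classifiers is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Node:
    label: str
    score: Callable[[Sequence[float]], float]
    children: List["Node"] = field(default_factory=list)


def linear(weights, bias=0.0):
    """Build a linear scorer w . x + b (hypothetical classifier form)."""
    return lambda x: sum(w * v for w, v in zip(weights, x)) + bias


def classify(root: Node, x: Sequence[float], threshold: float = 0.0) -> Optional[str]:
    """Greedy descent: follow the best-scoring child at each level.

    Returns None (background) as soon as the best child scores below
    `threshold`, so most background windows are rejected near the root.
    """
    node = root
    while node.children:
        best = max(node.children, key=lambda child: child.score(x))
        if best.score(x) < threshold:
            return None  # rejected as background
        node = best
    return node.label


# Toy hierarchy with hand-set weights (not learned).
tree = Node("root", linear([0, 0, 0]), [
    Node("animal", linear([1, 0, 0]), [
        Node("cat", linear([0, 0, 1])),
        Node("dog", linear([0, 0, -1])),
    ]),
    Node("vehicle", linear([0, 1, 0]), [
        Node("car", linear([0, 1, 0])),
    ]),
])
```

Only the classifiers along one root-to-leaf path are evaluated per window, which is one plausible source of the speed-up the abstract reports.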
Mining discriminative 3D Poselet for cross-view action recognition
Jiang Wang, Xiaohan Nie, Yin Xia, Ying Wu
This paper presents a novel approach to cross-view action recognition. Traditional cross-view action recognition methods typically rely on local appearance/motion features. In this paper, we take advantage of recent developments in depth cameras to build a more discriminative cross-view action representation. In this representation, an action is characterized by the spatio-temporal configuration of 3D Poselets, which are discriminatively discovered with a novel Poselet mining algorithm and can be detected with view-invariant 3D Poselet detectors. The Kinect skeleton is employed to facilitate the mining of 3D Poselets and the learning of 3D Poselet detectors, but the recognition is based solely on 2D video input. Extensive experiments have demonstrated that this new action representation significantly improves the accuracy and robustness of cross-view action recognition.
DOI: https://doi.org/10.1109/WACV.2014.6836043 | Published: 2014-03-24 | Pages: 634-639
Citations: 2
Transfer learning via attributes for improved on-the-fly classification
Praveen Kulkarni, Gaurav Sharma, J. Zepeda, Louis Chevallier
Retrieving images for an arbitrary user query, provided in textual form, is a challenging problem. A recently proposed method addresses this by constructing a visual classifier that uses images returned by an internet image search engine for the user query as positive images, together with a fixed pool of negative images. In practice, however, not all the images obtained from an internet image search are pertinent to the query; some might contain abstract or artistic representations of the content and some might have artifacts. Such images degrade the performance of the classifier constructed on the fly. We propose a method for improving the performance of on-the-fly classifiers by using transfer learning via attributes. We first map the textual query to a set of known attributes and then use those attributes to prune the set of images downloaded from the internet. This pruning step can be seen as zero-shot learning of the visual classifier for the textual user query, which transfers knowledge from the attribute domain to the query domain. We also use the attributes along with the on-the-fly classifier to score the database images and obtain a hybrid ranking. We show interesting qualitative results and demonstrate by experiments with standard datasets that the proposed method improves upon the baseline on-the-fly classification system.
DOI: https://doi.org/10.1109/WACV.2014.6836097 | Published: 2014-03-24 | Pages: 220-226
Citations: 8
Optical filter selection for automatic visual inspection
Matthias Richter, J. Beyerer
The color of a material is one of the most frequently used features in automated visual inspection systems. While this is sufficient for many “easy” tasks, mixed and organic materials usually require more complex features. Spectral signatures, especially in the near infrared range, have proven useful in many cases. However, hyperspectral imaging devices are still very costly and too slow to use in practice. As a work-around, off-the-shelf cameras and optical filters are used to extract a few characteristic features from the spectra. Often, these filters are selected by a human expert in a time-consuming and error-prone process; surprisingly few works are concerned with the automatic selection of suitable filters. We approach this problem by stating filter selection as a feature selection problem. In contrast to existing techniques that are mainly concerned with filter design, our approach explicitly selects the best out of a large set of given filters. Our method becomes most appealing for use in an industrial setting when this set represents (physically) available filters. We show the application of our technique by implementing six different selection strategies and applying each to two real-world sorting problems.
DOI: https://doi.org/10.1109/WACV.2014.6836110 | Published: 2014-03-24 | Pages: 123-128
Citations: 2
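Treating filter selection as feature selection, as the abstract above proposes, can be illustrated with sequential forward selection over a fixed pool of filters. The abstract does not name its six strategies, so this greedy strategy is a stand-in, not necessarily one of them. The filter names, the toy class responses and the `separation` score are hypothetical; a realistic score would more likely be cross-validated classification accuracy.

```python
import math


def forward_select(candidates, score, k):
    """Greedily pick k items, each time adding the one that maximizes
    score(selected + [item]). `score` is supplied by the caller."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda item: score(selected + [item]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical mean camera responses of two material classes behind each
# candidate filter: name -> (class A response, class B response).
responses = {
    "F450": (0.20, 0.25),
    "F650": (0.50, 0.90),
    "F850": (0.70, 0.10),
    "F950": (0.40, 0.40),
}


def separation(subset):
    """Toy score: Euclidean distance between the two class mean vectors
    restricted to the chosen filters."""
    return math.sqrt(sum((responses[f][0] - responses[f][1]) ** 2 for f in subset))


chosen = forward_select(responses, separation, k=2)
```

Because the pool contains only filters that physically exist, the result can be mounted directly, which is the practical appeal the abstract points to.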
Understanding the 3D layout of a cluttered room from multiple images
Sid Ying-Ze Bao, A. Furlan, Li Fei-Fei, S. Savarese
We present a novel framework for robustly understanding the geometrical and semantic structure of a cluttered room from a small number of images captured from different viewpoints. The tasks we seek to address include: i) estimating the 3D layout of the room - that is, the 3D configuration of floor, walls and ceiling; ii) identifying and localizing all the foreground objects in the room. We jointly use multiview geometry constraints and image appearance to identify the best room layout configuration. Extensive experimental evaluation demonstrates that our estimation results are more complete and accurate in estimating 3D room structure and recognizing objects than alternative state-of-the-art algorithms. In addition, we show an augmented reality mobile application to highlight the high accuracy of our method, which may be beneficial to many computer vision applications.
DOI: https://doi.org/10.1109/WACV.2014.6836035 | Published: 2014-03-24 | Pages: 690-697
Citations: 31
Robust optical flow estimation for continuous blurred scenes using RGB-motion imaging and directional filtering
Wenbin Li, Yang Chen, JeeHang Lee, Gang Ren, D. Cosker
Optical flow estimation is a difficult task given real-world video footage with camera and object blur. In this paper, we combine a 3D pose and position tracker with an RGB sensor, allowing us to capture video footage together with 3D camera motion. We show that the additional camera motion information can be embedded into a hybrid optical flow framework by interleaving an iterative blind deconvolution and a warping-based minimization scheme. Such a hybrid framework significantly improves the accuracy of optical flow estimation in scenes with strong blur. Our approach yields improved overall performance against three state-of-the-art baseline methods applied to our proposed ground truth sequences, as well as on several other real-world sequences captured by our novel imaging system.
DOI: https://doi.org/10.1109/WACV.2014.6836022 | Published: 2014-03-24 | Pages: 792-799
Citations: 17
Benchmarking large-scale Fine-Grained Categorization
A. Angelova, Philip M. Long
This paper presents a systematic evaluation of recent methods in the fine-grained categorization domain, which have shown significant promise. More specifically, we investigate an automatic segmentation algorithm, a region pooling algorithm which is akin to pose-normalized pooling [31] [28], and a multi-class optimization method. We considered the largest and most popular datasets for fine-grained categorization available in the field: the Caltech-UCSD 200 Birds dataset [27], the Oxford 102 Flowers dataset [19], the Stanford 120 Dogs dataset [16], and the Oxford 37 Cats and Dogs dataset [21]. We view this work from a practitioner's perspective, answering the question: which methods can create the best possible fine-grained recognition system that can be applied in practice? Our experiments provide insights into the relative merits of these methods. More importantly, after combining the methods, we achieve the top results in the field, outperforming the state-of-the-art methods by 4.8% and 10.3% on the birds and dogs datasets, respectively. Additionally, our method achieves a mAP of 37.92 on the 2012 ImageNet Fine-Grained Categorization Challenge [1], which outperforms the winner of this challenge by 5.7 points.
本文对细粒度分类领域的最新方法进行了系统的评价,这些方法显示出很大的前景。更具体地说,我们研究了一种自动分割算法,一种类似于姿态归一化池化的区域池化算法[31][28],以及一种多类优化方法。我们考虑了该领域最大和最流行的细粒度分类数据集:加州理工大学-加州大学圣地亚哥分校200只鸟数据集[27]、牛津大学102只花数据集[19]、斯坦福大学120只狗数据集[16]和牛津大学37只猫和狗数据集[21]。我们从从业者的角度来看待这项工作,回答这个问题:哪些方法可以创建最好的细粒度识别系统,并可以在实践中应用?我们的实验提供了这些方法的相对优点的见解。更重要的是,在结合了这些方法之后,我们在该领域取得了最好的结果,在鸟类和狗的数据集上分别比最先进的方法高出4.8%和10.3%。此外,我们的方法在2012年Imagenet细粒度分类挑战赛[1]上取得了37.92分的mAP,比本次挑战赛的获胜者高出5.7分。
DOI: 10.1109/WACV.2014.6836056 (https://doi.org/10.1109/WACV.2014.6836056). IEEE Winter Conference on Applications of Computer Vision, pp. 532-539. Published 2014-03-24.
Citations: 10