Robust tracking and mapping with a handheld RGB-D camera
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6835732 | Pages: 1120-1127
Kyoung-Rok Lee, Truong Q. Nguyen
In this paper, we propose a robust method for camera tracking and surface mapping using a handheld RGB-D camera that is effective in challenging situations such as fast camera motion or geometrically featureless scenes. The main contributions are threefold. First, we introduce a robust quaternion-based orientation estimation for the initial sparse estimation. By using visual feature point detection and matching, no prior or small-motion assumption is required to estimate a rigid transformation between frames. Second, we propose a weighted ICP (Iterative Closest Point) method that improves both the convergence rate of the optimization and the accuracy of the resulting trajectory. While conventional ICP fails when there are no 3D features in the scene, our approach achieves robustness by emphasizing the influence of points that carry more geometric information about the scene. Finally, experiments on an RGB-D trajectory benchmark dataset demonstrate that our method tracks the camera pose accurately.
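As a point of reference for the first contribution, the following is a minimal sketch of closed-form quaternion-based rigid-transform estimation from matched 3D point pairs (Horn's classical method); the paper's feature-matching pipeline and point weighting are not reproduced, and the function name is illustrative.

```python
import numpy as np

def estimate_rigid_transform(P, Q):
    """Closed-form R, t with Q ~= R @ P + t, for (N, 3) matched point sets."""
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    X, Y = P - p0, Q - q0
    S = X.T @ Y                                   # 3x3 cross-covariance
    A = S - S.T
    delta = np.array([A[1, 2], A[2, 0], A[0, 1]])
    N = np.empty((4, 4))                          # Horn's symmetric 4x4 matrix
    N[0, 0] = np.trace(S)
    N[0, 1:] = N[1:, 0] = delta
    N[1:, 1:] = S + S.T - np.trace(S) * np.eye(3)
    vals, vecs = np.linalg.eigh(N)
    w, x, y, z = vecs[:, -1]                      # quaternion = top eigenvector
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    return R, q0 - R @ p0
```

The eigendecomposition makes the estimate insensitive to the pairwise ordering of correspondences, which is why quaternion formulations are a common choice for the sparse initialization step.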
{"title":"Robust tracking and mapping with a handheld RGB-D camera","authors":"Kyoung-Rok Lee, Truong Q. Nguyen","doi":"10.1109/WACV.2014.6835732","DOIUrl":"https://doi.org/10.1109/WACV.2014.6835732","url":null,"abstract":"In this paper, we propose a robust method for camera tracking and surface mapping using a handheld RGB-D camera which is effective in challenging situations such as fast camera motion or geometrically featureless scenes. The main contributions are threefold. First, we introduce a robust orientation estimation based on quaternion method for initial sparse estimation. By using visual feature points detection and matching, no prior or small movement assumption is required to estimate a rigid transformation between frames. Second, a weighted ICP (Iterative Closest Point) method for better rate of convergence in optimization and accuracy in resulting trajectory is proposed. While the conventional ICP fails when there is no 3D features in the scene, our approach achieves robustness by emphasizing the influence of points that contain more geometric information of the scene. Finally, we show quantitative results on an RGB-D benchmark dataset. The experiments on an RGB-D trajectory benchmark dataset demonstrate that our method is able to track camera pose accurately.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"32 1","pages":"1120-1127"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72577984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised iterative manifold alignment via local feature histograms
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836051 | Pages: 572-579
Ke Fan, A. Mian, Wanquan Liu, Lin Li
We propose a new unsupervised algorithm for the automatic alignment of two manifolds from different datasets with possibly different dimensionalities. Alignment is performed automatically, without any assumptions about the correspondences between the two manifolds. The proposed algorithm establishes an initial set of sparse correspondences between the two datasets by matching their underlying manifold structures: local feature histograms are extracted at each point of the manifolds and matched with a robust algorithm. Based on these sparse correspondences, an embedding space is estimated in which the distance between the two manifolds is minimized while their original structure is maximally retained. The problem is formulated as a generalized eigenvalue problem and solved efficiently. Dense correspondences are then established between the two manifolds, and the process is iterated until the two manifolds are correctly aligned, revealing their joint structure. We demonstrate the effectiveness of our algorithm on aligning protein structures, facial images of different subjects under pose variations, and RGB and depth data from a Kinect. Comparison with a state-of-the-art algorithm shows the superiority of the proposed manifold alignment algorithm in terms of both accuracy and computational time.
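To illustrate the embedding step, here is a hedged sketch of a joint graph-Laplacian embedding driven by sparse correspondences and solved as a generalized eigenvalue problem. The local-feature-histogram matching and the outer iteration are omitted; the kNN affinity, mu, and k are assumed stand-ins for the paper's actual construction.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbor affinity matrix for rows of X."""
    D = cdist(X, X)
    W = np.zeros_like(D)
    idx = np.argsort(D, axis=1)[:, 1:k + 1]      # skip self at column 0
    for i, nbrs in enumerate(idx):
        W[i, nbrs] = 1.0
    return np.maximum(W, W.T)

def joint_embedding(X, Y, pairs, dim=2, mu=1.0, k=5):
    """Embed two point sets into a shared space; pairs = [(i_in_X, j_in_Y)]."""
    nx = len(X)
    W = np.zeros((nx + len(Y), nx + len(Y)))
    W[:nx, :nx] = knn_affinity(X, k)
    W[nx:, nx:] = knn_affinity(Y, k)
    for i, j in pairs:                            # sparse links tie the graphs
        W[i, nx + j] = W[nx + j, i] = mu
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # joint graph Laplacian
    vals, vecs = eigh(L, D)                       # generalized eigenproblem
    F = vecs[:, 1:dim + 1]                        # skip the trivial eigenvector
    return F[:nx], F[nx:]
```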
{"title":"Unsupervised iterative manifold alignment via local feature histograms","authors":"Ke Fan, A. Mian, Wanquan Liu, Lin Li","doi":"10.1109/WACV.2014.6836051","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836051","url":null,"abstract":"We propose a new unsupervised algorithm for the automatic alignment of two manifolds of different datasets with possibly different dimensionalities. Alignment is performed automatically without any assumptions on the correspondences between the two manifolds. The proposed algorithm automatically establishes an initial set of sparse correspondences between the two datasets by matching their underlying manifold structures. Local feature histograms are extracted at each point of the manifolds and matched using a robust algorithm to find the initial correspondences. Based on these sparse correspondences, an embedding space is estimated where the distance between the two manifolds is minimized while maximally retaining the original structure of the manifolds. The problem is formulated as a generalized eigenvalue problem and solved efficiently. Dense correspondences are then established between the two manifolds and the process is iteratively implemented until the two manifolds are correctly aligned consequently revealing their joint structure. We demonstrate the effectiveness of our algorithm on aligning protein structures, facial images of different subjects under pose variations and RGB and Depth data from Kinect. Comparison with an state-of-the-art algorithm shows the superiority of the proposed manifold alignment algorithm in terms of accuracy and computational time.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"112 1","pages":"572-579"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74683469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A fully implicit alternating direction method of multipliers for the minimization of convex problems with an application to motion segmentation
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836018 | Pages: 823-830
Karin Tichmann, O. Junge
Motivated by a variational formulation of the motion segmentation problem, we propose a fully implicit variant of the (linearized) alternating direction method of multipliers for the minimization of convex functionals over a convex set. The new scheme requires no step-size restriction for stability and thus approaches the minimum in considerably fewer iterations. In numerical experiments on standard image sequences, the scheme often significantly outperforms other state-of-the-art methods.
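For orientation, the sketch below shows a compact, conventional ADMM loop (here for the lasso problem min 0.5||Ax − b||² + λ||z||₁ subject to x = z). The paper's contribution, a fully implicit variant with no step-size restriction, is not reproduced; this is only the standard baseline such schemes improve on.

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    """Standard ADMM for the lasso; returns the sparse iterate z."""
    n = A.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                                # scaled dual variable
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))   # factor for the x-update
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))              # x-update: quadratic solve
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z                              # dual ascent on x = z
    return z
```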
{"title":"A fully implicit alternating direction method of multipliers for the minimization of convex problems with an application to motion segmentation","authors":"Karin Tichmann, O. Junge","doi":"10.1109/WACV.2014.6836018","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836018","url":null,"abstract":"Motivated by a variational formulation of the motion segmentation problem, we propose a fully implicit variant of the (linearized) alternating direction method of multipliers for the minimization of convex functionals over a convex set. The new scheme does not require a step size restriction for stability and thus approaches the minimum using considerably fewer iterates. In numerical experiments on standard image sequences, the scheme often significantly outperforms other state of the art methods.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"57 1","pages":"823-830"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84567755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selection of universal features for image classification
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836078 | Pages: 355-362
Pedro A. Rodriguez, Nathan G. Drenkow, D. DeMenthon, Zachary H. Koterba, Kathleen Kauffman, Duane C. Cornish, Bart Paulhamus, R. J. Vogelstein
Neuromimetic algorithms, such as the HMAX algorithm, have been very successful in image classification tasks. However, current implementations of these algorithms do not scale well to large datasets. Often, target-specific features or patches are “learned” ahead of time and then correlated with test images during feature extraction. In this paper, we develop a novel method for selecting a single set of universal features that enables classification across a broad range of image classes. Our method trains multiple Random Forest classifiers using a large dictionary of features and then combines them using a majority voting scheme. This enables the selection of the most discriminative patches based on feature-importance measures. Experiments demonstrate the viability of this method using HMAX features, as well as the trade-off between the number of universal features, classification performance, and processing time.
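A hedged sketch of the selection idea follows: train several Random Forests on the full feature dictionary, pool their feature-importance scores, and keep the top-ranked "universal" features. HMAX feature extraction and the exact majority-voting scheme are not reproduced, and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def select_universal_features(X, y, n_forests=5, n_keep=100):
    """X: (samples, features) dictionary responses; y: class labels.
    Returns indices of the n_keep most discriminative features."""
    importances = np.zeros(X.shape[1])
    for seed in range(n_forests):
        rf = RandomForestClassifier(n_estimators=100, random_state=seed)
        rf.fit(X, y)
        importances += rf.feature_importances_    # pool across forests
    return np.argsort(importances)[::-1][:n_keep]
```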
{"title":"Selection of universal features for image classification","authors":"Pedro A. Rodriguez, Nathan G. Drenkow, D. DeMenthon, Zachary H. Koterba, Kathleen Kauffman, Duane C. Cornish, Bart Paulhamus, R. J. Vogelstein","doi":"10.1109/WACV.2014.6836078","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836078","url":null,"abstract":"Neuromimetic algorithms, such as the HMAX algorithm, have been very successful in image classification tasks. However, current implementations of these algorithms do not scale well to large datasets. Often, target-specific features or patches are “learned” ahead of time and then correlated with test images during feature extraction. In this paper, we develop a novel method for selecting a single set of universal features that enables classification across a broad range of image classes. Our method trains multiple Random Forest classifiers using a large dictionary of features and then combines them using a majority voting scheme. This enables the selection of the most discriminative patches based on feature importance measures. Experiments demonstrate the viability of this method using HMAX features as well as the tradeoff between the number of universal features, classification performance, and processing time.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"12 1","pages":"355-362"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85190292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint hierarchical learning for efficient multi-class object detection
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836090 | Pages: 261-268
Hamidreza Odabai Fard, M. Chaouch, Q. Pham, A. Vacavant, T. Chateau
In addition to multi-class classification, multi-class object detection further requires rejecting a dominant background label. In this work, we present a novel approach in which relevant classes are ranked higher and background labels are rejected. To this end, we arrange the classes into a tree structure in which the classifiers are trained in a joint framework combining ranking and classification constraints. Our convex problem formulation naturally allows applying a tree-traversal algorithm that searches for the best class label while progressively rejecting background labels. We evaluate our approach on the PASCAL VOC 2007 dataset and show a considerable speed-up in detection time together with increased detection performance.
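The test-time behavior can be sketched as a traversal that descends toward the best-scoring child and stops early when all children fall below a rejection threshold (background). The Node structure and precomputed per-node scores are illustrative assumptions, not the paper's formulation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # empty list => leaf class
    score: float = 0.0                            # classifier response at this node

def traverse(root, threshold=0.0):
    """Descend the class tree; return a class label or None for background."""
    node = root
    while node.children:
        best = max(node.children, key=lambda c: c.score)
        if best.score < threshold:
            return None                           # progressively rejected
        node = best
    return node.label
```

Because whole subtrees are discarded as soon as their scores drop below the threshold, the number of classifier evaluations per window grows sublinearly in the number of classes, which is where the reported speed-up comes from.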
{"title":"Joint hierarchical learning for efficient multi-class object detection","authors":"Hamidreza Odabai Fard, M. Chaouch, Q. Pham, A. Vacavant, T. Chateau","doi":"10.1109/WACV.2014.6836090","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836090","url":null,"abstract":"In addition to multi-class classification, the multi-class object detection task consists further in classifying a dominating background label. In this work, we present a novel approach where relevant classes are ranked higher and background labels are rejected. To this end, we arrange the classes into a tree structure where the classifiers are trained in a joint framework combining ranking and classification constraints. Our convex problem formulation naturally allows to apply a tree traversal algorithm that searches for the best class label and progressively rejects background labels. We evaluate our approach on the PASCAL VOC 2007 dataset and show a considerable speed-up of the detection time with increased detection performance.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"58 1","pages":"261-268"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90557973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mining discriminative 3D Poselet for cross-view action recognition
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836043 | Pages: 634-639
Jiang Wang, Xiaohan Nie, Yin Xia, Ying Wu
This paper presents a novel approach to cross-view action recognition. Traditional cross-view action recognition methods typically rely on local appearance/motion features. In this paper, we take advantage of recent developments in depth cameras to build a more discriminative cross-view action representation. In this representation, an action is characterized by the spatio-temporal configuration of 3D Poselets, which are discriminatively discovered with a novel Poselet mining algorithm and can be detected with view-invariant 3D Poselet detectors. The Kinect skeleton is employed to facilitate the mining of 3D Poselets and the learning of their detectors, but recognition is based solely on 2D video input. Extensive experiments demonstrate that this new action representation significantly improves the accuracy and robustness of cross-view action recognition.
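One ingredient such a pipeline needs is view invariance of the skeleton input; the sketch below normalizes Kinect joints into a body-centric frame so pose snippets can be compared across viewpoints. The joint indices and frame construction are illustrative assumptions; the Poselet mining itself is not reproduced.

```python
import numpy as np

def body_centric(joints, hip_l, hip_r, spine):
    """joints: (J, 3) world coordinates; hip_l/hip_r/spine: row indices of
    reference joints. Returns joints expressed in a body-centric frame."""
    origin = 0.5 * (joints[hip_l] + joints[hip_r])
    x = joints[hip_r] - joints[hip_l]             # left-to-right body axis
    x /= np.linalg.norm(x)
    up = joints[spine] - origin
    up -= up.dot(x) * x                           # orthogonalize against x
    y = up / np.linalg.norm(up)
    z = np.cross(x, y)
    R = np.stack([x, y, z])                       # rows are the new axes
    return (joints - origin) @ R.T                # view-invariant coordinates
```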
{"title":"Mining discriminative 3D Poselet for cross-view action recognition","authors":"Jiang Wang, Xiaohan Nie, Yin Xia, Ying Wu","doi":"10.1109/WACV.2014.6836043","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836043","url":null,"abstract":"This paper presents a novel approach to cross-view action recognition. Traditional cross-view action recognition methods typically rely on local appearance/motion features. In this paper, we take advantage of the recent developments of depth cameras to build a more discriminative cross-view action representation. In this representation, an action is characterized by the spatio-temporal configuration of 3D Poselets, which are discriminatively discovered with a novel Poselet mining algorithm and can be detected with view-invariant 3D Poselet detectors. The Kinect skeleton is employed to facilitate the 3D Poselet mining and 3D Poselet detectors learning, but the recognition is solely based on 2D video input. Extensive experiments have demonstrated that this new action representation significantly improves the accuracy and robustness for cross-view action recognition.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"69 1","pages":"634-639"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77063414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transfer learning via attributes for improved on-the-fly classification
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836097 | Pages: 220-226
Praveen Kulkarni, Gaurav Sharma, J. Zepeda, Louis Chevallier
Retrieving images for an arbitrary user query, provided in textual form, is a challenging problem. A recently proposed method addresses this by constructing a visual classifier that uses images returned by an internet image search engine for the user query as positives, together with a fixed pool of negative images. In practice, however, not all images obtained from an internet image search are pertinent to the query; some contain abstract or artistic representations of the content and some have artifacts. Such images degrade the performance of the on-the-fly constructed classifier. We propose a method for improving the performance of on-the-fly classifiers using transfer learning via attributes. We first map the textual query to a set of known attributes and then use those attributes to prune the set of images downloaded from the internet. This pruning step can be seen as zero-shot learning of the visual classifier for the textual user query, transferring knowledge from the attribute domain to the query domain. We also use the attributes along with the on-the-fly classifier to score the database images and obtain a hybrid ranking. We show interesting qualitative results and demonstrate through experiments on standard datasets that the proposed method improves upon the baseline on-the-fly classification system.
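A hedged sketch of the pruning and hybrid-ranking steps: downloaded images whose attribute scores disagree with the query's attribute profile are dropped before classifier training, and the final ranking blends classifier and attribute scores. The similarity measure, keep_frac, and alpha are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def prune_downloads(attr_scores, query_attrs, keep_frac=0.7):
    """attr_scores: (N, A) per-image attribute scores; query_attrs: (A,)
    attribute profile of the textual query. Returns indices of images kept."""
    sims = attr_scores @ query_attrs / (
        np.linalg.norm(attr_scores, axis=1) * np.linalg.norm(query_attrs) + 1e-8)
    order = np.argsort(sims)[::-1]                # most query-consistent first
    return order[: int(keep_frac * len(sims))]

def hybrid_rank(clf_scores, attr_sims, alpha=0.5):
    """Blend on-the-fly classifier scores with attribute similarities."""
    return alpha * clf_scores + (1 - alpha) * attr_sims
```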
{"title":"Transfer learning via attributes for improved on-the-fly classification","authors":"Praveen Kulkarni, Gaurav Sharma, J. Zepeda, Louis Chevallier","doi":"10.1109/WACV.2014.6836097","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836097","url":null,"abstract":"Retrieving images for an arbitrary user query, provided in textual form, is a challenging problem. A recently proposed method addresses this by constructing a visual classifier with images returned by an internet image search engine, based on the user query, as positive images while using a fixed pool of negative images. However, in practice, not all the images obtained from internet image search are always pertinent to the query; some might contain abstract or artistic representation of the content and some might have artifacts. Such images degrade the performance of on-the-fly constructed classifier. We propose a method for improving the performance of on-the-fly classifiers by using transfer learning via attributes. We first map the textual query to a set of known attributes and then use those attributes to prune the set of images downloaded from the internet. This pruning step can be seen as zero-shot learning of the visual classifier for the textual user query, which transfers knowledge from the attribute domain to the query domain. We also use the attributes along with the on-the-fly classifier to score the database images and obtain a hybrid ranking. We show interesting qualitative results and demonstrate by experiments with standard datasets that the proposed method improves upon the baseline on-the-fly classification system.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"168 1","pages":"220-226"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86887252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical filter selection for automatic visual inspection
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836110 | Pages: 123-128
Matthias Richter, J. Beyerer
The color of a material is one of the most frequently used features in automated visual inspection systems. While this suffices for many “easy” tasks, mixed and organic materials usually require more complex features. Spectral signatures, especially in the near-infrared range, have proven useful in many cases. However, hyperspectral imaging devices are still very costly and too slow for practical use. As a work-around, off-the-shelf cameras and optical filters are used to extract a few characteristic features from the spectra. Often, these filters are selected by a human expert in a time-consuming and error-prone process; surprisingly few works are concerned with the automatic selection of suitable filters. We approach this problem by casting filter selection as a feature selection problem. In contrast to existing techniques, which are mainly concerned with filter design, our approach explicitly selects the best out of a large set of given filters. Our method is most appealing in an industrial setting, where this set represents the (physically) available filters. We demonstrate our technique by implementing six different selection strategies and applying each to two real-world sorting problems.
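The casting of filter selection as feature selection can be sketched as follows: each candidate filter reduces a spectrum to one scalar response, and a greedy wrapper keeps the filters whose responses best separate the classes. The linear filter model, the classifier, and the scoring are assumptions; this is not one of the paper's six strategies.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def filter_responses(spectra, filters):
    """spectra: (N, W) sampled spectra; filters: (F, W) transmission curves.
    Returns (N, F): one scalar feature per candidate filter."""
    return spectra @ filters.T

def greedy_filter_selection(spectra, labels, filters, n_select=3):
    """Forward selection of the n_select filters with best CV accuracy."""
    X = filter_responses(spectra, filters)
    chosen = []
    for _ in range(n_select):
        best, best_score = None, -np.inf
        for f in range(X.shape[1]):
            if f in chosen:
                continue
            score = cross_val_score(LogisticRegression(max_iter=1000),
                                    X[:, chosen + [f]], labels, cv=3).mean()
            if score > best_score:
                best, best_score = f, score
        chosen.append(best)
    return chosen                                 # indices of selected filters
```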
{"title":"Optical filter selection for automatic visual inspection","authors":"Matthias Richter, J. Beyerer","doi":"10.1109/WACV.2014.6836110","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836110","url":null,"abstract":"The color of a material is one of the most frequently used features in automated visual inspection systems. While this is sufficient for many “easy” tasks, mixed and organic materials usually require more complex features. Spectral signatures, especially in the near infrared range, have been proven useful in many cases. However, hyperspectral imaging devices are still very costly and too slow to use them in practice. As a work-around, off-the-shelve cameras and optical filters are used to extract few characteristic features from the spectra. Often, these filters are selected by a human expert in a time consuming and error prone process; surprisingly few works are concerned with automatic selection of suitable filters. We approach this problem by stating filter selection as feature selection problem. In contrast to existing techniques that are mainly concerned with filter design, our approach explicitly selects the best out of a large set of given filters. Our method becomes most appealing for use in an industrial setting, when this selection represents (physically) available filters. We show the application of our technique by implementing six different selection strategies and applying each to two real-world sorting problems.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"2021 1","pages":"123-128"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87954008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding the 3D layout of a cluttered room from multiple images
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836035 | Pages: 690-697
Sid Ying-Ze Bao, A. Furlan, Li Fei-Fei, S. Savarese
We present a novel framework for robustly understanding the geometric and semantic structure of a cluttered room from a small number of images captured from different viewpoints. The tasks we address are: i) estimating the 3D layout of the room, that is, the 3D configuration of floor, walls, and ceiling; and ii) identifying and localizing all foreground objects in the room. We jointly use multiview geometry constraints and image appearance to identify the best room layout configuration. Extensive experimental evaluation demonstrates that our results are more complete and accurate, both in estimating the 3D room structure and in recognizing objects, than those of alternative state-of-the-art algorithms. In addition, we present an augmented-reality mobile application that highlights the high accuracy of our method, which may benefit many computer vision applications.
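Structurally, the joint use of geometry and appearance amounts to scoring layout hypotheses with a combined objective. A minimal sketch follows, with both scoring functions and the weight as placeholders for the paper's actual terms, injected as callables.

```python
def best_layout(hypotheses, geometry_score, appearance_score, w=0.5):
    """hypotheses: iterable of candidate room layouts.
    geometry_score, appearance_score: callables mapping a layout to a float
    (multiview-consistency and image-appearance terms, respectively)."""
    return max(hypotheses,
               key=lambda h: w * geometry_score(h) + (1 - w) * appearance_score(h))
```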
{"title":"Understanding the 3D layout of a cluttered room from multiple images","authors":"Sid Ying-Ze Bao, A. Furlan, Li Fei-Fei, S. Savarese","doi":"10.1109/WACV.2014.6836035","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836035","url":null,"abstract":"We present a novel framework for robustly understanding the geometrical and semantic structure of a cluttered room from a small number of images captured from different viewpoints. The tasks we seek to address include: i) estimating the 3D layout of the room - that is, the 3D configuration of floor, walls and ceiling; ii) identifying and localizing all the foreground objects in the room. We jointly use multiview geometry constraints and image appearance to identify the best room layout configuration. Extensive experimental evaluation demonstrates that our estimation results are more complete and accurate in estimating 3D room structure and recognizing objects than alternative state-of-the-art algorithms. In addition, we show an augmented reality mobile application to highlight the high accuracy of our method, which may be beneficial to many computer vision applications.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"27 1","pages":"690-697"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89065362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust optical flow estimation for continuous blurred scenes using RGB-motion imaging and directional filtering
Pub Date: 2014-03-24 | DOI: 10.1109/WACV.2014.6836022 | Pages: 792-799
Wenbin Li, Yang Chen, JeeHang Lee, Gang Ren, D. Cosker
Optical flow estimation is a difficult task given real-world video footage with camera and object blur. In this paper, we combine a 3D pose-and-position tracker with an RGB sensor, allowing us to capture video footage together with the 3D camera motion. We show that the additional camera motion information can be embedded into a hybrid optical flow framework by interleaving an iterative blind deconvolution with a warping-based minimization scheme. Such a hybrid framework significantly improves the accuracy of optical flow estimation in scenes with strong blur. Our approach yields improved overall performance against three state-of-the-art baseline methods on our proposed ground-truth sequences, as well as on several other real-world sequences captured by our novel imaging system.
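The interleaving described above can be sketched as an alternation between a motion-guided deblurring step and a warping-based flow update. Here deblur, estimate_flow, and warp are placeholders for the paper's blind-deconvolution and variational components, injected as callables; only the loop structure is depicted.

```python
def hybrid_flow(I0, I1, camera_motion, deblur, estimate_flow, warp, iters=5):
    """Alternate deblurring (guided by tracked 3D camera motion) with a
    warping-based flow refinement; returns the final flow field."""
    flow = None
    for _ in range(iters):
        J0 = deblur(I0, camera_motion)            # kernel informed by motion
        J1 = deblur(I1, camera_motion)
        flow = estimate_flow(J0, J1, init=flow)   # warping-based update
        I1 = warp(I1, flow)                       # re-warp before next pass
    return flow
```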
{"title":"Robust optical flow estimation for continuous blurred scenes using RGB-motion imaging and directional filtering","authors":"Wenbin Li, Yang Chen, JeeHang Lee, Gang Ren, D. Cosker","doi":"10.1109/WACV.2014.6836022","DOIUrl":"https://doi.org/10.1109/WACV.2014.6836022","url":null,"abstract":"Optical flow estimation is a difficult task given real-world video footage with camera and object blur. In this paper, we combine a 3D pose&position tracker with an RGB sensor allowing us to capture video footage together with 3D camera motion. We show that the additional camera motion information can be embedded into a hybrid optical flow framework by interleaving an iterative blind deconvolution and warping based minimization scheme. Such a hybrid framework significantly improves the accuracy of optical flow estimation in scenes with strong blur. Our approach yields improved overall performance against three state-of-the-art baseline methods applied to our proposed ground truth sequences, as well as in several other real-world sequences captured by our novel imaging system.","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":"108 1","pages":"792-799"},"PeriodicalIF":0.0,"publicationDate":"2014-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87611216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}