Improved Legibility of Text for Multiprojector Tiled Displays
Philip Tuddenham, P. Robinson
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383464
Displaying small text on large multiprojector tiled displays is challenging. Problems arise because text is badly affected by the image-warping techniques that these displays apply to rectify projector misalignment. As a consequence, there has been little progress with important large-display applications that require small text, such as collaborative tutoring or Web browsing. In this paper we present a new warping technique designed to preserve crisp text, based on recent work by Hereld and Stevens. Our technique produces good results, free of artifacts, when used in today's multiprojector displays. We evaluate the legibility of our technique against conventional interpolation-based warping and find that users prefer our technique. We describe an efficient and reusable implementation, and show how the increased legibility has allowed us to investigate two new applications.
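The core problem the abstract describes can be seen in a toy example: resampling a one-pixel-wide glyph stroke through an interpolation-based inverse warp spreads its intensity across neighbouring pixels, halving its contrast, whereas snapping each output pixel to the nearest source pixel keeps the stroke at full contrast. This numpy sketch is only an illustration of that trade-off, not the paper's actual technique:

```python
import numpy as np

def warp_rows(img, shift, mode):
    """Resample each row of `img` shifted by a subpixel amount.

    A toy stand-in for the inverse mapping used in tiled-display
    warping: "bilinear" blends neighbouring source pixels, while
    "nearest" snaps to the closest one and preserves full contrast.
    """
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for x in range(w):
        src = x - shift                 # inverse map: where this output pixel samples from
        if mode == "nearest":
            xs = int(round(src))
            if 0 <= xs < w:
                out[:, x] = img[:, xs]
        else:                           # bilinear along the row
            x0 = int(np.floor(src))
            a = src - x0
            if 0 <= x0 and x0 + 1 < w:
                out[:, x] = (1 - a) * img[:, x0] + a * img[:, x0 + 1]
    return out

# A one-pixel-wide white "stroke" on black, warped by half a pixel.
stroke = np.zeros((4, 9))
stroke[:, 4] = 1.0
blurred = warp_rows(stroke, 0.5, "bilinear")
crisp = warp_rows(stroke, 0.5, "nearest")
print(blurred.max(), crisp.max())   # bilinear halves the peak contrast
```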
On-line Simultaneous Learning and Tracking of Visual Feature Graphs
A. Declercq, J. Piater
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383435
Model learning and tracking are two important topics in computer vision. While there are many applications where one of them is used to support the other, there are currently only a few where both aid each other simultaneously. In this work, we seek to incrementally learn a graphical model from tracking and to simultaneously use whatever has been learned to improve the tracking in the next frames. The main problem encountered in this situation is that the current intermediate model may be inconsistent with future observations, creating a bias in the tracking results. We propose an uncertain model that explicitly accounts for such uncertainties by representing relations by an appropriately weighted sum of informative (parametric) and uninformative (uniform) components. The method is completely unsupervised and operates in real time.
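The weighted informative/uninformative mixture can be sketched in a few lines: a relation's likelihood is a weight-w Gaussian plus a (1-w) uniform, and the posterior responsibility of the Gaussian component tells us how much a new observation supports the learned (parametric) part of the model. The exact weighting scheme of the paper may differ; this is only the generic mixture mechanics:

```python
import math

def relation_likelihood(x, mu, sigma, w, lo, hi):
    """Likelihood under an 'uncertain' relation model: a weight-w
    Gaussian (informative, parametric) mixed with a (1-w) uniform
    (uninformative) component over [lo, hi]."""
    gauss = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    uniform = 1.0 / (hi - lo)
    return w * gauss + (1 - w) * uniform

def responsibility(x, mu, sigma, w, lo, hi):
    """Posterior probability that observation x came from the informative
    component -- usable to update w online as the model firms up."""
    g = w * math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    u = (1 - w) / (hi - lo)
    return g / (g + u)

# An observation near the mean is mostly explained by the Gaussian...
print(responsibility(0.0, 0.0, 1.0, 0.5, -10, 10) > 0.5)   # True
# ...while an outlier is absorbed by the uniform component instead,
# so it does not bias the parametric part of the model.
print(responsibility(8.0, 0.0, 1.0, 0.5, -10, 10) < 0.5)   # True
```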
Tracking-as-Recognition for Articulated Full-Body Human Motion Analysis
Patrick Peursum, S. Venkatesh, G. West
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383130
This paper addresses the problem of markerless tracking of a human in full 3D with a high-dimensional (29D) body model. Most work in this area has focused on achieving accurate tracking in order to replace marker-based motion capture, but does so at the cost of relying on relatively clean observing conditions. This paper takes a different perspective, proposing a body-tracking model that is explicitly designed to handle real-world conditions such as occlusions by scene objects, failure recovery, long-term tracking, auto-initialisation, generalisation to different people and integration with action recognition. To achieve these goals, an action's motions are modelled with a variant of the hierarchical hidden Markov model. The model is quantitatively evaluated with several tests, including comparison to the annealed particle filter, tracking different people and tracking with a reduced resolution and frame rate.
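The basic inference step underlying any (hierarchical) hidden Markov motion model is the forward pass, which scores how well an observation sequence fits the learned dynamics. A minimal flat-HMM sketch (the paper's hierarchical variant adds structure on top of this):

```python
import numpy as np

def forward(pi, A, B, obs):
    """HMM forward pass: likelihood of an observation sequence.
    pi: initial state probabilities, A: state transition matrix,
    B: emission probabilities (states x symbols), obs: symbol indices."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

# Two-state toy model with sticky transitions: each state mostly
# re-emits its own symbol, so smooth motions score higher than jumpy ones.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.1, 0.9]])
p_consistent = forward(pi, A, B, [0, 0, 0])
p_jumpy = forward(pi, A, B, [0, 1, 0])
print(p_consistent > p_jumpy)  # → True
```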
Flexible Object Models for Category-Level 3D Object Recognition
Akash M. Kushal, C. Schmid, J. Ponce
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383149
Today's category-level object recognition systems largely focus on fronto-parallel views of objects with characteristic texture patterns. To overcome these limitations, we propose a novel framework for visual object recognition where object classes are represented by assemblies of partial surface models (PSMs) obeying loose local geometric constraints. The PSMs themselves are formed of dense, locally rigid assemblies of image features. Since our model only enforces local geometric consistency, both at the level of model parts and at the level of individual features within the parts, it is robust to viewpoint changes and intra-class variability. The proposed approach has been implemented, and it outperforms the state-of-the-art algorithms for object detection and localization recently compared in [14] on the Pascal 2005 VOC Challenge Cars Test 1 data.
Research issues in image registration for remote sensing
R. Eastman, J. L. Moigne, N. Netanyahu
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383423
Image registration is an important element in data processing for remote sensing, with many applications and a wide range of solutions. Despite considerable investigation, the field has not settled on a definitive solution for most applications, and a number of questions remain open. This article looks at selected research issues by surveying the experience of operational satellite teams, application-specific requirements for Earth science, and our experiments in the evaluation of image registration algorithms, with emphasis on the comparison of algorithms for subpixel accuracy. We conclude that remote sensing applications put particular demands on image registration algorithms to take into account domain-specific knowledge of geometric transformations and image content.
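One classic translation-registration building block of the kind such evaluations compare is phase correlation, which recovers the shift between two images from the phase of their cross-power spectrum. A minimal integer-shift sketch (operational pipelines add windowing, subpixel peak fitting and outlier checks, which this omits):

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer translation taking image b to image a
    by phase correlation on equally sized arrays."""
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    F /= np.abs(F) + 1e-12          # keep only the phase
    corr = np.fft.ifft2(F).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap large indices around to negative shifts
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return int(dy), int(dx)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, (3, -5), axis=(0, 1))
print(phase_correlation_shift(shifted, img))  # → (3, -5)
```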
Projector Calibration using Arbitrary Planes and Calibrated Camera
M. Kimura, M. Mochimaru, T. Kanade
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383477
In this paper, an easy calibration method for projectors is proposed. The calibration addressed here is the projective relation between 3D space and the 2D pattern, not the correction of trapezoidal distortion in the projected pattern. In projector-camera systems, especially those for 3D measurement, such calibration is the basis of the whole process. The projection from a projector can be modeled as the inverse projection of a pinhole camera, which is generally considered a perspective projection. In existing systems, special objects or devices are often used to calibrate the projector, so that the 3D-2D projection map can be measured for typical camera calibration methods. The proposed method instead exploits the projective geometry between camera and projector, so it requires only a pre-calibrated camera and a plane. It is easy to practice, easy to compute, and reasonably accurate.
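The projective relation between a projected pattern and its image of the calibration plane is a plane-to-plane homography. The core algebra is the Direct Linear Transform: stack two linear constraints per point pair and take the SVD null vector. This sketch shows only that generic step, with synthetic correspondences standing in for projected-pattern corners and their camera detections:

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT estimate of the 3x3 homography mapping src -> dst
    (Nx2 arrays, N >= 4 point correspondences)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)       # null vector = H up to scale
    return H / H[2, 2]

# Sanity check: recover a known homography from exact correspondences.
H_true = np.array([[1.2, 0.1, 5.0], [0.0, 0.9, -3.0], [0.001, 0.0, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 40]], dtype=float)
pts = np.c_[src, np.ones(len(src))] @ H_true.T
dst = pts[:, :2] / pts[:, 2:]       # perspective divide
H = estimate_homography(src, dst)
print(np.allclose(H, H_true, atol=1e-6))  # → True
```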
Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts
S. Fidler, A. Leonardis
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383269
This paper proposes a novel approach to constructing a hierarchical representation of visual input that aims to enable recognition and detection of a large number of object categories. Inspired by the principles of efficient indexing (bottom-up), robust matching (top-down), and ideas of compositionality, our approach learns a hierarchy of spatially flexible compositions, i.e. parts, in an unsupervised, statistics-driven manner. Starting with simple, frequent features, we learn the statistically most significant compositions (parts composed of parts), which consequently define the next layer. Parts are learned sequentially, layer after layer, optimally adjusting to the visual data. Lower layers are learned in a category-independent way to obtain complex, yet sharable visual building blocks, which is a crucial step towards a scalable representation. Higher layers of the hierarchy, on the other hand, are constructed by using specific categories, achieving a category representation with a small number of highly generalizable parts that gained their structural flexibility through composition within the hierarchy. Built in this way, new categories can be efficiently and continuously added to the system by adding a small number of parts only in the higher layers. The approach is demonstrated on a large collection of images and a variety of object categories. Detection results confirm the effectiveness and robustness of the learned parts.
Fast Sparse Gaussian Processes Learning for Man-Made Structure Classification
Hang Zhou, D. Suter
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383441
The Informative Vector Machine (IVM) is an efficient sparse Gaussian process (GP) method previously suggested for active learning. It greatly reduces the computational cost of GP classification and brings GP learning close to real time. We apply the IVM to man-made structure classification (a two-class problem). Our work investigates the performance of the IVM with varying numbers of active data points, as well as the effects of different choices of GP kernels. Satisfactory results have been obtained, showing that the approach retains full GP classification performance and yet is significantly faster (by virtue of using a subset of the whole training data).
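The idea behind the IVM's active-set selection can be illustrated in the regression setting, where the entropy-based score reduces to picking the point with the largest posterior predictive variance given the points chosen so far. The real IVM handles classification via assumed-density-filtering updates; this greedy regression analogue only shows why the selected subset spreads out to cover the input space:

```python
import numpy as np

def rbf(X1, X2, ls=1.0):
    """Squared-exponential kernel between two sets of points."""
    d = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d / ls ** 2)

def greedy_select(X, m, noise=0.1):
    """Greedily pick m 'informative' points: at each step take the
    point with the largest GP posterior variance given those already
    chosen (ties broken by lowest index)."""
    chosen = []
    for _ in range(m):
        if chosen:
            Kss = rbf(X[chosen], X[chosen]) + noise * np.eye(len(chosen))
            Ks = rbf(X, X[chosen])
            var = 1.0 - np.einsum('ij,ij->i', Ks @ np.linalg.inv(Kss), Ks)
        else:
            var = np.ones(len(X))
        var[chosen] = -np.inf          # never re-select a chosen point
        chosen.append(int(np.argmax(var)))
    return chosen

X = np.linspace(0, 10, 101)[:, None]
picks = greedy_select(X, 5)
print(sorted(picks))   # indices spread across [0, 10] rather than clustered
```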
Kernel-based Tracking from a Probabilistic Viewpoint
Q. A. Nguyen, A. Robles-Kelly, Chunhua Shen
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383240
In this paper, we present a probabilistic formulation of kernel-based tracking methods based upon maximum likelihood estimation. To this end, we view the coordinates of the pixels in both the target model and its candidate as random variables and make use of a generative model so as to cast the tracking task into a maximum likelihood framework. This, in turn, permits the use of the EM algorithm to estimate a set of latent variables that can be used to update the target-center position. Once the latent variables have been estimated, we use the Kullback-Leibler divergence so as to minimise the mutual information between the target model and candidate distributions in order to develop a target-center update rule and a kernel bandwidth adjustment scheme. The method is very general in nature. We illustrate the utility of our approach for purposes of tracking on real-world video sequences using two alternative kernel functions.
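The shape of such a target-center update is a weighted mean over the kernel's support, iterated until convergence. In this toy sketch the pixel intensities stand in for the latent-variable weights the paper estimates with EM, so it is only an illustration of the update's form, not of the paper's method:

```python
import numpy as np

def kernel_track_step(frame, center, radius):
    """One center update: pixels inside the kernel support are weighted
    (here simply by intensity) and the new center is their weighted mean."""
    ys, xs = np.mgrid[0:frame.shape[0], 0:frame.shape[1]]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    mask = d2 <= radius ** 2          # flat circular kernel support
    w = frame * mask
    total = w.sum()
    if total == 0:
        return center                 # nothing under the kernel: stay put
    return (ys * w).sum() / total, (xs * w).sum() / total

# A Gaussian blob at (30, 40); start the kernel slightly off-target.
ys, xs = np.mgrid[0:64, 0:64]
frame = np.exp(-((ys - 30) ** 2 + (xs - 40) ** 2) / 50.0)
c = (26.0, 36.0)
for _ in range(10):
    c = kernel_track_step(frame, c, 12)
print(round(float(c[0])), round(float(c[1])))  # → 30 40
```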
Image Hallucination Using Neighbor Embedding over Visual Primitive Manifolds
Wei-liang Fan, D. Yeung
2007 IEEE Conference on Computer Vision and Pattern Recognition
Pub Date: 2007-06-17 | DOI: 10.1109/CVPR.2007.383001
In this paper, we propose a novel learning-based method for image hallucination, with image super-resolution being a specific application that we focus on here. Given a low-resolution image, its underlying higher-resolution details are synthesized based on a set of training images. In order to build a compact yet descriptive training set, we investigate the characteristic local structures contained in large volumes of small image patches. Inspired by progress in manifold learning research, we adopt the assumption that small image patches in the low-resolution and high-resolution images form manifolds with similar local geometry in the corresponding image feature spaces. This assumption leads to a super-resolution approach which reconstructs the feature vector corresponding to an image patch from its neighbors in the feature space. In addition, the residual errors associated with the reconstructed image patches are also estimated to compensate for the information loss in the local averaging process. Experimental results show that our hallucination method can synthesize higher-quality images compared with other methods.
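The neighbor-embedding step this family of methods shares (following LLE-style manifold assumptions) solves a small constrained least-squares problem: find weights summing to one that best reconstruct a low-resolution patch's feature vector from its K nearest training patches, then apply the same weights to the corresponding high-resolution patches. A minimal sketch of the weight solve, not the full pipeline:

```python
import numpy as np

def embedding_weights(x, neighbors):
    """Weights w (sum(w) = 1) minimising ||x - w @ neighbors||^2,
    where `neighbors` holds the K nearest feature vectors as rows."""
    Z = neighbors - x                 # shift neighbors to the query point
    G = Z @ Z.T                       # local Gram matrix
    G += 1e-8 * np.trace(G) * np.eye(len(G))  # regularise if singular
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                # enforce the sum-to-one constraint

# Toy check: a point that is an exact convex blend of its neighbors.
nbrs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = 0.2 * nbrs[0] + 0.5 * nbrs[1] + 0.3 * nbrs[2]
w = embedding_weights(x, nbrs)
print(np.round(w, 3))                       # ≈ [0.2 0.5 0.3]
print(np.allclose(w @ nbrs, x, atol=1e-4))  # → True
```

In the super-resolution setting the recovered weights would be applied to the high-resolution counterparts of `nbrs`, which is what transfers the local manifold geometry across resolutions.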