Camera Sensor Model for Visual SLAM
Jing Wu, Hong Zhang. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.14
In this paper, we present a technique for constructing a camera sensor model for visual SLAM. The proposed method extends the standard camera calibration procedure and requires the camera to observe a planar checkerboard pattern shown at different orientations. By placing the pattern at a series of distances from the camera, we can establish a relationship between the measurement noise covariance matrix and range. Based on Geary's test, we conclude that the error distribution of a camera sensor is Gaussian, and that the magnitude of the error variance is linearly related to the range between the camera and the observed features. Our sensor model can benefit visual SLAM algorithms by letting the measurement noise covariance matrix vary with range.
Finite Generalized Gaussian Mixture Modeling and Applications to Image and Video Foreground Segmentation
M. S. Allili, N. Bouguila, D. Ziou. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1117/1.2898125
In this paper, we propose a finite mixture model of generalized Gaussian distributions (GGD) for robust segmentation and data modeling in the presence of noise and outliers. The model is more flexible in adapting to the shape of the data, and less prone to over-fitting the number of classes, than the Gaussian mixture. In the first part of this work, we derive the maximum-likelihood estimates of the parameters of the new mixture model and propose an information-theoretic approach for selecting the number of classes. In the second part, we present applications to image, motion, and foreground segmentation that measure the performance of the new model in image data modeling, with comparison to the Gaussian mixture.
Non-Uniform Hierarchical Geo-consistency for Multi-baseline Stereo
M. Drouin, Martin Trudeau, S. Roy. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.47
We propose a new and flexible hierarchical multi-baseline stereo algorithm that features a non-uniform spatial decomposition of the disparity map. The visibility computation and refinement of the disparity map are integrated into a single iterative framework that does not add extra constraints to the cost function. This makes it possible to use a standard efficient stereo matcher during each iteration. The level of refinement is increased automatically where it is needed in order to preserve a good localization of boundaries. While two graph-theoretic stereo matchers are used in our experiments, our framework is general enough to be applied to many others. The validity of our framework is demonstrated using real imagery with ground truth.
Efficient indexing for strongly similar subimage retrieval
G. Roth, W. Scott. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.24
Strongly similar subimages contain different views of the same object. In subimage search, the user selects an image region and the retrieval system attempts to find strongly similar matching subimages in an image database. Solutions have been proposed using salient features, or "interest points", that have associated descriptor vectors. However, searching large image databases by exhaustive comparison of interest point descriptors is not feasible. To solve this problem, we propose a novel off-line indexing scheme based on the most significant bits (MSBs) of these descriptors. On-line search uses this index file to limit the search to interest points whose descriptors have the same MSB value, a process up to three orders of magnitude faster than exhaustive search. The scheme is also incremental: the index file for a union of image groups can be created by merging the index files of the individual groups. The effectiveness of the approach is demonstrated experimentally on a variety of image databases.
Automatic Annotation of Humans in Surveillance Video
D. Hansen, B. K. Mortensen, P. Duizer, Jens R. Andersen, T. Moeslund. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.12
In this paper we present a system for automatic annotation of humans passing a surveillance camera. Each human has three associated annotations: the primary colors of the clothing, the height, and the focus of attention. Annotation occurs after robust background subtraction based on a codebook representation. The primary colors of the clothing are estimated by grouping similar pixels according to a body model. The height is estimated from a 3D mapping using the head and feet. Lastly, the focus of attention is defined as the overall direction of the head, which is estimated from changes in intensity at four different positions. Results show successful detection, and hence successful annotation, for most test sequences.
Screen-Camera Calibration using a Spherical Mirror
Yannick Francken, C. Hermans, P. Bekaert. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.59
Developments in the consumer market indicate that the average user of a personal computer is likely to own a webcam as well. With the emergence of this user group will come a new set of applications that require a user-friendly way to calibrate the position of the camera with respect to the screen. This paper presents a fully automatic method to calibrate a screen-camera setup using a single moving spherical mirror. Unlike other methods, our algorithm needs no user intervention other than moving the spherical mirror around. In addition, if the user provides the algorithm with the exact radius of the sphere in millimeters, the scale of the computed solution is uniquely defined.
Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction
Junaed Sattar, Eric Bourque, P. Giguère, G. Dudek. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.34
In this paper we introduce the Fourier tag, a synthetic fiducial marker used to visually encode information and provide controllable positioning. The Fourier tag is a synthetic target, akin to a bar code, that encodes multi-bit information which can be detected efficiently and robustly in an image. Moreover, the Fourier tag has the beneficial property that the length of the bit string it encodes varies with the distance between the camera and the target, since the effective resolution decreases under perspective. This paper introduces the Fourier tag, describes its design, and illustrates its properties experimentally.
Spatial Topology Graphs for Feature-Minimal Correspondence
Z. Tauber, Ze-Nian Li, M. S. Drew. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.60
Multiview image matching methods typically require feature point correspondences. We propose a novel spatial topology method that represents the space with a set of connected projective-invariant features. Isolated features such as corners typically cannot be matched reliably, so either limitations are imposed on viewpoint changes or projective-invariant descriptions are needed, and discovering the fundamental matrix by stochastic optimization requires a large number of features. In contrast, our enhanced feature set models connectivity in space, forming a unique configuration that can be matched with few features and over large viewpoint changes. Our features are derived from edges, their curvatures, and neighborhood relationships. A probabilistic spatial topology graph models the space using these features, and a second graph represents the neighborhood relationships. Probabilistic graph matching is used to find feature correspondences. Our results show robust feature detection and an average 80% discovery rate of feature matches.
Efficient Registration of 3D SPHARM Surfaces
Li Shen, Heng Huang, F. Makedon, A. Saykin. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.26
We present SHREC, an efficient algorithm for registering 3D SPHARM (spherical harmonic) surfaces. SHREC follows the iterative closest point (ICP) registration strategy, alternately improving the surface correspondence and adjusting the object pose. It establishes the surface correspondence by aligning the underlying SPHARM parameterization, employing a rotational property of the harmonic expansion to accelerate the parameterization-rotation step, and using a hierarchical icosahedral sampling of the rotation space to search for the parameterization that best matches the template. Our experimental results show that SHREC not only produces more accurate registrations than previous methods but also does so efficiently. SHREC is a simple, efficient, and general registration method with great potential for use in many shape modeling and analysis applications.
Real-time eye blink detection with GPU-based SIFT tracking
M. Lalonde, David Byrns, L. Gagnon, N. Teasdale, D. Laurendeau. Fourth Canadian Conference on Computer and Robot Vision (CRV '07), 2007. doi:10.1109/CRV.2007.54
This paper reports on the implementation of a GPU-based, real-time eye blink detector for very low contrast images acquired under near-infrared illumination. This detector is part of a multi-sensor data acquisition and analysis system for driver performance assessment and training. Eye blinks are detected inside regions of interest that are aligned with the subject's eyes at initialization. Alignment is maintained through time by tracking SIFT feature points that are used to estimate the affine transformation between the initial face pose and the pose in subsequent frames. The GPU implementation of the SIFT feature point extraction algorithm ensures real-time processing. An eye blink detection rate of 97% is obtained on a video dataset of 33,000 frames showing 237 blinks from 22 subjects.