Fusion of static and dynamic body biometrics for gait recognition
Pub Date: 2004-02-01 | DOI: 10.1109/ICCV.2003.1238660
Liang Wang, Huazhong Ning, T. Tan, Weiming Hu
Human identification at a distance has recently gained growing interest among computer vision researchers. This paper proposes a visual recognition algorithm based on the fusion of static and dynamic body biometrics. For each sequence containing a walking figure, pose changes of the segmented moving silhouettes are represented as an associated sequence of complex vector configurations, which are then analyzed with the Procrustes shape analysis method to obtain a compact appearance representation, called the static information of the body. In addition, a model-based approach under a Condensation framework is presented to track the walker and recover the joint-angle trajectories of the lower limbs, called the dynamic information of gait. The static and dynamic cues are each used for recognition with a nearest-exemplar classifier, and are also fused at the decision level using different combination rules to improve both identification and verification performance. Experimental results on a dataset of 20 subjects demonstrate the validity of the proposed algorithm.
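As context for the static cue, here is a minimal sketch (not the authors' implementation) of encoding a silhouette boundary as a complex vector and comparing two shapes with the full Procrustes distance; the fixed-count index resampling is a simplifying assumption made here.

```python
import numpy as np

def shape_vector(boundary_xy: np.ndarray, n_points: int = 64) -> np.ndarray:
    """Resample a silhouette boundary (k, 2) to n_points and encode as z = x + iy."""
    idx = np.linspace(0, len(boundary_xy) - 1, n_points).astype(int)
    pts = boundary_xy[idx]
    return pts[:, 0] + 1j * pts[:, 1]

def full_procrustes_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Distance invariant to translation, scale, and rotation of either shape."""
    u = u - u.mean()                      # remove translation
    v = v - v.mean()
    num = np.abs(np.vdot(u, v)) ** 2      # |<u, v>|^2; vdot conjugates its first arg
    den = np.vdot(u, u).real * np.vdot(v, v).real
    return float(1.0 - num / den)         # 0 when the shapes match up to similarity
```

A nearest-exemplar classifier would then assign a probe to the subject whose stored mean shape minimizes this distance.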
{"title":"Fusion of static and dynamic body biometrics for gait recognition","authors":"Liang Wang, Huazhong Ning, T. Tan, Weiming Hu","doi":"10.1109/ICCV.2003.1238660","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238660","url":null,"abstract":"Human identification at a distance has recently gained growing interest from computer vision researchers. This paper aims to propose a visual recognition algorithm based upon fusion of static and dynamic body biometrics. For each sequence involving a walking figure, pose changes of the segmented moving silhouettes are represented as an associated sequence of complex vector configurations, and are then analyzed using the Procrustes shape analysis method to obtain a compact appearance representation, called static information of body. Also, a model-based approach is presented under a condensation framework to track the walker and to recover joint-angle trajectories of lower limbs, called dynamic information of gait. Both static and dynamic cues are respectively used for recognition using the nearest exemplar classifier. They are also effectively fused on decision level using different combination rules to improve the performance of both identification and verification. Experimental results on a dataset including 20 subjects demonstrate the validity of the proposed algorithm.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115921936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dealing with textureless regions and specular highlights - a progressive space carving scheme using a novel photo-consistency measure
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238399
Ruigang Yang, M. Pollefeys, G. Welch
We present two extensions to the space carving framework. The first is a progressive scheme that better reconstructs surfaces lacking sufficient texture. The second is a novel photo-consistency measure that is valid for both specular and diffuse surfaces under unknown lighting conditions.
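The paper's specular-tolerant measure is not reproduced here; for contrast, below is a sketch of the classic variance-based Lambertian consistency test that standard space carving uses and this work improves on. The threshold tau is an assumed free parameter.

```python
import numpy as np

def lambertian_consistency(colors: np.ndarray, tau: float = 20.0) -> bool:
    """colors: (n_views, 3) RGB samples of one voxel's projections.
    A Lambertian surface point should look the same from every view that sees it,
    so the voxel is kept only while the per-channel spread stays below tau."""
    return bool(np.sqrt(np.var(colors, axis=0).mean()) < tau)
```

This test fails on specular highlights, where view-dependent color variation is expected; that failure is precisely what motivates the paper's alternative measure.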
{"title":"Dealing with textureless regions and specular highlights - a progressive space carving scheme using a novel photo-consistency measure","authors":"Ruigang Yang, M. Pollefeys, G. Welch","doi":"10.1109/ICCV.2003.1238399","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238399","url":null,"abstract":"We present two extensions to the space carving framework. The first is a progressive scheme to better reconstruct surfaces lacking sufficient textures. The second is a novel photo-consistency measure that is valid for both specular and diffuse surfaces, under unknown lighting conditions.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"4 3-4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121002839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nonmetric lens distortion calibration: closed-form solutions, robust estimation and model selection
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238396
M. El-Melegy, A. Farag
We address the problem of calibrating camera lens distortion, which can be significant in medium- to wide-angle lenses. While almost all existing nonmetric distortion calibration methods require user involvement in one form or another, we present an automatic approach based on the robust least-median-of-squares (LMedS) estimator. Our approach is therefore less sensitive to erroneous input data, such as image curves mistakenly treated as projections of 3D line segments. It uniquely uses fast, closed-form solutions for the distortion coefficients, which serve as an initial point for a nonlinear optimization algorithm that straightens imaged lines. Moreover, we propose a method for distortion model selection based on geometric inference. Successful experiments evaluating the performance of this approach on synthetic and real data are reported.
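The closed-form initialization is specific to the paper, but the LMedS principle it builds on can be sketched as follows: score each candidate radial distortion coefficient by the median straightness error of the undistorted curves, so up to half of the input curves may be spurious without corrupting the estimate. The one-coefficient radial model, the grid search, and the grid range below are simplifying assumptions.

```python
import numpy as np

def undistort(pts: np.ndarray, k1: float, center: np.ndarray) -> np.ndarray:
    """Single-coefficient radial model: p_u = c + (p_d - c) * (1 + k1 * r^2)."""
    d = pts - center
    r2 = (d ** 2).sum(axis=1, keepdims=True)
    return center + d * (1.0 + k1 * r2)

def line_residual(pts: np.ndarray) -> float:
    """RMS distance to the total-least-squares line through pts (via SVD)."""
    q = pts - pts.mean(axis=0)
    normal = np.linalg.svd(q)[2][-1]        # smallest singular direction = line normal
    return float(np.sqrt(((q @ normal) ** 2).mean()))

def lmeds_k1(curves, center, k1_grid=np.linspace(-1e-6, 1e-6, 201)) -> float:
    """Pick the k1 minimizing the MEDIAN residual over all curves: robust to
    curves that are not actually images of straight 3D line segments."""
    medians = [np.median([line_residual(undistort(c, k1, center)) for c in curves])
               for k1 in k1_grid]
    return float(k1_grid[int(np.argmin(medians))])
```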
{"title":"Nonmetric lens distortion calibration: closed-form solutions, robust estimation and model selection","authors":"M. El-Melegy, A. Farag","doi":"10.1109/ICCV.2003.1238396","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238396","url":null,"abstract":"We address the problem of calibrating camera lens distortion, which can be significant in medium to wide angle lenses. While almost all existing nonmetric distortion calibration methods need user involvement in one form or another, we present an automatic approach based on the robust the-least-median-of-squares (LMedS) estimator. Our approach is thus less sensitive to erroneous input data such as image curves that are mistakenly considered as projections of 3D linear segments. Our approach uniquely uses fast, closed-form solutions to the distortion coefficients, which serve as an initial point for a nonlinear optimization algorithm to straighten imaged lines. Moreover we propose a method for distortion model selection based on geometrical inference. Successful experiments to evaluate the performance of this approach on synthetic and real data are reported.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115189103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Plane-based calibration algorithm for multi-camera systems via factorization of homography matrices
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238453
T. Ueshiba, F. Tomita
A new calibration algorithm for multicamera systems using a planar reference pattern is proposed. The algorithm extends the Sturm-Maybank-Zhang plane-based calibration technique to multiple cameras. Both the rigid displacements between the cameras and the intrinsic parameters are recovered simply by having the cameras capture a model plane, with known reference points, placed at three or more locations. The algorithm thus yields a simple means of calibrating stereo vision systems with an arbitrary number of cameras while maintaining the handiness and flexibility of the original method. It is based on factorizing the homography matrices between the model and image planes into camera and plane parameters. To resolve the indeterminacy of the scaling factors, each homography matrix is rescaled by a double eigenvalue of a planar homology defined by two views and two model planes. The resulting parameters are finally refined by nonlinear maximum likelihood estimation (MLE). The validity of the proposed technique was verified through simulations and experiments with real data.
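The multi-camera factorization itself is involved; as background, here is a sketch of the single-camera plane-based step the method builds on: each model-to-image homography H = [h1 h2 h3] yields two linear constraints on B = K^-T K^-1, from which the intrinsics K follow. The constraint rows are the standard Sturm/Zhang ones, but the helper names and the sign heuristic are ours.

```python
import numpy as np

def _v(H: np.ndarray, i: int, j: int) -> np.ndarray:
    """Row v_ij with v_ij . b = h_i^T B h_j, where b = (B11,B12,B22,B13,B23,B33)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0]*hj[0],
                     hi[0]*hj[1] + hi[1]*hj[0],
                     hi[1]*hj[1],
                     hi[2]*hj[0] + hi[0]*hj[2],
                     hi[2]*hj[1] + hi[1]*hj[2],
                     hi[2]*hj[2]])

def intrinsics_from_homographies(Hs) -> np.ndarray:
    """Stack h1^T B h2 = 0 and h1^T B h1 = h2^T B h2 for >= 3 planes,
    solve V b = 0 by SVD, then factor B = K^-T K^-1 via Cholesky."""
    V = np.vstack([np.vstack([_v(H, 0, 1), _v(H, 0, 0) - _v(H, 1, 1)]) for H in Hs])
    b = np.linalg.svd(V)[2][-1]
    B = np.array([[b[0], b[1], b[3]],
                  [b[1], b[2], b[4]],
                  [b[3], b[4], b[5]]])
    if B[0, 0] < 0:                  # b is defined up to sign; make B positive definite
        B = -B
    L = np.linalg.cholesky(B)        # B = L L^T  =>  K^-1 = L^T, up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```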
{"title":"Plane-based calibration algorithm for multi-camera systems via factorization of homography matrices","authors":"T. Ueshiba, F. Tomita","doi":"10.1109/ICCV.2003.1238453","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238453","url":null,"abstract":"A new calibration algorithm for multicamera systems using a planar reference pattern is proposed. The algorithm is an extension of Sturm-Maybank-Zhang style plane-based calibration technique for use with multiple cameras. Rigid displacements between the cameras are recovered as well as the intrinsic parameters only by capturing with the cameras a model plane with known reference points placed at three or more locations. Thus the algorithm yields a simple calibration means for stereo vision systems with an arbitrary number of cameras while maintaining the handiness and flexibility of the original method. The algorithm is based on factorization of homography matrices between the model and image planes into the camera and plane parameters. To compensate for the indetermination of scaling factors, each homography matrix is rescaled by a double eigenvalue of a planar homology defined by two views and two model planes. The obtained parameters are finally refined by a nonlinear maximum likelihood estimation (MLE) process. The validity of the proposed technique was verified through simulation and experiments with real data.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116117438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Variable bandwidth QMDPE and its application in robust optical flow estimation
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238337
Hanzi Wang, D. Suter
Robust estimators such as least median of squared residuals (LMedS), M-estimators, and least trimmed squares (LTS) have been employed in recent years to estimate optical flow from image sequences. However, these estimators have a breakdown point of no more than 50%. We propose a novel robust estimator, the variable bandwidth quick maximum density power estimator (vbQMDPE), which can tolerate more than 50% outliers, and apply it to robust optical flow estimation. Our method yields better results than most other recently proposed methods and has the potential to better handle multiple-motion effects.
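The exact vbQMDPE score is not reproduced here; the sketch below only illustrates the underlying idea that lets such estimators exceed the 50% breakdown point: rank candidate models by the kernel density of their residuals at the mode near zero, rather than by a median, so a tight fit to a minority of inliers can still win. The Gaussian KDE and the mode-distance penalty are our own simplifications.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mode_score(residuals: np.ndarray) -> float:
    """Density of the residuals at their mode, penalized by how far the mode
    sits from zero. High score = tight residual cluster near 0 (many inliers)."""
    kde = gaussian_kde(residuals)          # bandwidth chosen from the data
    grid = np.linspace(residuals.min(), residuals.max(), 512)
    dens = kde(grid)
    mode = grid[np.argmax(dens)]
    return float(dens.max() / (1.0 + abs(mode)))   # penalty form is an assumption

def best_fit(candidate_models, residual_fn, data):
    """Pick the candidate whose residual density peaks highest nearest zero,
    even when a majority of the data points are outliers to that model."""
    return max(candidate_models, key=lambda m: mode_score(residual_fn(m, data)))
```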
{"title":"Variable bandwidth QMDPE and its application in robust optical flow estimation","authors":"Hanzi Wang, D. Suter","doi":"10.1109/ICCV.2003.1238337","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238337","url":null,"abstract":"Robust estimators, such as least median of squared (LMedS) residuals, M-estimators, the least trimmed squares (LTS) etc., have been employed to estimate optical flow from image sequences in recent years. However, these robust estimators have a breakdown point of no more than 50%. We propose a novel robust estimator, called variable bandwidth quick maximum density power estimator (vbQMDPE), which can tolerate more than 50% outliers. We apply the novel proposed estimator to robust optical flow estimation. Our method yields better results than most other recently proposed methods, and it has the potential to better handle multiple motion effects.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122728116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mirrors in motion: epipolar geometry and motion estimation
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238426
Christopher Geyer, Kostas Daniilidis
In this paper we consider images taken by pairs of parabolic catadioptric cameras separated by discrete motions. Despite the nonlinearity of the projection model, the epipolar geometry arising from such a system can, as in the perspective case, be encoded in a bilinear form: the catadioptric fundamental matrix. We show that all such matrices have equal Lorentzian singular values and define a nine-dimensional manifold in the space of 4 × 4 matrices. Furthermore, this manifold can be identified with a quotient of two Lie groups. We present a method for estimating a matrix in this space, and thereby an estimate of the motion. We show that the estimation procedure is robust to modest deviations from the ideal assumptions.
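A hedged sketch of the bilinear structure: parabolic catadioptric image points can be lifted to 4-vectors that are null with respect to the Lorentzian metric diag(1, 1, 1, -1), and the catadioptric fundamental matrix F acts bilinearly on the lifted points. The circle-space lifting below is one common convention; the paper's exact coordinates may differ, and F here is only a placeholder.

```python
import numpy as np

def lift(x: float, y: float) -> np.ndarray:
    """Lift an image point to circle-space coordinates; the result is a
    null vector of the Lorentzian form diag(1, 1, 1, -1)."""
    s = x * x + y * y
    return np.array([x, y, (s - 1.0) / 2.0, (s + 1.0) / 2.0])

L = np.diag([1.0, 1.0, 1.0, -1.0])        # Lorentzian metric

def epipolar_residual(F: np.ndarray, p1, p2) -> float:
    """Bilinear epipolar constraint: lift(p2)^T F lift(p1) = 0 for a
    correct 4 x 4 catadioptric fundamental matrix F."""
    return float(lift(*p2) @ F @ lift(*p1))

# Sanity check of the null property: q^T L q == 0 for any lifted point.
q = lift(0.3, -0.7)
assert abs(q @ L @ q) < 1e-12
```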
{"title":"Mirrors in motion: epipolar geometry and motion estimation","authors":"Christopher Geyer, Kostas Daniilidis","doi":"10.1109/ICCV.2003.1238426","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238426","url":null,"abstract":"In this paper we consider the images taken from pairs of parabolic catadioptric cameras separated by discrete motions. Despite the nonlinearity of the projection model, the epipolar geometry arising from such a system, like the perspective case, can be encoded in a bilinear form, the catadioptric fundamental matrix. We show that all such matrices have equal Lorentzian singular values, and they define a nine-dimensional manifold in the space of 4 /spl times/ 4 matrices. Furthermore, this manifold can be identified with a quotient of two Lie groups. We present a method to estimate a matrix in this space, so as to obtain an estimate of the motion. We show that the estimation procedures are robust to modest deviations from the ideal assumptions.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126904170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust regression with projection based M-estimators
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238441
Haifeng Chen, P. Meer
Robust regression techniques in the RANSAC family are popular in computer vision today, but their performance depends on a user-supplied threshold. We eliminate this drawback of RANSAC by reformulating another robust method, the M-estimator, as a projection pursuit optimization problem. The projection-based pbM-estimator derives the threshold automatically from univariate kernel density estimates. Its performance nevertheless equals or exceeds that of RANSAC techniques tuned to the optimal threshold, a value never available in practice. Experiments were performed with both synthetic and real data on affine motion and fundamental matrix estimation tasks.
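A minimal sketch of the projection pursuit idea (the real pbM-estimator uses a more careful search strategy and bandwidth rule): for hyperplane fitting x . theta = alpha, scan candidate unit directions, build a kernel density estimate of the 1-D projections, and keep the direction with the strongest density peak; the peak location gives alpha, and the data-driven bandwidth plays the role of the threshold RANSAC requires from the user.

```python
import numpy as np
from scipy.stats import gaussian_kde

def pbm_fit(X: np.ndarray, n_trials: int = 500, seed: int = 0):
    """X: (n, d) points. Returns (theta, alpha) for the hyperplane x . theta = alpha
    whose projected kernel density has the highest mode."""
    rng = np.random.default_rng(seed)
    best = (-np.inf, None, None)
    for _ in range(n_trials):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)         # random direction on the unit sphere
        proj = X @ theta                       # 1-D projections of all points
        kde = gaussian_kde(proj)               # bandwidth from data: no user threshold
        grid = np.linspace(proj.min(), proj.max(), 256)
        dens = kde(grid)
        k = int(np.argmax(dens))
        if dens[k] > best[0]:
            best = (dens[k], theta, grid[k])
    _, theta, alpha = best
    return theta, alpha
```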
{"title":"Robust regression with projection based M-estimators","authors":"Haifeng Chen, P. Meer","doi":"10.1109/ICCV.2003.1238441","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238441","url":null,"abstract":"The robust regression techniques in the RANSAC family are popular today in computer vision, but their performance depends on a user supplied threshold. We eliminate this drawback of RANSAC by reformulating another robust method, the M-estimator, as a projection pursuit optimization problem. The projection based pbM-estimator automatically derives the threshold from univariate kernel density estimates. Nevertheless, the performance of the pbM-estimator equals or exceeds that of RANSAC techniques tuned to the optimal threshold, a value which is never available in practice. Experiments were performed both with synthetic and real data in the affine motion and fundamental matrix estimation tasks.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123929813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Counting people in crowds with a real-time network of simple image sensors
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238325
Danny B. Yang, H. González-Baños, L. Guibas
Estimating the number of people in a crowded environment is a central task in civilian surveillance. Most vision-based counting techniques depend on detecting individuals in order to count them, an unrealistic proposition in crowded settings. We propose an alternative approach that estimates the number of people directly. In our system, groups of image sensors segment foreground objects from the background, aggregate the resulting silhouettes over a network, and compute a planar projection of the scene's visual hull. We introduce a geometric algorithm that calculates bounds on the number of persons in each region of the projection after phantom regions have been eliminated. The computational requirements scale well with the number of sensors and the number of people, and only limited amounts of data are transmitted over the network. Because of these properties, our system runs in real time and can be deployed as an untethered wireless sensor network. We describe the major components of the system and report preliminary experiments with our first prototype implementation.
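A sketch of the aggregation step only, under the assumption that each sensor already reports a binary occupancy map on a common ground plane: intersecting the maps approximates the planar visual hull projection, and labeling connected regions yields the regions for which the paper's geometric algorithm then computes count bounds. Phantom elimination and the bound computation itself are not shown.

```python
import numpy as np
from scipy import ndimage

def hull_projection(occupancy_maps: np.ndarray):
    """occupancy_maps: (n_sensors, H, W) booleans, each the projection of one
    sensor's silhouette cones onto the ground plane. A ground cell can contain
    a person only if EVERY sensor's cone covers it, so AND the maps."""
    hull = occupancy_maps.all(axis=0)
    labels, n_regions = ndimage.label(hull)
    # After phantom removal, each surviving region holds at least one person,
    # so n_regions serves as a crude lower bound on the crowd count.
    return labels, n_regions
```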
{"title":"Counting people in crowds with a real-time network of simple image sensors","authors":"Danny B. Yang, H. González-Baños, L. Guibas","doi":"10.1109/ICCV.2003.1238325","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238325","url":null,"abstract":"Estimating the number of people in a crowded environment is a central task in civilian surveillance. Most vision-based counting techniques depend on detecting individuals in order to count, an unrealistic proposition in crowded settings. We propose an alternative approach that directly estimates the number of people. In our system, groups of image sensors segment foreground objects from the background, aggregate the resulting silhouettes over a network, and compute a planar projection of the scene's visual hull. We introduce a geometric algorithm that calculates bounds on the number of persons in each region of the projection, after phantom regions have been eliminated. The computational requirements scale well with the number of sensors and the number of people, and only limited amounts of data are transmitted over the network. Because of these properties, our system runs in real-time and can be deployed as an untethered wireless sensor network. We describe the major components of our system, and report preliminary experiments with our first prototype implementation.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120951681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multiple-view structure and motion from line correspondences
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238342
A. Bartoli, P. Sturm
We address the problem of camera motion and structure reconstruction from line correspondences across multiple views, from initialization to final bundle adjustment. One of the main difficulties in dealing with line features is their algebraic representation. First, we consider the triangulation problem. Using Plücker coordinates to represent the lines, we propose a maximum likelihood algorithm relying on linearizing the Plücker constraint and on a Plücker correction procedure that computes the Plücker coordinates closest to a given 6-vector. Second, we consider the bundle adjustment problem. Previous overparameterizations of 3D lines induce gauge freedoms and/or internal consistency constraints. We propose the orthonormal representation, which allows convenient nonlinear optimization of 3D lines with the minimal four parameters in an unconstrained nonlinear optimizer. We compare our algorithms with existing ones on simulated and real data.
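Below is a sketch of the Plücker machinery the paper relies on: building the coordinates of a line through two points, and one simple Lagrangian projection onto the Klein quadric {d . m = 0} as a way to realize a "Plücker correction" for a noisy 6-vector. The paper's own closed-form correction procedure may differ from this projection.

```python
import numpy as np

def plucker_from_points(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """6-vector (d, m): direction d = B - A, moment m = A x B.
    Satisfies the Plucker constraint d . m = 0 by construction."""
    return np.concatenate([B - A, np.cross(A, B)])

def plucker_correction(L: np.ndarray) -> np.ndarray:
    """Find the 6-vector on the quadric {d . m = 0} closest to L, by
    minimizing ||d - d0||^2 + ||m - m0||^2 with a Lagrange multiplier lam."""
    d0, m0 = L[:3], L[3:]
    c = float(d0 @ m0)
    if abs(c) < 1e-12:
        return L.copy()                       # already on the quadric
    a, b = float(d0 @ d0), float(m0 @ m0)
    # Stationarity gives d = (d0 - lam*m0)/(1 - lam^2), m = (m0 - lam*d0)/(1 - lam^2);
    # enforcing d . m = 0 yields the quadratic  c*lam^2 - (a + b)*lam + c = 0,
    # whose roots multiply to 1; take the one with |lam| < 1 (smallest change).
    disc = np.sqrt((a + b) ** 2 - 4.0 * c * c)
    lam = ((a + b) - disc) / (2.0 * c)
    d = (d0 - lam * m0) / (1.0 - lam ** 2)
    m = (m0 - lam * d0) / (1.0 - lam ** 2)
    return np.concatenate([d, m])
```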
{"title":"Multiple-view structure and motion from line correspondences","authors":"A. Bartoli, P. Sturm","doi":"10.1109/ICCV.2003.1238342","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238342","url":null,"abstract":"We address the problem of camera motion and structure reconstruction from line correspondences across multiple views, from initialization to final bundle adjustment. One of the main difficulties when dealing with line features is their algebraic representation. First, we consider the triangulation problem. Based on Plucker coordinates to represent the lines, we propose a maximum likelihood algorithm, relying on linearising the Plucker constraint, and on a Plucker correction procedure to compute the closest Plucker coordinates to a given 6-vector. Second, we consider the bundle adjustment problem. Previous overparameterizations of 3D lines induce gauge freedoms and/or internal consistency constraints. We propose the orthonormal representation, which allows handy nonlinear optimization of 3D lines using the minimum 4 parameters, within an unconstrained nonlinear optimizer. We compare our algorithms to existing ones on simulated and real data.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121293888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Good continuations in digital image level lines
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238380
F. Cao
We propose a probabilistic algorithm that detects curves that are unexpectedly smooth within a set of digital curves. The only parameter is a false alarm rate, which influences the detection only through its logarithm. We apply this good-continuation criterion to image level lines. One conclusion is that, in accordance with Gestalt theory, edges can be detected in a way that is largely independent of contrast. We also use the same kind of method to detect corners and junctions.
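A toy version of the a-contrario reasoning, with our own simplified background model: measure the maximal turn angle theta along a discrete curve of n points; if successive turn angles were independent and uniform in (-pi, pi], the probability that all n - 2 of them stay below theta is (theta/pi)^(n-2). A curve is detected as an unexpectedly good continuation when the expected number of such events among all tested curves (the NFA) falls below the false alarm rate epsilon, so only log(epsilon) matters.

```python
import numpy as np

def max_turn_angle(curve: np.ndarray) -> float:
    """Largest absolute turn angle along a polygonal curve of shape (n, 2)."""
    v = np.diff(curve, axis=0)
    ang = np.arctan2(v[:, 1], v[:, 0])
    turns = np.angle(np.exp(1j * np.diff(ang)))   # wrap differences to (-pi, pi]
    return float(np.abs(turns).max())

def nfa(curve: np.ndarray, n_curves_tested: int) -> float:
    """Expected number of curves at least this smooth under the background model."""
    theta = max_turn_angle(curve)
    return n_curves_tested * (theta / np.pi) ** (len(curve) - 2)

def is_good_continuation(curve, n_curves_tested, eps=1.0) -> bool:
    return nfa(curve, n_curves_tested) < eps      # eps is the only tuning knob
```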
{"title":"Good continuations in digital image level lines","authors":"F. Cao","doi":"10.1109/ICCV.2003.1238380","DOIUrl":"https://doi.org/10.1109/ICCV.2003.1238380","url":null,"abstract":"We propose a probabilistic algorithm able to detect the curves that are unexpectedly smooth in a set of digital curves. The only parameter is a false alarm rate, influencing the detection only by its logarithm. We experiment the good continuation criterion on image level lines. One of the conclusion is that, accordingly to Gestalt theory, one can detect edges in a way that is widely independent of contrast. We also use the same kind of method to detect corners and junctions.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114556416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}