This paper describes a new face recognition method that uses projection-based 3D normalization and shading subspace orthogonalization to handle variation in facial pose and illumination. The proposed method does not require reconstruction or re-illumination of a personalized 3D model, so it avoids these troublesome steps and the recognition process runs quickly. Facial size and pose, including out-of-plane rotation, are normalized to a generic 3D model from a single still image, and the input subspace is generated from perturbed cropped patterns in order to absorb localization errors. Furthermore, by exploiting the fact that a normalized pattern is fitted to the generic 3D model, illumination-robust features are extracted through shading subspace orthogonalization. Evaluation experiments on several databases show the effectiveness of our method under various facial poses and illuminations.
{"title":"Face recognition by projection-based 3D normalization and shading subspace orthogonalization","authors":"Tatsuo Kozakaya, Osamu Yamaguchi","doi":"10.1109/FGR.2006.43","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
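The perturbed-crop idea in the abstract above can be illustrated with a small sketch: crops are sampled around the nominal facial location and their principal directions span the input subspace, so small localization errors are absorbed. The crop geometry, shift range, and subspace dimension below are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def perturbed_crop_subspace(image, top, left, size, max_shift=2, n_basis=5):
    """Build an input subspace from crops perturbed around a nominal
    location, so small localization errors are absorbed (sketch)."""
    crops = []
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            y, x = top + dy, left + dx
            crops.append(image[y:y + size, x:x + size].ravel().astype(float))
    X = np.array(crops)
    X -= X.mean(axis=0)                  # center the cropped patterns
    # Principal directions of the perturbed patterns span the input subspace.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_basis]                  # orthonormal basis vectors, as rows

# toy usage: 5x5 crops from a random 32x32 stand-in for a normalized face pattern
rng = np.random.default_rng(0)
basis = perturbed_crop_subspace(rng.random((32, 32)), top=10, left=10, size=5)
print(basis.shape)  # (5, 25)
```

Matching an input subspace built this way against gallery subspaces is then a subspace-to-subspace comparison rather than a single-pattern comparison.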
Traditionally, motion estimation and segmentation have been performed mostly in the spatial domain, i.e., using the luminance information in the video sequence. The frequency-domain representation offers an alternative, rich source of motion information, which has been used only to a very limited extent in the past, and on relatively simple problems such as image registration. We review our work over the last few years on an approach to video motion analysis that combines spatial- and Fourier-domain information. We review our methods for (1) basic (translation and rotation) motion estimation and segmentation for multiple moving objects, with constant as well as time-varying velocities; and (2) more complicated motions, such as periodic motion, and periodic motion superposed on translation. The joint-space analysis leads to more compact and computationally efficient solutions than existing techniques.
{"title":"Joint spatial and frequency domain motion analysis","authors":"N. Ahuja, A. Briassouli","doi":"10.1109/FGR.2006.68","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
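The simplest illustration of Fourier-domain motion information is phase correlation for translation estimation: a shift in the image plane becomes a linear phase in the spectrum, and the peak of the inverse transform of the normalized cross-power spectrum gives the displacement. This standard technique is a baseline illustration, not the authors' specific formulation.

```python
import numpy as np

def phase_correlation(f, g):
    """Estimate the (circular) translation between two frames via the
    normalized cross-power spectrum; the correlation peak gives the shift."""
    F, G = np.fft.fft2(f), np.fft.fft2(g)
    R = F * np.conj(G)
    R /= np.abs(R) + 1e-12               # keep only the phase difference
    corr = np.real(np.fft.ifft2(R))      # a delta at the displacement
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx

rng = np.random.default_rng(1)
frame = rng.random((64, 64))
moved = np.roll(frame, shift=(5, 12), axis=(0, 1))  # simulate pure translation
print(phase_correlation(moved, frame))  # (5, 12)
```

For multiple moving objects, as in the reviewed work, the spectrum contains a superposition of such phase terms, which is what makes the joint spatial-frequency analysis necessary.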
Gabor-filter-based features, with their good properties of space-frequency localization and orientation selectivity, currently appear to be the most effective features for face recognition. In this paper, we propose weighted Gabor complex features (WGCF), which combine Gabor magnitude and phase features in unitary space. The weights are determined from the recognition rates of the magnitude and phase features. Meanwhile, the subspace-based algorithms PCA and LDA are generalized to unitary space, and a rarely used distance measure, the unitary-space cosine distance, is adopted for unitary-subspace-based recognition. Using the generalized subspace algorithms, the proposed WGCF produce better recognition results than either Gabor magnitude or Gabor phase features alone. Experiments on the FERET database show results comparable to the best reported in the literature.
{"title":"Weighted Gabor features in unitary space for face recognition","authors":"Yong Gao, Yangsheng Wang, Xinshan Zhu, Xuetao Feng, Xiaoxu Zhou","doi":"10.1109/FGR.2006.111","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
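A minimal sketch of working in unitary (complex) space follows. The particular encoding of magnitude and phase into one complex vector is an assumption for illustration (the paper's exact combination rule is not reproduced here); the unitary-space cosine distance, however, is the standard complex inner-product form.

```python
import numpy as np

def weighted_complex_features(mag, phase, w_m, w_p):
    """Pack weighted Gabor magnitude and phase responses into one complex
    vector (illustrative encoding; in the paper the weights come from the
    recognition rates of the two feature types)."""
    return w_m * mag + 1j * w_p * phase

def unitary_cosine(x, y):
    """Cosine similarity in unitary space: |<x, y>| / (|x| |y|),
    using the complex (Hermitian) inner product."""
    return np.abs(np.vdot(x, y)) / (np.linalg.norm(x) * np.linalg.norm(y))

mag = np.array([1.0, 0.8, 0.3])
phase = np.array([0.2, -1.1, 0.7])
z = weighted_complex_features(mag, phase, w_m=0.7, w_p=0.3)
print(unitary_cosine(z, z))  # ≈ 1.0 for a vector against itself
```

PCA and LDA generalize to such vectors by replacing transposes with conjugate transposes in the scatter matrices, which is the generalization the abstract refers to.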
In this work, we present a method that simultaneously tracks 3D head movements and facial actions, such as lip and eyebrow movements, in a video sequence. In a baseline framework, an adaptive appearance model is estimated online from a monocular video sequence. This method uses a 3D model of the face and an adaptive facial texture model. We then consider and compare two improved models designed to increase robustness to occlusions. The first uses robust statistics to downweight hidden regions or outlier pixels; the second uses mixture models, which provide better handling of occlusions. Experiments demonstrate the benefit of the two robust models, which are compared under various occlusions.
{"title":"Head and facial action tracking: comparison of two robust approaches","authors":"R. Hérault, F. Davoine, Yves Grandvalet","doi":"10.1109/FGR.2006.63","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
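The robust-statistics idea (downweighting occluded or outlier pixels) can be sketched with a standard Huber-style weight function applied to the appearance residuals; the specific M-estimator and tuning constant here are conventional choices, not necessarily those used in the paper.

```python
import numpy as np

def huber_weights(residuals, k=1.345):
    """Per-pixel robust weights: 1 inside the band |r| <= k*sigma, decaying
    as k*sigma/|r| outside it, so occluded pixels influence the fit less.
    sigma is a robust scale estimate (MAD, scaled for consistency)."""
    sigma = 1.4826 * np.median(np.abs(residuals - np.median(residuals)))
    a = np.abs(residuals) / max(k * sigma, 1e-12)
    return np.minimum(1.0, 1.0 / np.maximum(a, 1e-12))

# inlier residuals get full weight; the gross outlier is strongly downweighted
w = huber_weights(np.array([0.1, -0.1, 0.2, -0.2, 5.0]))
print(w)
```

In the tracker, these weights would multiply each pixel's contribution when updating the adaptive texture model, which is what keeps occlusions from corrupting the appearance estimate.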
Active statistical models, including active shape models and active appearance models, are very powerful for face alignment. They are composed of two parts: the subspace model(s) and the search process. While these two parts are closely correlated, existing efforts have treated them separately and have not considered how to optimize them as a whole. Another problem with the subspace model(s) is that the two kinds of subspace parameters (the number of components and the constraints on the components) are also treated separately, so they are not jointly optimized. To tackle these two problems, a unified subspace optimization method is proposed. This method comprises two aspects of unification: (1) unification of the statistical model and the search process, so that the subspace models are optimized according to the search procedure; and (2) unification of the number of components and the constraints, so that the two kinds of parameters are modeled in a unified way and can be optimized jointly. Experimental results demonstrate that our method can effectively find the optimal subspace model and significantly improve performance.
{"title":"Face Alignment with Unified Subspace Optimization of Active Statistical Models","authors":"Ming Zhao, Tat-Seng Chua","doi":"10.1109/FGR.2006.40","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
In this paper, a robust 3D dance posture recognition system using two cameras is proposed. A pair of wide-baseline video cameras with approximately orthogonal viewing directions is used to reduce pose-recognition ambiguities. Silhouettes extracted from the two views are represented using Gaussian mixture models (GMM) and used as features for recognition. A relevance vector machine (RVM) is deployed for robust pose recognition. The proposed system is trained on synthesized silhouettes created using animation software and motion capture data. Experimental results on synthetic and real images show that the proposed approach recognizes 3D postures effectively. In addition, the system is easy to set up, with no need for precise camera calibration.
{"title":"Dance posture recognition using wide-baseline orthogonal stereo cameras","authors":"Feng Guo, G. Qian","doi":"10.1109/FGR.2006.35","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
Non-linear subspaces derived using kernel methods have been found to be superior to linear subspaces in modeling or classification tasks for several visual phenomena. Such kernel methods include kernel PCA, kernel DA, kernel SVD and kernel QR. Since incremental computation algorithms for these methods do not yet exist, their practicality on large datasets or in online video processing is minimal. We propose an approximate incremental kernel SVD algorithm for computer vision applications that require estimation of non-linear subspaces, specifically face recognition by matching image sets obtained through long-term observations or video recordings. We extend a well-known linear subspace updating algorithm to the non-linear case by utilizing the kernel trick, and apply a reduced-set construction method to produce sparse expressions for the derived subspace basis so as to maintain constant processing speed and memory usage. Experimental results demonstrate the effectiveness of the proposed method.
{"title":"Incremental kernel SVD for face recognition with image sets","authors":"Tat-Jun Chin, K. Schindler, D. Suter","doi":"10.1109/FGR.2006.67","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
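The "well-known linear subspace updating algorithm" the abstract builds on is the standard rank-one SVD update: project the new column onto the current basis, form a small core matrix from the retained singular values and the residual, and re-diagonalize it. A minimal linear sketch (ignoring the right singular vectors, which face matching does not need) looks like this; the kernel version replaces the explicit inner products with kernel evaluations.

```python
import numpy as np

def incremental_svd(U, s, c):
    """Append one data column c to an existing SVD with left basis U and
    singular values s, returning the updated basis and singular values."""
    p = U.T @ c                          # coefficients of c inside the subspace
    r = c - U @ p                        # residual orthogonal to the subspace
    rho = np.linalg.norm(r)
    # Small (k+1) x (k+1) core matrix whose SVD rotates the old basis.
    K = np.zeros((len(s) + 1, len(s) + 1))
    K[:len(s), :len(s)] = np.diag(s)
    K[:len(s), -1] = p
    K[-1, -1] = rho
    Uk, s_new, _ = np.linalg.svd(K)
    q = (r / rho) if rho > 1e-12 else r  # new orthogonal direction (if any)
    U_new = np.hstack([U, q[:, None]]) @ Uk
    return U_new, s_new

# sanity usage: the update must agree with a batch SVD of the augmented data
rng = np.random.default_rng(2)
A, c = rng.random((6, 3)), rng.random(6)
U, s, _ = np.linalg.svd(A, full_matrices=False)
U2, s2 = incremental_svd(U, s, c)
```

Truncating `U_new` and `s_new` back to a fixed rank after each update is what keeps processing time and memory constant, which in the kernel setting additionally requires the reduced-set construction the authors describe.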
Illumination invariance remains the most researched, yet most challenging, aspect of automatic face recognition. In this paper we propose a novel, general recognition framework for efficient matching of individual face images, sets or sequences. The framework is based on simple image processing filters that compete with the unprocessed greyscale input to yield a single matching score between individuals. It is shown how the discrepancy between the illumination conditions of novel input and of the training data set can be estimated and used to weight the contributions of the two competing representations. We describe an extensive empirical evaluation of the proposed method on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation. On this challenging data set our algorithm consistently demonstrated a dramatic performance improvement over traditional filtering approaches. We demonstrate a reduction of 50-75% in recognition error rates, with the best-performing method-filter combination correctly recognizing 96% of the individuals.
{"title":"A new look at filtering techniques for illumination invariance in automatic face recognition","authors":"Ognjen Arandjelovic, R. Cipolla","doi":"10.1109/FGR.2006.14","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
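The competition between the filtered and raw-greyscale representations can be sketched as a discrepancy-weighted blend of their matching scores. The logistic weighting below is an illustrative assumption; the paper's actual estimator for the illumination discrepancy and its weighting function are not reproduced here.

```python
import numpy as np

def fused_score(raw_score, filtered_score, discrepancy, scale=1.0):
    """Blend the raw-greyscale and filtered matching scores: the more the
    input's illumination differs from the training data, the more weight
    the illumination-normalizing filter receives (illustrative sketch)."""
    alpha = 1.0 / (1.0 + np.exp(-scale * discrepancy))  # logistic weight in (0, 1)
    return alpha * filtered_score + (1.0 - alpha) * raw_score

# with zero discrepancy the two scores are blended equally;
# with a large discrepancy the filtered score dominates
print(fused_score(0.2, 0.9, discrepancy=0.0))
print(fused_score(0.2, 0.9, discrepancy=10.0))
```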
Estimating 3D head pose accurately in low-resolution video is a challenging vision task because it is difficult to find a continuous one-to-one mapping from a person-independent low-resolution visual representation to head pose parameters. We propose to track head pose by modeling the shape-free facial textures acquired from the video with subspace learning techniques. In particular, we model the facial appearance variations online with an incremental weighted PCA subspace with a forgetting mechanism, and we perform tracking in an annealed particle filtering framework. Experiments show that our approach outperforms past visual face tracking algorithms in tracking accuracy, especially on low-resolution videos.
{"title":"Accurate Head Pose Tracking in Low Resolution Video","authors":"J. Tu, Thomas S. Huang, Hai Tao","doi":"10.1109/FGR.2006.19","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}
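An appearance subspace with a forgetting mechanism can be sketched by decaying the running mean and covariance geometrically before each update and re-extracting the leading eigenvectors. This covariance-based form is a simple stand-in chosen for clarity, not the authors' exact incremental weighted PCA update.

```python
import numpy as np

class ForgettingPCA:
    """Running appearance model: old observations decay geometrically
    (forgetting factor f < 1), and the subspace is refreshed from the
    decayed covariance at each frame (illustrative sketch)."""
    def __init__(self, dim, f=0.95):
        self.f = f
        self.mean = np.zeros(dim)
        self.cov = np.zeros((dim, dim))

    def update(self, x, n_basis=3):
        self.mean = self.f * self.mean + (1 - self.f) * x
        d = x - self.mean
        self.cov = self.f * self.cov + (1 - self.f) * np.outer(d, d)
        vals, vecs = np.linalg.eigh(self.cov)     # ascending eigenvalues
        return vecs[:, ::-1][:, :n_basis]         # leading eigenvectors

model = ForgettingPCA(dim=5)
rng = np.random.default_rng(3)
for _ in range(10):
    B = model.update(rng.random(5))               # per-frame texture vector
print(B.shape)  # (5, 3)
```

In practice the texture vectors would be the shape-free face patches warped from each frame, and the recovered basis feeds the likelihood inside the annealed particle filter.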
In the PLAYBOT project, we aim to assist disabled children at play. To this end, we are developing a semi-autonomous robotic wheelchair. It is equipped with several visual sensors and a robotic manipulator, and thus conveniently extends the innate capabilities of a disabled child. In addition to a touch screen, the child may control the wheelchair using simple head movements. Since control based on head posture requires reliable face detection and head pose recognition, we need a robust technique that can effortlessly be tailored to individual users. In this paper, we present a multilinear classification algorithm for fast and reliable face detection. It trains within seconds and thus can easily be customized to the home environment of a disabled child. Subsequent head pose recognition is done using support vector machines. Experimental results show that this two-stage approach to head-pose-based robotic wheelchair control is fast and very robust.
{"title":"Fast learning for customizable head pose recognition in robotic wheelchair control","authors":"C. Bauckhage, Thomas Käster, Andrei M. Rotenstein","doi":"10.1109/FGR.2006.52","journal":{"name":"7th International Conference on Automatic Face and Gesture Recognition (FGR06)"},"publicationDate":"2006-04-10"}