Towards automatic face identification robust to ageing variation
A. Lanitis, C. Taylor
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840664
A large number of high-performance automatic face recognition systems have been reported in the literature. Many of them are robust to within-class appearance variation of subjects, such as variation in expression, lighting and pose. However, most face identification systems developed so far are sensitive to changes in the age of individuals. We present experimental results showing that the performance of automatic face recognition systems depends on the age difference of subjects between the training and test images. We also demonstrate that automatic age simulation techniques can be used for designing face recognition systems robust to ageing variation. In this context, the perceived age of the subjects in the training and test images is modified before the training and classification procedures, so that ageing variation is eliminated. Experimental results demonstrate that the performance of our face recognition system improves significantly when this approach is adopted.
Learning support vectors for face verification and recognition
K. Jonsson, J. Kittler, Yongping Li, Jiri Matas
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840636
The paper studies support vector machines (SVMs) in the context of face verification and recognition. Our study supports the hypothesis that the SVM approach is able to extract the relevant discriminatory information from the training data, and we present results showing superior performance in comparison with benchmark methods. However, when the representation space already captures and emphasises the discriminatory information (e.g. Fisher's linear discriminant), SVMs lose their superiority. The results also indicate that SVMs are robust against changes in illumination, provided these are adequately represented in the training data. The proposed system is evaluated on a large database of 295 people, obtaining highly competitive results: an equal error rate of 1% for verification and a rank-one error rate of 2% for recognition (i.e. 98% correct rank-one recognition).
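The verification decision described above (score a probe against a client model, accept when the score is on the client side) can be sketched with a minimal linear SVM trained by Pegasos-style sub-gradient descent. This is a pure-Python stand-in, not the authors' system: the feature vectors and data are invented for illustration, the bias term is omitted for simplicity, and the paper's kernel machines and Fisher-space experiments are not reproduced.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=100):
    """Pegasos-style sub-gradient training of a bias-free linear SVM.
    X: list of feature vectors; y: labels in {-1, +1}."""
    random.seed(0)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in random.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            score = sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink weights (regularisation), then correct margin violations.
            w = [(1 - eta * lam) * wj for wj in w]
            if y[i] * score < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def verify(w, x):
    """Accept the claimed identity when the SVM score is positive."""
    return sum(wj * xj for wj, xj in zip(w, x)) > 0

# Toy 2D "face features": client samples vs. impostor samples (invented).
clients = [[2.0, 2.2], [2.5, 1.9], [1.8, 2.4]]
impostors = [[-1.0, -1.2], [-1.5, -0.8], [-0.9, -1.6]]
w = train_linear_svm(clients + impostors, [1, 1, 1, -1, -1, -1])
print(verify(w, [2.1, 2.0]))    # client-like probe
print(verify(w, [-1.2, -1.0]))  # impostor-like probe
```

In practice the maximum-margin property, rather than this particular solver, is what the paper credits with extracting the discriminatory information.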
Virtual view face image synthesis using 3D spring-based face model from a single image
G. Feng, P. Yuen, J. Lai
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840685
It is known that 2D views of a person can be synthesised if a 3D face model of that person is available. This paper proposes a new method, called the 3D spring-based face model (SBFM), to determine the precise face model of a person with different poses and facial expressions from a single image. The SBFM combines the concept of the generic 3D face model from computer graphics with that of the deformable template from computer vision. Face image databases from the MIT AI Laboratory and Yale University are used to test the proposed method, and the results are encouraging.
Real-time stereo tracking for head pose and gaze estimation
R. Newman, Y. Matsumoto, S. Rougeaux, A. Zelinsky
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840622
Computer systems which analyse human face/head motion have attracted significant attention recently, as there are a number of interesting and useful applications. Not least among these is the goal of tracking the head in real time. A useful extension of this problem is to estimate the subject's gaze point in addition to his/her head pose. This paper describes a real-time stereo vision system which determines the head pose and gaze direction of a human subject. Its accuracy makes it useful for a number of applications including human/computer interaction, consumer research and ergonomic assessment.
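The abstract gives no equations, but any rectified-stereo tracker of this kind ultimately rests on standard triangulation: depth follows from disparity as Z = fB/d. A minimal sketch under the pinhole model, with invented camera numbers (the actual system's calibration and feature tracking are far richer):

```python
def triangulate(xl, xr, y, f, baseline):
    """Recover a 3D point from matched left/right image coordinates in a
    rectified stereo pair (pinhole model; image coordinates in pixels,
    measured from the principal point)."""
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("point must lie in front of both cameras")
    Z = f * baseline / disparity       # depth from disparity
    X = xl * Z / f                     # back-project through the left camera
    Y = y * Z / f
    return X, Y, Z

# Hypothetical numbers: 500 px focal length, 10 cm baseline,
# a facial feature seen at x=60 px (left) and x=40 px (right).
X, Y, Z = triangulate(60.0, 40.0, 15.0, 500.0, 0.10)
print(Z)  # depth in metres
```

Tracking three or more such rigid facial features over time is enough to recover the full 6-DOF head pose, with gaze estimated relative to it.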
Hand gesture recognition using input-output hidden Markov models
S. Marcel, O. Bernier, J. Viallet, D. Collobert
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840674
A new hand gesture recognition method based on input-output hidden Markov models is presented. This method deals with the dynamic aspects of gestures: gestures are extracted from a sequence of video images by tracking the skin-color blobs corresponding to the hand in a body-face space centered on the face of the user. Our goal is to recognize two classes of gestures: deictic and symbolic.
Viewpoint-invariant learning and detection of human heads
Markus Weber, W. Einhäuser, M. Welling, P. Perona
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840607
We present a method to learn models of human heads for the purpose of detection from different viewing angles. We focus on a model where objects are represented as constellations of rigid features (parts). Variability is represented by a joint probability density function (PDF) on the shape of the constellation. In the first stage, the method automatically identifies distinctive features in the training set using an interest operator followed by vector quantization. The set of model parameters, including the shape PDF, is then learned using expectation maximization. Experiments show good generalization performance to novel viewpoints and unseen faces. Performance is above 90% correct with less than 1 s computation time per image.
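The vector-quantization step described above (clustering interest-point patches into a discrete part vocabulary) is commonly realised with k-means. A toy pure-Python sketch, with invented 2D "patch descriptors" standing in for real image patches:

```python
def kmeans(points, k, iters=20):
    """Plain k-means: the vector-quantisation step that turns raw
    interest-point patches into a discrete feature vocabulary."""
    centers = points[:k]  # naive init: first k points
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Move each center to the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = [sum(col) / len(cl) for col in zip(*cl)]
    return centers

# Toy "patch descriptors": two well-separated 2D clumps (invented data).
patches = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers = kmeans(patches, 2)
print(sorted(centers))
```

Each training patch is then replaced by the index of its nearest center, and the EM stage fits the shape PDF over constellations of those indexed parts.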
SFS based view synthesis for robust face recognition
Wenyi Zhao, R. Chellappa
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840648
Sensitivity to variations in pose is a challenging problem in face recognition using appearance-based methods. More specifically, the appearance of a face changes dramatically when viewing and/or lighting directions change. Various approaches have been proposed to solve this difficult problem. They can be broadly divided into three classes: (1) multiple image-based methods, where multiple images of various poses per person are available; (2) hybrid methods, where multiple example images are available during learning but only one database image per person is available during recognition; and (3) single image-based methods, where no example-based learning is carried out. We present a method that falls into class (3). This method, based on shape-from-shading (SFS), improves the performance of a face recognition system in handling variations due to pose and illumination via image synthesis.
Memory-based face recognition for visitor identification
T. Sim, R. Sukthankar, M. D. Mullin, S. Baluja
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840637
We show that a simple, memory-based technique for appearance-based face recognition, motivated by the real-world task of visitor identification, can outperform more sophisticated algorithms that use principal components analysis (PCA) and neural networks. This technique is closely related to correlation templates; however, we show that the use of novel similarity measures greatly improves performance. We also show that augmenting the memory base with additional, synthetic face images results in further improvements in performance. Results of extensive empirical testing on two standard face recognition datasets are presented, and direct comparisons with published work show that our algorithm achieves comparable (or superior) results. Our system is incorporated into an automated visitor identification system that has been operating successfully in an outdoor environment since January 1999.
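A minimal illustration of the memory-based idea: store the enrolled images verbatim and return the label of the most similar one under normalized correlation, the correlation-template similarity the abstract mentions. The paper's "novel similarity measures" are refinements of this baseline, and the toy pixel vectors below are invented:

```python
import math

def normalized_correlation(a, b):
    """Correlation of two mean-subtracted, unit-norm vectors: invariant
    to affine brightness changes, the classic template-matching score."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = (math.sqrt(sum(x * x for x in da))
           * math.sqrt(sum(y * y for y in db)))
    return num / den

def identify(probe, gallery):
    """Memory-based recognition: label of the most similar stored image."""
    return max(gallery,
               key=lambda label: normalized_correlation(probe, gallery[label]))

# Toy 4-pixel "images" (invented): two enrolled visitors.
gallery = {"alice": [10, 200, 30, 180], "bob": [200, 10, 180, 30]}
probe = [12, 190, 35, 170]  # a brighter/darker rendition of alice's pattern
print(identify(probe, gallery))
```

Because nothing is trained, enrolling a new visitor is just an insertion into the gallery, which suits the always-on deployment the abstract describes.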
A probabilistic sensor for the perception of activities
Olivier Chomat, J. Crowley
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840652
This paper presents a new technique for the perception of activities using a statistical description of spatio-temporal properties. With this approach, the probability of an activity in a spatio-temporal image sequence is computed by applying the Bayes rule to the joint statistics of the responses of motion energy receptive fields. A set of motion energy receptive fields is designed to sample the power spectrum of a moving texture. Their structure relates to the spatio-temporal energy models of Adelson and Bergen, in which measures of local visual motion information are extracted by comparing the outputs of a triad of Gabor energy filters. The probability density function required for the Bayes rule is then estimated for each class of activity by computing multi-dimensional histograms of the outputs of the set of receptive fields. The perception of activities is achieved according to the Bayes rule: the result at a given time is the map of the conditional probabilities that each pixel belongs to an activity of the training set. The approach is validated with experiments on the perception of activities of walking persons in a visual surveillance scenario. Results are robust to changes in illumination conditions, to occlusions and to changes in texture.
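The histogram-plus-Bayes pipeline can be sketched in a few lines: estimate p(response | class) per activity as a normalised histogram of receptive-field outputs, then apply Bayes' rule with equal priors to get a per-class posterior for each new response. The scalar responses below are invented; the paper builds multi-dimensional histograms over a whole filter bank:

```python
def histogram(samples, bins, lo, hi):
    """Normalised histogram: an estimate of p(response | class)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        j = min(bins - 1, max(0, int((s - lo) / width)))
        counts[j] += 1
    return [c / len(samples) for c in counts]

def posterior(response, class_hists, bins=8, lo=0.0, hi=1.0):
    """Bayes rule with equal priors: p(class | response) is proportional
    to the class-conditional likelihood read off each histogram."""
    width = (hi - lo) / bins
    j = min(bins - 1, max(0, int((response - lo) / width)))
    likelihoods = {c: h[j] for c, h in class_hists.items()}
    z = sum(likelihoods.values()) or 1.0
    return {c: l / z for c, l in likelihoods.items()}

# Invented training responses of one motion-energy filter:
# "walking" tends to give high energy at a pixel, "static" low.
hists = {
    "walking": histogram([0.7, 0.8, 0.75, 0.9, 0.85], 8, 0.0, 1.0),
    "static": histogram([0.05, 0.1, 0.15, 0.1, 0.2], 8, 0.0, 1.0),
}
post = posterior(0.8, hists)
print(max(post, key=post.get))
```

Evaluating the posterior at every pixel yields exactly the conditional-probability map the abstract describes.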
Face shape extraction and recognition using 3D morphing and distance mapping
Chongzhen Zhang, F. Cohen
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840608
We describe a novel approach for creating a 3D face structure from multiple image views of a human face taken at a priori unknown poses by appropriately morphing a generic 3D face. A 3D cubic explicit polynomial is used to morph a generic face into the specific face structure. This allows the creation of a database of 3D faces that is used in identifying a person (in the database) from one or more arbitrary image view(s). The estimation of a 3D person's face and its recognition from the database of faces is achieved through the use of a distance map metric. The use of this metric avoids either resorting to the formidable task of establishing feature point correspondences in the image views, or even more severely, relying on the extremely view-sensitive image intensity (texture). Experimental results are shown for images of real faces, and excellent results are obtained.
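The distance-map idea can be illustrated on a small grid: compute a distance transform of a reference shape, then score a candidate point set by its average map value, with lower scores meaning a closer match (a chamfer-style comparison). The tiny binary "contour" below is invented, and the paper's actual metric over 3D surfaces may differ in detail:

```python
from collections import deque

def distance_map(grid):
    """Grid distance transform via multi-source BFS: each cell receives its
    4-connected distance to the nearest 'on' cell of the reference shape."""
    h, w = len(grid), len(grid[0])
    dist = [[None] * w for _ in range(h)]
    q = deque()
    for r in range(h):
        for c in range(w):
            if grid[r][c]:
                dist[r][c] = 0
                q.append((r, c))
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist

def chamfer_score(points, dist):
    """Average map distance of a candidate point set: low = close match.
    No point-to-point correspondences are ever established."""
    return sum(dist[r][c] for r, c in points) / len(points)

# A tiny reference shape and two candidate point sets (invented).
grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
dmap = distance_map(grid)
print(chamfer_score([(1, 1), (2, 2)], dmap))  # points on the shape
print(chamfer_score([(0, 0), (3, 3)], dmap))  # points off the shape
```

Because the score depends only on where candidate points land in the precomputed map, it sidesteps both explicit feature correspondences and raw image intensity, which is the property the abstract emphasises.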