Towards automatic face identification robust to ageing variation
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840664
A. Lanitis, C. Taylor
A large number of high-performance automatic face recognition systems have been reported in the literature. Many of them are robust to within-class appearance variation of subjects, such as variation in expression, lighting and pose. However, most face identification systems are sensitive to changes in the age of individuals. We present experimental results showing that the performance of automatic face recognition systems depends on the age difference of subjects between the training and test images. We also demonstrate that automatic age simulation techniques can be used to design face recognition systems that are robust to ageing variation. In this approach, the perceived age of the subjects in the training and test images is modified before the training and classification procedures, so that ageing variation is eliminated. Experimental results demonstrate that the performance of our face recognition system improves significantly when this approach is adopted.
Real-time stereo tracking for head pose and gaze estimation
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840622
R. Newman, Y. Matsumoto, S. Rougeaux, A. Zelinsky
Computer systems which analyse human face/head motion have attracted significant attention recently, as there are a number of interesting and useful applications. Not least among these is the goal of tracking the head in real time. A useful extension of this problem is to estimate the subject's gaze point in addition to his/her head pose. This paper describes a real-time stereo vision system which determines the head pose and gaze direction of a human subject. Its accuracy makes it useful for a number of applications, including human/computer interaction, consumer research and ergonomic assessment.
SFS based view synthesis for robust face recognition
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840648
Wenyi Zhao, R. Chellappa
Sensitivity to variations in pose is a challenging problem in face recognition using appearance-based methods. More specifically, the appearance of a face changes dramatically when viewing and/or lighting directions change. Various approaches have been proposed to solve this difficult problem. They can be broadly divided into three classes: (1) multiple image-based methods, where multiple images of various poses per person are available; (2) hybrid methods, where multiple example images are available during learning but only one database image per person is available during recognition; and (3) single image-based methods, where no example-based learning is carried out. We present a method that falls into class (3). This method, based on shape-from-shading (SFS), improves the performance of a face recognition system in handling variations due to pose and illumination via image synthesis.
Viewpoint-invariant learning and detection of human heads
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840607
Markus Weber, W. Einhäuser, M. Welling, P. Perona
We present a method to learn models of human heads for the purpose of detection from different viewing angles. We focus on a model where objects are represented as constellations of rigid features (parts). Variability is represented by a joint probability density function (PDF) on the shape of the constellation. In the first stage, the method automatically identifies distinctive features in the training set using an interest operator followed by vector quantization. The set of model parameters, including the shape PDF, is then learned using expectation maximization. Experiments show good generalization performance to novel viewpoints and unseen faces. Performance is above 90% correct with less than 1 s computation time per image.
Memory-based face recognition for visitor identification
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840637
T. Sim, R. Sukthankar, M. D. Mullin, S. Baluja
We show that a simple, memory-based technique for appearance-based face recognition, motivated by the real-world task of visitor identification, can outperform more sophisticated algorithms that use principal components analysis (PCA) and neural networks. This technique is closely related to correlation templates; however, we show that the use of novel similarity measures greatly improves performance. We also show that augmenting the memory base with additional, synthetic face images results in further improvements in performance. Results of extensive empirical testing on two standard face recognition datasets are presented, and direct comparisons with published work show that our algorithm achieves comparable (or superior) results. Our system is incorporated into an automated visitor identification system that has been operating successfully in an outdoor environment since January 1999.
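The memory-based scheme above amounts to nearest-neighbour matching of a probe image against stored gallery images under a similarity measure. The paper's novel similarity measures are not specified in the abstract, so the sketch below uses zero-mean normalized cross-correlation, the classic correlation-template score, as a stand-in; the function names and toy gallery are illustrative only.

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation between two equal-size images."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def identify(probe, gallery):
    """Return the label of the gallery image most similar to the probe.

    gallery: dict mapping identity label -> stored face image.
    """
    return max(gallery, key=lambda label: normalized_correlation(probe, gallery[label]))
```

Swapping in a different similarity function (or adding synthetic images to the gallery dict, as the paper does) requires no change to the matching loop, which is the appeal of the memory-based design.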
A probabilistic sensor for the perception of activities
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840652
Olivier Chomat, J. Crowley
This paper presents a new technique for the perception of activities using a statistical description of spatio-temporal properties. With this approach, the probability of an activity in a spatio-temporal image sequence is computed by applying Bayes' rule to the joint statistics of the responses of motion energy receptive fields. A set of motion energy receptive fields is designed in order to sample the power spectrum of a moving texture. Their structure relates to the spatio-temporal energy models of Adelson and Bergen, where measures of local visual motion are extracted by comparing the outputs of triads of Gabor energy filters. The probability density function required for Bayes' rule is then estimated for each class of activity by computing multi-dimensional histograms of the outputs of the set of receptive fields. The perception of activities is achieved according to Bayes' rule: the result at a given time is the map of the conditional probabilities that each pixel belongs to an activity of the training set. The approach is validated with experiments on the perception of activities of walking persons in a visual surveillance scenario. Results are robust to changes in illumination conditions, to occlusions and to changes in texture.
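The histogram-plus-Bayes'-rule pipeline described above can be sketched compactly. For brevity this sketch is one-dimensional (a single receptive-field response per sample), whereas the paper builds multi-dimensional histograms over a whole filter bank; the function names, bin count, and Laplace smoothing are illustrative assumptions.

```python
import numpy as np

def train_histograms(responses_by_class, bins=8, lo=0.0, hi=1.0):
    """Estimate P(response | class) as a smoothed histogram per activity class."""
    edges = np.linspace(lo, hi, bins + 1)
    pdfs = {}
    for label, samples in responses_by_class.items():
        hist, _ = np.histogram(samples, bins=edges)
        pdfs[label] = (hist + 1) / (hist.sum() + bins)  # Laplace smoothing
    return pdfs, edges

def classify(response, pdfs, edges, priors=None):
    """Bayes' rule: argmax over classes c of P(response | c) * P(c)."""
    b = int(np.searchsorted(edges, response, side="right")) - 1
    b = max(0, min(b, len(edges) - 2))  # clamp to a valid bin index
    priors = priors or {c: 1.0 / len(pdfs) for c in pdfs}
    return max(pdfs, key=lambda c: pdfs[c][b] * priors[c])
```

Applied per pixel over the receptive-field response maps, the same argmax yields the conditional-probability map the abstract describes.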
Constraint-conscious smoothing framework for the recovery of 3D articulated motion from image sequences
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840677
Hiroyuki Segawa, H. Shioya, N. Hiraki, T. Totsuka
3D articulated motion is recovered from image sequences by relying on a recursive smoothing framework. In conventional recursive filtering frameworks, the filter may misestimate the state due to degenerate observations. To cope with this problem, we take into account knowledge about the limitations of the state-space. Our novel estimation framework relies on the combination of a smoothing algorithm with a "constraint-conscious" enhanced Kalman filter. The technique is shown to be effective for the recovery of experimental 3D articulated motions, making it a good candidate for marker-less motion capture applications.
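The idea of injecting state-space limits into a Kalman filter can be illustrated with a scalar example: run the ordinary measurement update, then project the estimate back into the feasible range (e.g. a joint-angle limit). Naive clamping is only a crude stand-in for the paper's "constraint-conscious" filter, which integrates the constraints into the estimation itself; everything below is an illustrative simplification.

```python
def kalman_update_constrained(x, P, z, R, lo, hi):
    """One scalar Kalman measurement update, then projection onto [lo, hi].

    x, P : prior state estimate and its variance
    z, R : measurement and measurement-noise variance
    lo, hi : known physical limits of the state (e.g. joint-angle range)
    """
    K = P / (P + R)            # Kalman gain
    x_post = x + K * (z - x)   # standard measurement update
    P_post = (1.0 - K) * P
    return min(max(x_post, lo), hi), P_post
```

A degenerate observation (here, an outlier measurement far outside the joint's range) would drag an unconstrained filter off the feasible manifold; the projection step keeps the estimate physically plausible.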
Face shape extraction and recognition using 3D morphing and distance mapping
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840608
Chongzhen Zhang, F. Cohen
We describe a novel approach for creating a 3D face structure from multiple image views of a human face taken at a priori unknown poses by appropriately morphing a generic 3D face. A 3D cubic explicit polynomial is used to morph a generic face into the specific face structure. This allows the creation of a database of 3D faces that is used in identifying a person (in the database) from one or more arbitrary image views. The estimation of a person's 3D face and its recognition from the database of faces are achieved through the use of a distance map metric. The use of this metric avoids either resorting to the formidable task of establishing feature point correspondences in the image views, or, even more severely, relying on the extremely view-sensitive image intensity (texture). Experimental results are shown for images of real faces, and excellent results are obtained.
Hand gesture recognition using input-output hidden Markov models
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840674
S. Marcel, O. Bernier, J. Viallet, D. Collobert
A new hand gesture recognition method based on input-output hidden Markov models is presented. This method deals with the dynamic aspects of gestures. Gestures are extracted from a sequence of video images by tracking the skin-color blobs corresponding to the hand in a body-face space centered on the face of the user. Our goal is to recognize two classes of gestures: deictic and symbolic.
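The feature-extraction step described above (tracking a skin-colour blob and expressing its position relative to the user's face) can be sketched as follows. How the skin mask is produced and how the face is detected are outside the abstract's scope, so the helper names, inputs, and face-size normalization below are illustrative assumptions.

```python
import numpy as np

def blob_centroid(mask):
    """Centroid (row, col) of True pixels in a binary skin-colour mask."""
    ys, xs = np.nonzero(mask)
    return (ys.mean(), xs.mean()) if len(ys) else None

def to_body_face_space(hand_rc, face_rc, face_size):
    """Express a hand position relative to the face centre, in face-size units.

    This normalization makes the gesture trajectory (the sequence fed to the
    input-output HMM) roughly invariant to the user's distance from the camera.
    """
    return ((hand_rc[0] - face_rc[0]) / face_size,
            (hand_rc[1] - face_rc[1]) / face_size)
```

A gesture is then a time series of such normalized positions, one per video frame, which the IOHMM classifies as deictic or symbolic.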
Virtual view face image synthesis using 3D spring-based face model from a single image
Pub Date: 2000-03-26 · DOI: 10.1109/AFGR.2000.840685
G. Feng, P. Yuen, J. Lai
It is known that 2D views of a person can be synthesised if the 3D face model of that person is available. This paper proposes a new method, called the 3D spring-based face model (SBFM), to determine the precise face model of a person with different poses and facial expressions from a single image. The SBFM combines the concepts of the generic 3D face model in computer graphics and the deformable template in computer vision. Face image databases from the MIT AI Laboratory and Yale University are used to test our proposed method, and the results are encouraging.