Human head tracking using adaptive appearance models with a fixed-viewpoint pan-tilt-zoom camera
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840626
K. Yachi, T. Wada, T. Matsuyama
We propose a method for detecting and tracking a human head in real time from an image sequence. The proposed method has three advantages: (1) we employ a fixed-viewpoint pan-tilt-zoom camera to acquire image sequences; this camera eliminates variations in head appearance caused by camera rotations about the viewpoint; (2) we prepare a variety of contour models of head appearance and relate them to the camera parameters, which allows us to adaptively select a model to cope with variations in head appearance due to human activities; (3) we use the model parameters obtained by detecting the head in the previous image to estimate those to be fitted in the current image, which reduces the computational time required for head detection. Accordingly, both detection accuracy and computational cost are improved, and robust head detection and tracking are realized in near real time. Experimental results in real situations show the effectiveness of our method.
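A minimal sketch of this predict-select-fit loop, assuming a table of contour models indexed by camera (pan, tilt) and a toy hill-climbing fit against an edge image; the model table, search radius and scoring below are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

def select_model(models, pan, tilt):
    """Pick the contour model whose associated (pan, tilt) is nearest."""
    keys = np.array(list(models.keys()))
    idx = np.argmin(np.hypot(keys[:, 0] - pan, keys[:, 1] - tilt))
    return models[tuple(keys[idx])]

def fit_contour(edge_image, model, init_params, search_radius=8):
    """Search for the model position around the previous frame's fit (toy fitting)."""
    best, best_score = init_params, -np.inf
    for dx in range(-search_radius, search_radius + 1, 2):
        for dy in range(-search_radius, search_radius + 1, 2):
            x, y = init_params[0] + dx, init_params[1] + dy
            pts = (model + np.array([x, y])).astype(int)
            inside = (pts[:, 0] >= 0) & (pts[:, 0] < edge_image.shape[1]) & \
                     (pts[:, 1] >= 0) & (pts[:, 1] < edge_image.shape[0])
            score = edge_image[pts[inside, 1], pts[inside, 0]].sum()
            if score > best_score:
                best, best_score = (x, y), score
    return best

# models: {(pan, tilt): Nx2 contour points}; an ellipse as a stand-in head contour.
theta = np.linspace(0, 2 * np.pi, 64)
ellipse = np.stack([12 * np.cos(theta), 16 * np.sin(theta)], axis=1)
models = {(0.0, 0.0): ellipse, (30.0, 0.0): ellipse * [0.9, 1.0]}

edges = np.zeros((240, 320))   # stand-in edge image
prev_fit = (160, 120)          # model parameters from the previous frame
model = select_model(models, pan=5.0, tilt=0.0)
cur_fit = fit_contour(edges, model, prev_fit)
```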
{"title":"Human head tracking using adaptive appearance models with a fixed-viewpoint pan-tilt-zoom camera","authors":"K. Yachi, T. Wada, T. Matsuyama","doi":"10.1109/AFGR.2000.840626","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840626","url":null,"abstract":"We propose a method for detecting and tracking a human head in real time from an image sequence. The proposed method has three advantages: (1) we employ a fixed-viewpoint pan-tilt-zoom camera to acquire image sequences; with the camera, we eliminate the variations in the head appearance due to camera rotations with respect to the viewpoint; (2) we prepare a variety of contour models of the head appearances and relate them to the camera parameters; this allows us to adaptively select the model to deal with the variations in the head appearance due to human activities; (3) we use the model parameters obtained by detecting the head in the previous image to estimate those to be fitted in the current image; this estimation facilitates computational time for the head detection. Accordingly, the accuracy of the detection and required computational time are both improved and, at the same time, the robust head detection and tracking are realized in almost real time. Experimental results in the real situation show the effectiveness of our method.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127577239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving face tracking with 2D template warping
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840623
R. Kjeldsen, Aya Aner
Tracking the face of a computer user as he or she looks at various parts of the screen is a fundamental tool for a variety of perceptual user interface applications. The authors have developed a simple but surprisingly robust tracking algorithm based on template matching and applied it successfully. This paper describes extensions to that algorithm which improve performance at large facial rotation angles. The method pre-distorts the single training template using 2D image transformations to simulate 3D facial rotations, and thereby avoids many of the problems associated with using a complex 3D head model. It is robust to variations in the environment and well suited to practical applications in typical computing environments.
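A hedged sketch of the idea: approximate out-of-plane head rotations by 2D warps of a single training template, then keep the warp that matches best. The warp family used here (horizontal foreshortening by cos(yaw)) and the yaw sampling are assumptions; the paper's exact transformation set is not reproduced.

```python
import cv2
import numpy as np

def warped_templates(template, yaws_deg=(-30, -15, 0, 15, 30)):
    """Generate pre-distorted copies of one grayscale template."""
    h, w = template.shape[:2]
    out = []
    for yaw in yaws_deg:
        # Foreshorten horizontally by cos(yaw) to mimic an out-of-plane rotation.
        s = np.cos(np.radians(yaw))
        M = np.float32([[s, 0, (1 - s) * w / 2], [0, 1, 0]])
        out.append((yaw, cv2.warpAffine(template, M, (w, h))))
    return out

def best_match(frame, template):
    """Match every warped template; return (score, top-left location, yaw)."""
    best = (-1.0, None, None)
    for yaw, t in warped_templates(template):
        res = cv2.matchTemplate(frame, t, cv2.TM_CCOEFF_NORMED)
        _, score, _, loc = cv2.minMaxLoc(res)
        if score > best[0]:
            best = (score, loc, yaw)
    return best
```

The winning yaw doubles as a coarse estimate of the facial rotation angle, which is what lets a single training template survive large head turns.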
{"title":"Improving face tracking with 2D template warping","authors":"R. Kjeldsen, Aya Aner","doi":"10.1109/AFGR.2000.840623","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840623","url":null,"abstract":"Tracking the face of a computer user as he looks at various parts of the screen is a fundamental tool for a variety of perceptual user interface applications. The authors have developed a simple but surprisingly robust tracking algorithm based on template matching and applied it successfully. This paper describes extensions to that algorithm, which improves performance at large facial rotation angles. The method is based on pre-distorting the single training template using 2D image transformations to simulate 3D facial rotations. The method avoids many of the problems associated with using a complex 3D head model. It is robust to variations in the environment and well-suited to use in practical applications in typical computing environments.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126320685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast tracking of hands and fingertips in infrared images for augmented desk interface
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840675
Yoichi Sato, Yoshinori Kobayashi, H. Koike
We introduce a fast and robust method for tracking the positions of the centers and fingertips of both the right and left hands. Our method makes use of infrared camera images for reliable detection of a user's hands, and uses a template matching strategy for finding fingertips. This method is an essential part of our augmented desk interface, in which a user can, with natural hand gestures, simultaneously manipulate both physical objects and electronically projected objects on a desk, e.g., a textbook and related WWW pages. Previous tracking methods, typically based on color segmentation or background subtraction, simply do not perform well in this type of application, because the observed color of human skin and the image background may change significantly due to the projection of various objects onto the desk. In contrast, our proposed method was shown to be effective even in such a challenging situation through demonstration in our augmented desk interface. This paper describes the details of our tracking method as well as typical applications in our augmented desk interface.
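A minimal sketch of this pipeline under assumed parameter values: hands appear in a narrow intensity band of the infrared image (body temperature), so binarization isolates them regardless of what is projected on the desk, and correlating a small disc template with the hand mask yields fingertip candidates. The thresholds, blob-area cutoff and disc radius are illustrative, not the authors' values.

```python
import cv2
import numpy as np

def find_hands(ir_image, lo=110, hi=160, min_area=800):
    """Binarize the IR image around skin temperature and return hand blobs."""
    mask = ((ir_image >= lo) & (ir_image <= hi)).astype(np.uint8) * 255
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    hands = [(centroids[i], labels == i) for i in range(1, n)
             if stats[i, cv2.CC_STAT_AREA] >= min_area]
    return mask, hands  # per-hand (center, pixel mask)

def fingertip_candidates(mask, radius=6, thresh=0.7):
    """Correlate a filled disc with the hand mask; strong peaks suggest fingertips."""
    disc = np.zeros((2 * radius + 1, 2 * radius + 1), np.uint8)
    cv2.circle(disc, (radius, radius), radius, 255, -1)
    res = cv2.matchTemplate(mask, disc, cv2.TM_CCOEFF_NORMED)
    ys, xs = np.where(res > thresh)
    return list(zip(xs + radius, ys + radius))
```

Note that a plain disc also matches inside the palm, so a practical version would additionally reject candidates that are not near the blob boundary.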
{"title":"Fast tracking of hands and fingertips in infrared images for augmented desk interface","authors":"Yoichi Sato, Yoshinori Kobayashi, H. Koike","doi":"10.1109/AFGR.2000.840675","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840675","url":null,"abstract":"We introduce a fast and robust method for tracking positions of the centers and the fingertips of both right and left hands. Our method makes use of infrared camera images for reliable detection of a user's hands, and uses a template matching strategy for finding fingertips. This method is an essential part of our augmented desk interface in which a user can, with natural hand gestures, simultaneously manipulate both physical objects and electronically projected objects on a desk, e.g., a textbook and related WWW pages. Previous tracking methods which are typically based on color segmentation or background subtraction simply do not perform well in this type of application because an observed color of human skin and image backgrounds may change significantly due to protection of various objects onto a desk. In contrast, our proposed method was shown to be effective even in such a challenging situation through demonstration in our augmented desk interface. This paper describes the details of our tracking method as well as typical applications in our augmented desk interface.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132029725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance assessment of face-verification based access control system
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840638
G. V. Wheeler, P. Courtney, Tim Cootes, C. Taylor
In recent years there has been much progress in the development of facial recognition systems. The FERET series of tests reported the black-box performance of several such systems working on stored face images. Much less effort has been spent studying the behaviour of systems under realistic conditions of use. We describe and analyse the results of a trial of a door access control system built on a model-based face-verification approach. The trial consisted of 10 registered users making over 200 accesses during a 2-week period. We describe the internal failure modes and the performance characteristics of the system, identify inter- and intra-person dependencies, and make recommendations for future work.
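The kind of trial analysis described above reduces to simple per-user statistics over access logs. A hedged sketch, with a made-up log format of (user id, accepted flag) pairs for genuine attempts, of computing overall and per-user false rejection rates; it is illustrative, not the authors' evaluation code.

```python
from collections import defaultdict

def rejection_rates(log):
    """log: iterable of (user_id, accepted: bool) for genuine access attempts."""
    tries, fails = defaultdict(int), defaultdict(int)
    for user, accepted in log:
        tries[user] += 1
        fails[user] += (not accepted)
    per_user = {u: fails[u] / tries[u] for u in tries}   # intra-person behaviour
    overall = sum(fails.values()) / sum(tries.values())
    return overall, per_user

overall_frr, per_user_frr = rejection_rates(
    [("u1", True), ("u1", False), ("u2", True), ("u2", True)]
)
```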
{"title":"Performance assessment of face-verification based access control system","authors":"G. V. Wheeler, P. Courtney, Tim Cootes, C. Taylor","doi":"10.1109/AFGR.2000.840638","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840638","url":null,"abstract":"In recent years there has been much progress in the development of facial recognition systems. The FERET series of tests reported the black box performance of several such systems working on stored face images. Much less effort has been spent in studying the behaviour of systems under realistic conditions of use. We describe and analyse the result of a trial of a door access control system based on a model-based approach. The trial consisted of 10 registered users making over 200 accesses during a 2 week period. We describe the internal failure modes and the performance characteristics of the system, identify inter- and intra-person dependencies and make recommendations for future work.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132000681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A method for recognizing a sequence of sign language words represented in a Japanese sign language sentence
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840671
H. Sagawa, M. Takeuchi
To automatically interpret Japanese sign language (JSL), the recognition of signed words must be more accurate and the effects of extraneous gestures removed. We describe the parameters and the algorithms used to accomplish this. We experimented with 200 JSL sentences and demonstrated that recognition performance could be considerably improved.
{"title":"A method for recognizing a sequence of sign language words represented in a Japanese sign language sentence","authors":"H. Sagawa, M. Takeuchi","doi":"10.1109/AFGR.2000.840671","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840671","url":null,"abstract":"To automatically interpret Japanese sign language (JSL), the recognition of signed words must be more accurate and the effects of extraneous gestures removed. We describe the parameters and the algorithms used to accomplish this. We experimented with 200 JSL sentences and demonstrated that recognition performance could be considerably improved.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"9 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115477641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Human body postures from trinocular camera images
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840654
Shoichiro Iwasawa, J. Ohya, Kazuhiko Takahashi, T. Sakaguchi, S. Morishima, K. Ebihara
This paper proposes a new real-time method for estimating human postures in 3D from trinocular images. In this method, upper-body orientation detection and a heuristic contour analysis are performed on the human silhouettes extracted from the trinocular images, so that representative points such as the top of the head can be located. The major joint positions are estimated by a genetic-algorithm-based learning procedure. The 3D coordinates of the representative points and joints are then obtained from two of the views, chosen by evaluating the appropriateness of the three views. The proposed method, implemented on a personal computer, runs in real time. Experimental results show high estimation accuracy and the effectiveness of the view-selection process.
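A sketch of two front-end pieces of such a pipeline, under assumed simplifications: silhouette extraction by per-pixel background differencing, and locating the top of the head as the highest silhouette pixel near the silhouette's centroid column. The GA-trained joint estimator and the three-view appropriateness evaluation are not reproduced.

```python
import numpy as np

def silhouette(frame, background, thresh=30):
    """Binary silhouette via background differencing (grayscale images)."""
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

def top_of_head(sil, band=10):
    """Highest foreground pixel, searched near the silhouette's centroid column."""
    ys, xs = np.nonzero(sil)
    if len(xs) == 0:
        return None
    cx = int(xs.mean())
    col = ys[np.abs(xs - cx) < band]
    if len(col):
        return (cx, int(col.min()))
    return (int(xs[ys.argmin()]), int(ys.min()))
```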
{"title":"Human body postures from trinocular camera images","authors":"Shoichiro Iwasawa, J. Ohya, Kazuhiko Takahashi, T. Sakaguchi, S. Morishima, K. Ebihara","doi":"10.1109/AFGR.2000.840654","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840654","url":null,"abstract":"This paper proposes a new real-time method for estimating human postures in 3D from trinocular images. In this method, an upper body orientation detection and a heuristic contour analysis are performed on the human silhouettes extracted from the trinocular images so that representative points such as the top of the head can be located. The major joint positions are estimated based on a genetic algorithm-based learning procedure. 3D coordinates of the representative points and joints are then obtained from the two views by evaluating the appropriateness of the three views. The proposed method implemented on a personal computer runs in real-time. Experimental results show high estimation accuracies and the effectiveness of the view selection process.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121932724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic facial caricaturing system based on the gaze direction of gallery
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840624
K. Murakami, M. Tominaga, H. Koshimizu
Facial caricaturing is the process of rendering a human visual impression onto paper or other media. It should be discussed from the viewpoints of the three relations among the model, the caricaturist and the gallery, and some kind of interactive mechanism is required between the caricaturist and the gallery. We propose a dynamic caricaturing system. In our system, an in-betweening method realizes the generation mechanism from the caricaturist to the gallery, while eye-camera vision realizes the feedback mechanism from the gallery to the caricaturist; this is the original and unique point of our system. The gallery member wears a head-mounted eye camera, and the system reflects his or her visual characteristics directly in the resulting caricature. After observing the image of the model and analyzing the gaze direction and distribution, the system deforms the characteristic, impressive facial parts more strongly than the non-impressive ones, generating a caricature suited especially to that gallery. We demonstrate experimentally the effectiveness of this method in integrating these viewpoints.
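A hedged sketch of the feedback idea: facial parts that attract more of the gallery's gaze are exaggerated more strongly, by pushing landmarks away from a mean face with a per-part gain scaled by gaze dwell time. The part grouping, gain law and landmark representation are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def caricature(landmarks, mean_face, part_index, dwell, base_gain=0.5):
    """
    landmarks, mean_face: (N, 2) arrays of facial feature points.
    part_index: length-N array mapping each point to a facial part id.
    dwell: {part_id: seconds of gaze} measured with the eye camera.
    """
    total = sum(dwell.values()) or 1.0
    out = landmarks.astype(float).copy()
    for part, t in dwell.items():
        # Gain > 1 exaggerates: parts the gallery looked at longest move
        # furthest from the mean face.
        gain = 1.0 + base_gain * (t / total)
        sel = part_index == part
        out[sel] = mean_face[sel] + gain * (landmarks[sel] - mean_face[sel])
    return out
```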
{"title":"Dynamic facial caricaturing system based on the gaze direction of gallery","authors":"K. Murakami, M. Tominaga, H. Koshimizu","doi":"10.1109/AFGR.2000.840624","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840624","url":null,"abstract":"Facial caricaturing is a representation process of human visual impression onto paper or other media. Facial caricaturing should be discussed from multiple viewpoints of three relations among the model, the caricaturist and the gallery. Furthermore, some kinds of interactive mechanism should be required between the caricaturist and the gallery. We propose a dynamic caricaturing system. In our system the utilization of an in-betweening method realizes the generation mechanism from the caricaturist to the gallery, and on the contrary, the utilization of eye-camera vision realizes the feedback mechanism from the gallery to the caricaturist. This is an original and unique point of our system. The gallery mounts an eye-camera on the head, and the system reflects visual characteristics of the gallery directly onto the works of facial caricature. After observing the image of the model and analyzing the gaze direction and distribution, the system deforms some characteristic and impressive facial parts more strongly than other non-impressive facial parts, and generates the caricature which is suited especially for the gallery. We demonstrate experimentally the effectivity of this method to integrate these kinds of viewpoints.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132249744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-organized integration of adaptive visual cues for face tracking
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840619
J. Triesch, C. Malsburg
A mechanism for the self-organized integration of different adaptive cues is proposed. In democratic integration the cues agree on a result and each cue adapts towards the result agreed upon. A technical formulation of this scheme is employed in a face tracking system. The self-organized adaptivity lends itself to suppression and recalibration of discordant cues. Experiments show that the system is robust to sudden changes in the environment as long as the changes disrupt only a minority of cues at the same time, although all cues may be affected in the long run.
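A compact sketch of one democratic-integration step as described above: each cue produces a saliency map, the maps are fused by reliability-weighted averaging, the fused peak is taken as the tracking result, and each weight relaxes toward how well its cue agreed with that result. The time constant and the agreement score are assumed details.

```python
import numpy as np

def democratic_step(saliency_maps, weights, tau=10.0):
    """One fusion-and-adaptation step over a list of 2D saliency maps."""
    fused = sum(w * s for w, s in zip(weights, saliency_maps))
    result = np.unravel_index(np.argmax(fused), fused.shape)
    # Quality of each cue = its own saliency at the agreed result, normalized.
    quality = np.array([s[result] for s in saliency_maps])
    quality = quality / (quality.sum() or 1.0)
    # First-order relaxation of the reliabilities toward the current qualities;
    # discordant cues are suppressed, recovering cues are recalibrated.
    weights = weights + (quality - weights) / tau
    return result, weights / weights.sum()

maps = [np.random.rand(60, 80) for _ in range(3)]   # stand-in cue outputs
weights = np.full(3, 1 / 3)
pos, weights = democratic_step(maps, weights)
```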
{"title":"Self-organized integration of adaptive visual cues for face tracking","authors":"J. Triesch, C. Malsburg","doi":"10.1109/AFGR.2000.840619","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840619","url":null,"abstract":"A mechanism for the self-organized integration of different adaptive cues is proposed. In democratic integration the cues agree on a result and each cue adapts towards the result agreed upon. A technical formulation of this scheme is employed in a face tracking system. The self-organized adaptivity lends itself to suppression and recalibration of discordant cues. Experiments show that the system is robust to sudden changes in the environment as long as the changes disrupt only a minority of cues at the same time, although all cues may be affected in the long run.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130769558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An algorithm for real-time stereo vision implementation of head pose and gaze direction measurement
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840680
Y. Matsumoto, A. Zelinsky
To build smart human interfaces, a system needs to know the user's intention and point of attention. Since a person's head pose and gaze direction are closely related to his or her intention and attention, detecting such information can be used to build natural and intuitive interfaces. We describe our real-time stereo face tracking and gaze detection system, which measures head pose and gaze direction simultaneously. The key aspect of our system is the use of real-time stereo vision together with a simple algorithm suitable for real-time processing. Since the 3D coordinates of facial features can be measured directly in our system, we can significantly simplify the 3D model-fitting algorithm used to obtain the full 3D pose of the head, compared with conventional systems that use a monocular camera. Consequently, we achieved a non-contact, passive, real-time, robust, accurate and compact measurement system for head pose and gaze direction.
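Because the stereo system measures facial features directly in 3D, model fitting reduces to rigid alignment of two 3D point sets. A standard least-squares solution (the SVD/Kabsch method) is sketched below as one way to realize this step; the paper's own fitting procedure may differ in detail.

```python
import numpy as np

def rigid_fit(model_pts, measured_pts):
    """Find R, t minimizing ||R @ model + t - measured|| over corresponding 3D points."""
    mu_m, mu_s = model_pts.mean(0), measured_pts.mean(0)
    H = (model_pts - mu_m).T @ (measured_pts - mu_s)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard keeps R a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_s - R @ mu_m
    return R, t   # head pose: rotation and translation of the face model
```

With a monocular camera the same pose would have to be recovered from 2D projections, which is exactly the complexity this direct 3D measurement avoids.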
{"title":"An algorithm for real-time stereo vision implementation of head pose and gaze direction measurement","authors":"Y. Matsumoto, A. Zelinsky","doi":"10.1109/AFGR.2000.840680","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840680","url":null,"abstract":"To build smart human interfaces, it is necessary for a system to know a user's intention and point of attention. Since the motion of a person's head pose and gaze direction are deeply related with his/her intention and attention, detection of such information can be utilized to build natural and intuitive interfaces. We describe our real-time stereo face tracking and gaze detection system to measure head pose and gaze direction simultaneously. The key aspect of our system is the use of real-time stereo vision together with a simple algorithm which is suitable for real-time processing. Since the 3D coordinates of the features on a face can be directly measured in our system, we can significantly simplify the algorithm for 3D model fitting to obtain the full 3D pose of the head compared with conventional systems that use monocular camera. Consequently we achieved a non-contact, passive, real-time, robust, accurate and compact measurement system for head pose and gaze direction.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133782724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Face matching through information theoretical attention points and its applications to face detection and classification
Pub Date: 2000-03-26 | DOI: 10.1109/AFGR.2000.840609
K. Hotta, T. Mishima, Takio Kurita, S. Umeyama
This paper presents a face matching method based on information-theoretical attention points. The attention points are selected as the points where the outputs of Gabor filters applied to a contrast-filtered image (Gabor features) carry rich information. The information value of the Gabor features at each point is used as a weight, and the weighted sum of the local correlations is used as the similarity measure for matching. To cope with changes in the scale of a face, several images at different scales are generated from the input image by interpolation, and the best match is searched for. By using attention points chosen from an information-theoretical point of view, the matching becomes robust in various environments. The method is applied to face detection of a known person and to face classification. Its effectiveness is confirmed by experiments using face images captured over several years under different environments.
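A sketch of the similarity measure described above: normalized correlations of corresponding Gabor feature vectors, combined as a weighted sum with the information values as weights. Gabor feature extraction itself is abstracted away; the inputs are assumed to be per-point feature vectors with precomputed information weights.

```python
import numpy as np

def weighted_similarity(feats_a, feats_b, info_weights):
    """
    feats_a, feats_b: (N, D) Gabor feature vectors at N attention points.
    info_weights: (N,) information values used as per-point weights.
    """
    # Normalized correlation between corresponding feature vectors.
    na = feats_a / (np.linalg.norm(feats_a, axis=1, keepdims=True) + 1e-9)
    nb = feats_b / (np.linalg.norm(feats_b, axis=1, keepdims=True) + 1e-9)
    corr = (na * nb).sum(axis=1)
    # Information-weighted sum, normalized so the score stays in [-1, 1].
    return (info_weights * corr).sum() / (info_weights.sum() + 1e-9)
```

For the scale search, the same measure would simply be evaluated on features extracted from each interpolated image, keeping the scale with the highest score.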
{"title":"Face matching through information theoretical attention points and its applications to face detection and classification","authors":"K. Hotta, T. Mishima, Takio Kurita, S. Umeyama","doi":"10.1109/AFGR.2000.840609","DOIUrl":"https://doi.org/10.1109/AFGR.2000.840609","url":null,"abstract":"This paper presents a face matching method through information theoretical attention points. The attention points are selected as the points where the outputs of Gabor filters applied to the contrast-filtered image (Gabor features) have rich information. The information value of Gabor features of the certain point is used as the weight and the weighed sum of the correlations is used as the similarity measure for the matching. To cope with the scale changes of a face, several images with different scales are generated by interpolation from the input image and the best match is searched. By using the attention points given from the information theoretical point of view, the matching becomes robust under various environments. This matching method is applied to face detection of a known person and face classification. The effectiveness of the proposed method is confirmed by experiments using the face images captured over years under the different environments.","PeriodicalId":360065,"journal":{"name":"Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123237646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}