Real time face recognition using decision fusion of neural classifiers in the visible and thermal infrared spectrum
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425327
V. Neagoe, A. Ropot, A. Mugioiu
This paper is dedicated to multispectral facial image recognition using decision fusion of neural classifiers. The novelty is that each classifier is based on the Concurrent Self-Organizing Maps (CSOM) model, previously proposed by the first author of this paper. Our main achievement is the implementation of a real-time CSOM face recognition system using decision fusion, which combines the recognition scores generated from the visual channels (R, G, and B, or Y) with those of a thermal infrared classifier. As a source of color and infrared images, we used our VICFACE database of 38 subjects. Each picture has 160 × 120 pixels; for each subject there are pictures corresponding to various facial expressions and illuminations, in both the visible and infrared spectrum. The spectral sensitivity of the infrared images corresponds to the long-wave range of 7.5-13 µm. Very good experimental results are reported in terms of recognition score.
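To make the fusion step concrete, here is a minimal Python sketch of score-level decision fusion across channel classifiers; the weighting scheme, normalisation, and function names are illustrative assumptions, not the authors' CSOM implementation.

```python
import numpy as np

def fuse_decisions(channel_scores, weights=None):
    """Fuse per-class recognition scores from several channel classifiers.

    channel_scores: dict mapping a channel name ("R", "G", "B", "thermal")
                    to a 1-D array of class scores (one entry per subject).
    weights:        optional dict of per-channel weights; equal weights
                    are assumed when omitted.
    Returns the index of the winning class and the fused score vector.
    """
    channels = list(channel_scores)
    if weights is None:
        weights = {c: 1.0 / len(channels) for c in channels}
    # Normalise each channel's scores so they are comparable before fusion.
    fused = sum(
        weights[c] * (channel_scores[c] / np.sum(channel_scores[c]))
        for c in channels
    )
    return int(np.argmax(fused)), fused

# Hypothetical usage with 38 subjects, as in the VICFACE database.
rng = np.random.default_rng(0)
scores = {c: rng.random(38) for c in ("R", "G", "B", "thermal")}
winner, _ = fuse_decisions(scores)
print("predicted subject:", winner)
```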
{"title":"Real time face recognition using decision fusion of neural classifiers in the visible and thermal infrared spectrum","authors":"V. Neagoe, A. Ropot, A. Mugioiu","doi":"10.1109/AVSS.2007.4425327","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425327","url":null,"abstract":"This paper is dedicated to multispectral facial image recognition, using decision fusion of neural classifiers. The novelty of this paper is that any classifier is based on the model of Concurrent Self-Organizing Maps (CSOM), previously proposed by first author of this paper. Our main achievement is the implementation of a real time CSOM face recognition system using the decision fusion that combines the recognition scores generated from visual channels {(R, G, and B) or Y} with a thermal infrared classifier. As a source of color and infrared images, we used our VICFACE database of 38 subjects. Any picture has 160 times 120 pixels; for each subject there are pictures corresponding to various face expressions and illuminations, in the visual and infrared spectrum. The spectral sensitivity of infrared images corresponds to the long wave range of 7.5 - 13 mum. The very good experimental results are given regarding recognition score.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131057057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Directions in automatic video analysis evaluations at NIST
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425276
J. Garofolo
NIST has been conducting a series of evaluations in the automatic analysis of information in video since 2001. These began within the NIST text retrieval evaluation (TREC) as a pilot track on searching for information in large collections of video. The evaluation series was spun off into its own evaluation/workshop series called TRECVID. TRECVID continues to examine the challenge of extracting features for search technologies. In 2004, NIST also began an evaluation series dedicated to assessing video object detection and tracking technologies using training and test sets that were significantly larger than those used in the past, facilitating novel machine learning approaches and supporting statistically informative evaluation results. Eventually this effort was merged with other video processing evaluations being implemented in Europe under the Classification of Events, Activities, and Relationships (CLEAR) consortium. NIST's goal is to evolve these evaluations of video processing technologies towards a focus on the detection of visually observable events and 3D modeling, and to help the computer vision community make strides in the areas of accuracy, robustness, and efficiency.
{"title":"Directions in automatic video analysis evaluations at NIST","authors":"J. Garofolo","doi":"10.1109/AVSS.2007.4425276","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425276","url":null,"abstract":"NIST has been conducting a series of evaluations in the automatic analysis of information in video since 2001. These began within the NIST text retrieval evaluation (TREC) as a pilot track in searching for information in large collections of video. The evaluation series was spun off into its own evaluation/workshop series called TRECVID. TRECVID continues to examine the challenge of extracting features for search technologies. In 2004, NIST also began an evaluation series dedicated to assessing video object detection and tracking technologies using training and test sets that were significantly larger than those used in the past -facilitating novel machine learning approaches and supporting statistically-informative evaluation results. Eventually this effort was merged with other video processing evaluations being implemented in Europe under the classification of events, activities, and relationships (CLEAR) consortium. NIST's goal is to evolve these evaluations of video processing technologies towards a focus on the detection of visually observable events and 3D modeling and to help the computer vision community make strides in the areas of accuracy, robustness, and efficiency.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127885477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High performance 3D sound localization for surveillance applications
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425372
F. Keyrouz, K. Diepold, S. Keyrouz
One of the key features of the human auditory system is its nearly constant omni-directional sensitivity: for example, the system reacts to alerting signals coming from directions away from the focus of visual attention. In many surveillance situations where visual attention fails completely because the robot's cameras have no direct line of sight to the sound sources, the ability to estimate the direction of sources of danger from sound alone becomes extremely important. We present in this paper a novel method for sound localization in azimuth and elevation based on a humanoid head. The method was tested in simulations as well as in a real reverberant environment. Compared to state-of-the-art localization techniques, the method is able to localize 3D sound sources with high accuracy, even in the presence of reflections and strong distortion.
{"title":"High performance 3D sound localization for surveillance applications","authors":"F. Keyrouz, K. Diepold, S. Keyrouz","doi":"10.1109/AVSS.2007.4425372","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425372","url":null,"abstract":"One of the key features of the human auditory system, is its nearly constant omni-directional sensitivity, e.g., the system reacts to alerting signals coming from a direction away from the sight of focused visual attention. In many surveillance situations where visual attention completely fails since the robot cameras have no direct line of sight with the sound sources, the ability to estimate the direction of the sources of danger relying on sound becomes extremely important. We present in this paper a novel method for sound localization in azimuth and elevation based on a humanoid head. The method was tested in simulations as well as in a real reverberant environment. Compared to state-of-the-art localization techniques the method is able to localize with high accuracy 3D sound sources even in the presence of reflections and high distortion.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"241 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116155612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A particle filter based fusion framework for video-radio tracking in smart spaces
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425293
A. Dore, A. Cattoni, C. Regazzoni
One of the main issues for Ambient Intelligence (AmI) systems is to continuously localize the user and to detect his/her identity in order to provide dedicated services. A video-radio fusion methodology, relying on the particle filter algorithm, is proposed here to track objects in a complex, extensive environment, exploiting the complementary benefits of both systems. Visual tracking commonly outperforms radio localization in terms of precision, but it becomes unreliable under occlusions and illumination changes. Radio measurements, on the other hand, gathered from a user's radio device, are unambiguously associated with the respective target through a "virtual" identity (i.e., MAC/IP addresses). The joint use of the two data typologies allows more robust tracking and greater flexibility in the architectural setup of the AmI system. The method has been extensively tested in a simulated, off-line framework and on real-world data, proving its effectiveness.
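A minimal sketch of the kind of particle filter update such video-radio fusion relies on, assuming both sensors deliver 2D position measurements with different noise levels; the motion model, noise values, and function names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def pf_step(particles, weights, visual_meas, radio_meas,
            motion_std=0.5, vis_std=1.0, radio_std=3.0):
    """One predict/update/resample step of a bootstrap particle filter that
    fuses a precise visual position measurement with a coarser radio one.

    particles: (N, 2) array of 2-D position hypotheses.
    weights:   (N,) importance weights.
    visual_meas, radio_meas: 2-D position estimates from the two sensors
                             (either may be None when unavailable).
    """
    # Predict: simple random-walk motion model.
    particles = particles + np.random.normal(0.0, motion_std, particles.shape)

    # Update: multiply the likelihoods of whichever measurements are available.
    for meas, std in ((visual_meas, vis_std), (radio_meas, radio_std)):
        if meas is not None:
            d2 = np.sum((particles - meas) ** 2, axis=1)
            weights = weights * np.exp(-0.5 * d2 / std ** 2)
    weights = weights / np.sum(weights)

    # Systematic resampling keeps the particle count constant.
    cum = np.cumsum(weights)
    cum[-1] = 1.0  # guard against floating-point round-off
    positions = (np.arange(len(weights)) + np.random.rand()) / len(weights)
    idx = np.searchsorted(cum, positions)
    particles = particles[idx]
    weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights, np.average(particles, axis=0)
```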
{"title":"A particle filter based fusion framework for video-radio tracking in smart spaces","authors":"A. Dore, A. Cattoni, C. Regazzoni","doi":"10.1109/AVSS.2007.4425293","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425293","url":null,"abstract":"One of the main issues for Ambient Intelligence (AmI) systems is to continuously localize the user and to detect his/her identity in order to provide dedicated services. A video-radio fusion methodology, relying on the Particle Filter algorithm, is here proposed to track objects in a complex extensive environment, exploiting the complementary benefits provided by both systems. Visual tracking commonly outperforms radio localization in terms of precision but it is inefficient because of occlusions and illumination changes. Instead, radio measurements, gathered by a user's radio device, are unambiguously associated to the respective target through the \"virtual\" identity (i.e. MAC/IP addresses). The joint usage of the two data typologies allows a more robust tracking and a major flexibility in the architectural setting up of the AmI system. The method has been extensively tested in a simulated and off-line framework and on real world data proving its effectiveness.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116553916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the development of an autonomous and self-adaptable moving object detector
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425336
H. Celik, A. Hanjalic, E. Hendriks
Object detection is a crucial step in automating monitoring and surveillance. The classical approach to object detection employs supervised learning methods, which are effective in well-defined, narrow application scopes. In this paper we propose a framework for detecting moving objects in video that first learns, autonomously and on-line, the characteristic features of typical object appearances in various parts of the observed scene. The collected knowledge is then used to calibrate the system for the given scene and to separate isolated appearances of a dominant moving object from other events. Compared to supervised detectors, the proposed framework is self-adaptable and therefore able to handle the large diversity of objects and situations typical of general surveillance and monitoring applications. We demonstrate the effectiveness of our framework by employing it to isolate pedestrians in public places and cars on a highway.
{"title":"On the development of an autonomous and self-adaptable moving object detector","authors":"H. Celik, A. Hanjalic, E. Hendriks","doi":"10.1109/AVSS.2007.4425336","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425336","url":null,"abstract":"Object detection is a crucial step in automating monitoring and surveillance. A classical approach to object detection employs supervised learning methods, which are effective in well-defined narrow application scopes. In this paper we propose a framework for detecting moving objects in video, which first learns autonomously and on-line the characteristic features of typical object appearances at various parts of the observed scene. The collected knowledge is then used to calibrate the system for the given scene, and to separate isolated appearances of a dominant moving object from other events. Compared to the supervised detectors, the proposed framework is self-adaptable, and therefore able to handle large diversity of objects and situations, typical for general surveillance and monitoring applications. We demonstrate the effectiveness of our framework by employing it to isolate pedestrians in public places and cars on a highway.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131149184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using behavior analysis algorithms to anticipate security threats before they impact mission critical operations
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425328
B. Banks, Gary M. Jackson, J. Helly, David N. Chin, T. J. Smith, A. Schmidt, P. Brewer, Roger Medd, D. Masters, Annetta Burger, W. K. Krebs
The objective of this research is to identify, develop, adapt, prototype, integrate, and demonstrate open-access force protection and security technologies and processes. The goal is to provide more open public access to recreational and other non-restricted facilities on military bases and to improve overall base safety and security using advanced video and signal based surveillance. A testbed was created at the Pacific Missile Range Facility (PMRF), Kauai, Hawaii, to demonstrate novel and innovative security solutions that serve these objectives. The testbed consists of (1) novel sensors (video cameras, radio frequency identification tags, and seismic, lidar, microwave, and infrared sensors), (2) a computer, data storage, and network infrastructure, and (3) behavior analysis software. The behavior analysis software identifies patterns of behavior and discriminates between "normal" and "anomalous" behavior in order to anticipate and predict threats so that they can be interdicted before they impact mission-critical operations or cause harm to people and infrastructure.
{"title":"Using behavior analysis algorithms to anticipate security threats before they impact mission critical operations","authors":"B. Banks, Gary M. Jackson, J. Helly, David N. Chin, T. J. Smith, A. Schmidt, P. Brewer, Roger Medd, D. Masters, Annetta Burger, W. K. Krebs","doi":"10.1109/AVSS.2007.4425328","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425328","url":null,"abstract":"The objective of this research is to identify, develop, adapt, prototype, integrate and demonstrate open access force protection and security technologies and processes. The goal is to provide more open public access to recreational and other non-restricted facilities on military bases and to improve the overall base safety and security utilizing advanced video and signal based surveillance. A testbed was created at the Pacific Missile Range Facility (PMRF), Kauai, Hawaii to demonstrate novel and innovative security solutions that serve these objectives. The testbed consists of (1) novel sensors (video cameras, radio frequency identification tags, and seismic, lidar, microwave, and infrared sensors), (2) a computer, data storage, and network infrastructure, and (3) behavior analysis software. The behavior analysis software identifies patterns of behavior and discriminates \"normal\" and \"anomalous\" behavior in order to anticipate and predict threats so that they can be interdicted before they impact mission critical operations or cause harm to people and infrastructure.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132561308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Foreground object localization using a flooding algorithm based on inter-frame change and colour
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425365
I. Grinias, G. Tziritas
A Bayesian, fully automatic moving object localization method is proposed, using inter-frame differences and background/foreground colour as discrimination cues. For change detection, each pixel is classified as "changed" or "unchanged" by mixture analysis, while histograms are used for the statistical description of colours. High-confidence statistical criteria based on change detection are used to compute a map of initially labelled pixels. Finally, a region-growing algorithm, named the priority multi-label flooding algorithm, assigns pixels to labels using Bayesian dissimilarity criteria. Localization results on well-known benchmark image sequences as well as on webcam and compressed videos are presented.
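As an illustration of priority-driven multi-label region growing, the sketch below floods labels from seed pixels over a 4-connected grid, always expanding the cheapest frontier pixel first; the scalar per-pixel cost is a simplification of the paper's Bayesian dissimilarity criteria, and all names are hypothetical.

```python
import heapq
import numpy as np

def priority_flood(seeds, cost):
    """Grow labels from seed pixels over a 2-D grid, expanding the cheapest
    frontier pixel first (illustrative multi-label flooding).

    seeds: int array, 0 for unlabelled pixels, >0 for seed labels.
    cost:  float array of the same shape; dissimilarity of assigning a pixel
           to a neighbouring label (one scalar per pixel here, a simplification
           of a per-label Bayesian dissimilarity).
    """
    labels = seeds.copy()
    h, w = labels.shape
    heap = [(0.0, y, x) for y in range(h) for x in range(w) if labels[y, x] > 0]
    heapq.heapify(heap)
    while heap:
        c, y, x = heapq.heappop(heap)
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                labels[ny, nx] = labels[y, x]              # inherit the label
                heapq.heappush(heap, (c + cost[ny, nx], ny, nx))
    return labels
```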
{"title":"Foreground object localization using a flooding algorithm based on inter-frame change and colour","authors":"I. Grinias, G. Tziritas","doi":"10.1109/AVSS.2007.4425365","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425365","url":null,"abstract":"A Bayesian, fully automatic moving object localization method is proposed, using inter-frame differences and background/foreground colour as discrimination cues. Change detection pixel classification to one of the labels \"changed\" or \"unchanged\" is obtained by mixture analysis, while histograms are used for statistical description of colours. High confidence, change detection based, statistical criteria are used to compute a map of initial labelled pixels. Finally, a region growing algorithm, which is named priority multi-label flooding algorithm, assigns pixels to labels using Bayesian dissimilarity criteria. Localization results on well-known benchmark image sequences as well as on webcam and compressed videos are presented.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132660008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Camera selection in visual sensor networks
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425290
S. Soro, W. Heinzelman
Wireless networks of visual sensors have recently emerged as a new type of sensor-based intelligent system, with performance and complexity challenges that go beyond those of existing wireless sensor networks. The goal of the visual sensor network we examine is to provide a user with visual information from any arbitrary viewpoint within the monitored field. This can be accomplished by synthesizing image data from a selection of cameras whose fields of view overlap with the desired field of view. In this work, we compare two methods for selecting the camera-nodes. The first method selects cameras that minimize the difference between the images provided by the selected cameras and the image that would be captured by a real camera from the desired viewpoint. The second method considers the energy limitations of the battery-powered camera-nodes, as well as their importance to the 3D coverage preservation task. Simulations using both metrics for camera-node selection show a clear trade-off between the quality of the reconstructed image and the network's ability to provide full coverage of the monitored 3D space for a longer period of time.
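A toy sketch of the trade-off studied here, ranking candidate camera-nodes by a score that mixes a reconstruction-quality term with an energy/coverage term; the field names, weights, and scoring function are assumptions rather than the paper's actual metrics.

```python
def select_cameras(cameras, k, alpha=0.5):
    """Rank candidate camera-nodes by a combined score and pick the top k.

    cameras: list of dicts with hypothetical fields
             'image_error' -- estimated difference between the camera's
                              contribution and the desired virtual view
             'energy'      -- remaining battery energy (0..1)
             'coverage'    -- importance for preserving 3-D coverage (0..1)
    alpha:   weight of reconstruction quality vs. network lifetime.
    """
    def score(cam):
        quality = 1.0 - cam["image_error"]                       # first metric
        lifetime = 0.5 * cam["energy"] + 0.5 * cam["coverage"]   # second metric
        return alpha * quality + (1.0 - alpha) * lifetime

    return sorted(cameras, key=score, reverse=True)[:k]

# Hypothetical usage with three candidate nodes.
cams = [
    {"id": 1, "image_error": 0.10, "energy": 0.20, "coverage": 0.9},
    {"id": 2, "image_error": 0.25, "energy": 0.80, "coverage": 0.5},
    {"id": 3, "image_error": 0.40, "energy": 0.95, "coverage": 0.7},
]
print([c["id"] for c in select_cameras(cams, k=2)])
```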
{"title":"Camera selection in visual sensor networks","authors":"S. Soro, W. Heinzelman","doi":"10.1109/AVSS.2007.4425290","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425290","url":null,"abstract":"Wireless networks of visual sensors have recently emerged as a new type of sensor-based intelligent system, with performance and complexity challenges that go beyond that of existing wireless sensor networks. The goal of the visual sensor network we examine is to provide a user with visual information from any arbitrary viewpoint within the monitored field. This can be accomplished by synthesizing image data from a selection of cameras whose fields of view overlap with the desired field of view. In this work, we compare two methods for the selection of the camera-nodes. The first method selects cameras that minimize the difference between the images provided by the selected cameras and the image that would be captured by a real camera from the desired viewpoint. The second method considers the energy limitations of the battery powered camera-nodes, as well as their importance in the 3D coverage preservation task. Simulations using both metrics for camera-node selection show a clear trade-off between the quality of the reconstructed image and the network's ability to provide full coverage of the monitored 3D space for a longer period of time.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133490943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real time detection of stopped vehicles in traffic scenes
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425321
A. Bevilacqua, Stefano Vaccari
Computer vision techniques are widely employed in Traffic Monitoring Systems (TMS) to automatically derive statistical information on traffic flow and to trigger alarms on significant events. Research in this field embraces a wide range of methods developed to recognize moving objects and to infer their behavior. Tracking systems are used to reconstruct the trajectories of moving objects, which are often detected using background-difference approaches. Errors in either motion detection or tracking can perturb the position of the object centroids used to build the trajectories. To cope with these unavoidable errors, we have conceived a method that detects centers of non-motion by recognizing short stability intervals. These are then connected to build the long stability interval used to measure the overall vehicle stopping time. Extensive experiments, including those on the sequences provided for AVSS 2007, prove the effectiveness of our approach in measuring the maximum stopped delay, including a comparison with the ground truth.
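A minimal sketch of detecting and connecting stability intervals from a noisy centroid track; thresholds, parameter names, and the merging rule are illustrative assumptions, not the authors' exact criteria.

```python
import numpy as np

def stopped_time(centroids, fps, eps=2.0, min_stable_s=1.0, max_gap_s=0.5):
    """Estimate how long a tracked object stays still from its centroid track.

    centroids:    (T, 2) array of per-frame centroid positions (may be noisy).
    eps:          max centroid displacement (pixels) still counted as non-motion.
    min_stable_s: minimum length of a short stability interval, in seconds.
    max_gap_s:    gaps shorter than this are bridged when connecting intervals.
    Returns the total stopped time in seconds.
    """
    still = np.linalg.norm(np.diff(centroids, axis=0), axis=1) < eps
    min_len, max_gap = int(min_stable_s * fps), int(max_gap_s * fps)

    # Collect short stability intervals as (start, end) frame indices.
    intervals, start = [], None
    for t, s in enumerate(still):
        if s and start is None:
            start = t
        elif not s and start is not None:
            if t - start >= min_len:
                intervals.append((start, t))
            start = None
    if start is not None and len(still) - start >= min_len:
        intervals.append((start, len(still)))

    # Connect intervals separated by small gaps into one long stability interval.
    merged = []
    for a, b in intervals:
        if merged and a - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], b)
        else:
            merged.append((a, b))
    return sum(b - a for a, b in merged) / fps
```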
{"title":"Real time detection of stopped vehicles in traffic scenes","authors":"A. Bevilacqua, Stefano Vaccari","doi":"10.1109/AVSS.2007.4425321","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425321","url":null,"abstract":"Computer vision techniques are widely employed in Traffic Monitoring Systems (TMS) to automatically derive statistical information on traffic flow and trigger alarms on significant events. Research in this field embraces a wide range of methods developed to recognize moving objects and to infer their behavior. Tracking systems are used to reconstruct trajectories of moving objects detected often by using background difference approaches. Errors in either motion detection or tracking can perturb the position of the object centroids used to build the trajectories. To cope with the unavoidable errors, we have conceived a method to detect centers of non-motion through recognizing short stability intervals. These are further connected to build the long stability interval used to measure the overall vehicle stopping time. Extensive experiments also accomplished on the sequences provided by AVSS 2007 prove the effectiveness of our approach to measure the maximum stopped delay, even through a comparison with the ground truth.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132336532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
View-invariant human feature extraction for video-surveillance applications
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425331
Grégory Rogez, J. J. Guerrero, C. Orrite-Uruñuela
We present a view-invariant human feature extractor (shape + pose) for pedestrian monitoring in man-made environments. Our approach can be divided into two steps: first, a series of view-based models is built by discretizing the viewpoint with respect to the camera into several training views. During the online stage, the homography that relates the image points to the closest and most adequate training plane is calculated using the dominant 3D directions. The input image is then warped to this training view and processed using the corresponding view-based model. After model fitting, the inverse transformation is applied to the resulting human features, yielding a segmented silhouette and a 2D pose estimate in the original input image. Experimental results demonstrate that our system performs well, independently of the direction of motion, when applied to monocular sequences with strong perspective effects.
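A sketch of the warp-and-map-back step using OpenCV, assuming point correspondences between the input image and the chosen training plane are already available (e.g. derived from the dominant 3D directions); function and variable names are assumptions, not the authors' code.

```python
import cv2
import numpy as np

def warp_to_training_view(image, scene_pts, training_pts, out_size):
    """Warp an input frame to the closest training view and return both the
    warped image and the inverse mapping (sketch of the rectification step).

    scene_pts, training_pts: (N, 2) float32 arrays of corresponding points
                             in the input image and on the training plane.
    out_size:                (width, height) of the rectified training view.
    """
    H, _ = cv2.findHomography(scene_pts, training_pts, cv2.RANSAC)
    warped = cv2.warpPerspective(image, H, out_size)
    return warped, np.linalg.inv(H)

def map_back(points, H_inv):
    """Map 2-D pose/silhouette points found in the training view back to the
    original image using the inverse homography."""
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H_inv).reshape(-1, 2)
```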
{"title":"View-invariant human feature extraction for video-surveillance applications","authors":"Grégory Rogez, J. J. Guerrero, C. Orrite-Uruñuela","doi":"10.1109/AVSS.2007.4425331","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425331","url":null,"abstract":"We present a view-invariant human feature extractor (shape+pose) for pedestrian monitoring in man-made environments. Our approach can be divided into 2 steps: firstly, a series of view-based models is built by discretizing the viewpoint with respect to the camera into several training views. During the online stage, the Homography that relates the image points to the closest and most adequate training plane is calculated using the dominant 3D directions. The input image is then warped to this training view and processed using the corresponding view-based model. After model fitting, the inverse transformation is performed on the resulting human features obtaining a segmented silhouette and a 2D pose estimation in the original input image. Experimental results demonstrate our system performs well, independently of the direction of motion, when it is applied to monocular sequences with high perspective effect.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134369886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}