Video analytics for retail
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425348
A. Senior, L. Brown, A. Hampapur, Chiao-Fe Shu, Y. Zhai, R. Feris, Ying-li Tian, S. Borger, Christopher R. Carlson
We describe a set of tools for retail analytics based on a combination of video understanding and transaction-log data. Tools are provided for loss prevention (returns fraud and cashier fraud), store operations (customer counting), and merchandising (display effectiveness). Results are presented on returns fraud and customer counting.
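As a concrete illustration of the combination the abstract describes, here is a minimal sketch of cross-checking transaction-log events against video-derived evidence. The record layout, field names, and 30-second slack window are assumptions for illustration, not the paper's implementation:

```python
from datetime import datetime, timedelta

# Hypothetical records: POS return transactions and video-derived
# intervals during which a customer was present at the returns counter.
returns = [
    {"txn_id": "R1001", "time": datetime(2007, 9, 5, 10, 15, 30)},
    {"txn_id": "R1002", "time": datetime(2007, 9, 5, 11, 2, 10)},
]
presence = [
    (datetime(2007, 9, 5, 10, 15, 0), datetime(2007, 9, 5, 10, 17, 0)),
]

def flag_suspicious_returns(returns, presence, slack=timedelta(seconds=30)):
    """Flag returns logged when no customer was detected at the counter."""
    flagged = []
    for txn in returns:
        t = txn["time"]
        if not any(start - slack <= t <= end + slack for start, end in presence):
            flagged.append(txn["txn_id"])
    return flagged

print(flag_suspicious_returns(returns, presence))  # ['R1002']
```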
{"title":"Video analytics for retail","authors":"A. Senior, L. Brown, A. Hampapur, Chiao-Fe Shu, Y. Zhai, R. Feris, Ying-li Tian, S. Borger, Christopher R. Carlson","doi":"10.1109/AVSS.2007.4425348","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425348","url":null,"abstract":"We describe a set of tools for retail analytics based on a combination of video understanding and transaction-log. Tools are provided for loss prevention (returns fraud and cashier fraud), store operations (customer counting) and merchandising (display effectiveness). Results are presented on returns fraud and customer counting.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126740364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic people detection and counting for athletic videos classification
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425349
C. Panagiotakis, E. Ramasso, G. Tziritas, M. Rombaut, D. Pellerin
We propose a general framework that focuses on automatic individual/multiple-people motion-shape analysis and on the extraction of suitable features for action/activity recognition problems in real, dynamic, and unconstrained environments. We have considered various athletic videos from a single uncalibrated, possibly moving camera in order to evaluate the robustness of the proposed method, and we use an easily extended hierarchical scheme to classify them into videos of individual and team sports. The proposed features, which are robust, adaptive, and independent of camera motion, are combined within the Transferable Belief Model (TBM) framework, providing a two-level (frame and shot) video categorization. Experimental results of 97% individual/team sport categorization accuracy on a dataset of more than 250 videos of athletic meetings indicate the good performance of the proposed scheme.
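The step from frame-level evidence to a shot-level decision can be illustrated with the standard unnormalized conjunctive rule of the Transferable Belief Model. The per-frame mass values below are invented, and the paper's actual features and combination details may differ; this is only a sketch of the fusion idea:

```python
from functools import reduce

# Frame of discernment {I: individual sport, T: team sport}; masses live on
# its subsets (frozensets). frozenset() is the empty set, i.e. conflict.
OMEGA = frozenset({"I", "T"})

def conjunctive(m1, m2):
    """Unnormalized conjunctive rule of combination (TBM-style)."""
    out = {}
    for b, mb in m1.items():
        for c, mc in m2.items():
            a = b & c
            out[a] = out.get(a, 0.0) + mb * mc
    return out

# Per-frame mass functions (values invented for illustration).
frames = [
    {frozenset({"I"}): 0.6, frozenset({"T"}): 0.1, OMEGA: 0.3},
    {frozenset({"I"}): 0.5, frozenset({"T"}): 0.2, OMEGA: 0.3},
    {frozenset({"I"}): 0.7, frozenset({"T"}): 0.1, OMEGA: 0.2},
]

# Shot-level mass: combine frame-level evidence, then label the shot with
# the singleton hypothesis that carries the larger mass.
shot = reduce(conjunctive, frames)
print({tuple(sorted(s)): round(m, 3) for s, m in shot.items()})
```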
{"title":"Automatic people detection and counting for athletic videos classification","authors":"C. Panagiotakis, E. Ramasso, G. Tziritas, M. Rombaut, D. Pellerin","doi":"10.1109/AVSS.2007.4425349","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425349","url":null,"abstract":"We propose a general framework that focuses on automatic individual/multiple people motion-shape analysis and on suitable features extraction that can be used on action/activity recognition problems under real, dynamical and unconstrained environments. We have considered various athletic videos from a single uncalibrated, possibly moving camera in order to evaluate the robustness of the proposed method. We have used an easily expanded hierarchical scheme in order to classify them to videos of individual and team sports. Robust, adaptive and independent from the camera motion, the proposed features are combined within Transferable Belief Model (TBM) framework providing a two level (frames and shot) video categorization. The experimental results of 97% individual/team sport categorization accuracy, using a dataset of more than 250 videos of athletic meetings indicate the good performance of the proposed scheme.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122667984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
2D and 3D face localization for complex scenes
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425339
Ghassan O. Karame, A. Stergiou, N. Katsarakis, Panagiotis Papageorgiou, Aristodemos Pnevmatikakis
In this paper, we address face tracking of multiple people in complex 3D scenes, using multiple calibrated and synchronized far-field recordings. We localize faces in every camera view and associate them across the different views. To cope with the complexity of 2D face localization introduced by the multitude of people and unconstrained face poses, we utilize a combination of stochastic and deterministic trackers, detectors, and a Gaussian mixture model for face validation. Faces of the same person seen from the different cameras are then associated by first finding all possible associations and then choosing the best option by means of a 3D stochastic tracker. The performance of the proposed system is evaluated and found to improve on that of existing systems.
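Cross-view associations of this kind are commonly scored by triangulating a candidate pair of 2D detections and measuring the reprojection error. Below is a minimal sketch of that scoring, with toy camera matrices standing in for the calibrated rig; it is an illustration of the geometry, not the authors' code:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two calibrated views."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def reprojection_error(P, X, x):
    """Pixel distance between an observed 2D face location and the projected
    3D point; a low error supports associating the two detections."""
    proj = P @ np.append(X, 1.0)
    return float(np.linalg.norm(proj[:2] / proj[2] - x))

# Toy rig: identity intrinsics, second camera shifted along the x-axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 5.0])
x1 = (P1 @ np.append(X_true, 1))[:2] / (P1 @ np.append(X_true, 1))[2]
x2 = (P2 @ np.append(X_true, 1))[:2] / (P2 @ np.append(X_true, 1))[2]
X = triangulate(P1, P2, x1, x2)
print(X, reprojection_error(P1, X, x1))  # ~[0.5 0.2 5.0], ~0.0
```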
{"title":"2D and 3D face localization for complex scenes","authors":"Ghassan O. Karame, A. Stergiou, N. Katsarakis, Panagiotis Papageorgiou, Aristodemos Pnevmatikakis","doi":"10.1109/AVSS.2007.4425339","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425339","url":null,"abstract":"In this paper, we address face tracking of multiple people in complex 3D scenes, using multiple calibrated and synchronized far-field recordings. We localize faces in every camera view and associate them across the different views. To cope with the complexity of 2D face localization introduced by the multitude of people and unconstrained face poses, a combination of stochastic and deterministic trackers, detectors and a Gaussian mixture model for face validation are utilized. Then faces of the same person seen from the different cameras are associated by first finding all possible associations and then choosing the best option by means of a 3D stochastic tracker. The performance of the proposed system is evaluated and is found enhanced compared to existing systems.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126145110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated 3D Face authentication & recognition
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425284
M. Bae, A. Razdan, G. Farin
This paper presents a fully automated 3D face authentication (verification) and recognition (identification) method and recent results from our work in this area. The major contributions of our paper are: (a) the method can handle data with different facial expressions as well as extraneous regions such as hair, upper body, and clothing, and (b) the development of weighted features for discrimination. The input to our system is a triangular mesh, and it outputs a match percentage against a gallery. Our method includes both surface- and curve-based features that are automatically extracted from the given face data. The test set for authentication consisted of 117 different people with 421 scans, including different facial expressions. Our study shows an equal error rate (EER) of 0.065% for normal faces and 1.13% for faces with expressions. We report verification rates of 100% on normal faces and 93.12% on faces with expressions at 0.1% FAR. For identification, our experiment shows a 100% rate on normal faces and 95.6% on faces with expressions. From our experiments we conclude that combining feature points, the profile curve, and partial face surface matching gives better authentication and recognition rates than any single matching method.
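One simple way to realize the weighted combination of the three matchers is a normalized weighted sum of per-matcher similarity scores. The scores and weights below are hypothetical, and the paper's actual weighting scheme may differ:

```python
def fused_score(scores, weights):
    """Normalized weighted sum of per-matcher similarity scores in [0, 1]."""
    return sum(scores[k] * weights[k] for k in scores) / sum(weights.values())

# Hypothetical similarities for one probe/gallery pair from the three
# matchers the abstract names, with invented weights.
scores = {"feature_points": 0.91, "profile_curve": 0.84, "face_surface": 0.88}
weights = {"feature_points": 0.40, "profile_curve": 0.25, "face_surface": 0.35}

fused = fused_score(scores, weights)
print(round(fused, 3))  # accept if fused >= a threshold tuned to the target FAR
```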
{"title":"Automated 3D Face authentication & recognition","authors":"M. Bae, A. Razdan, G. Farin","doi":"10.1109/AVSS.2007.4425284","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425284","url":null,"abstract":"This paper presents a fully automated 3D face authentication (verification) and recognition (identification) method and recent results from our work in this area. The major contributions of our paper are: (a) the method can handle data with different facial expressions including hair, upper body, clothing, etc. and (b) development of weighted features for discrimination. The input to our system is a triangular mesh and it outputs a matching % against a gallery. Our method includes both surface and curve based features that are automatically extracted from a given face data. The test set for authentication consisted of 117 different people with 421 scans including different facial expressions. Our study shows equal error rate (EER) at 0.065% for normal faces and 1.13% in faces with expressions. We report verification rates of 100% in normal faces and 93.12% in faces with expressions at 0.1% FAR. For identification, our experiment shows 100% rate in normal faces and 95.6% in faces with expressions. From our experiment we conclude that combining feature points, profile curve, and partial face surface matching gives better authentication and recognition rate than any single matching method.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131266051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Technology, applications and innovations in physical security - A home office perspective
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425275
A. Coleman
Summary form only given. This overview talk will first introduce the Home Office Scientific Development Branch (HOSDB) as an organisation and then offer a summary of our programmes in the physical security sector. The talk will explain, through a series of examples, how HOSDB is contributing to protection and law enforcement. In the second part, the talk will focus on vision-based systems and on HOSDB initiatives around this technology. I will provide a strategic view of initiatives aimed at stimulating innovation in industry and academic research, then cover our initiatives in benchmarking and in video evidence analysis. Finally, I will provide an overview of future technology trends from the HOSDB perspective.
{"title":"Technology, applications and innovations in physical security - A home office perspective","authors":"A. Coleman","doi":"10.1109/AVSS.2007.4425275","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425275","url":null,"abstract":"Summary form only given. This overview talk will first introduce the Home Office Scientific Development Branch (HOSDB) as organisation and then will offer a summary of our programmes in the area of the physical security sector. The talk will explain how HOSDB is contributing to protection and law enforcement. I will use a series of examples to cover this area. In the second part, the talk shall focus on vision based systems and on HOSDB initiatives on this technology. I will provide a strategic view of initiatives aimed to cause innovation in the industry and academic research. I will then cover our initiatives in bench marking and in video evidence analysis. Finally, I will provide an overview of future technology trends from the HOSDB perspective.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123845732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video verification of point of sale transactions
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425346
P. L. Venetianer, Zhong Zhang, Andrew W. Scanlon, Yongtong Hu, A. Lipton
Loss prevention is a significant challenge in retail enterprises, and a significant percentage of this loss occurs at point of sale (POS) terminals. POS data mining tools, known collectively as exception based reporting (EBR), are helping retailers, but they are limited to working statistically on trends and anomalies in digital POS data. By applying video analytics techniques to POS transactions, it is possible to detect fraudulent or anomalous activity at the level of individual transactions: very specific fraudulent behaviors that cannot be detected via POS data alone become clear when that data is combined with video-derived data. ObjectVideo, a provider of intelligent video software, has produced a system called RetailWatch that combines POS information with video data to create a unique loss prevention tool. This paper describes the system architecture, algorithmic approach, and capabilities of the system, together with a customer case study illustrating its results and effectiveness.
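A minimal sketch of one video/POS cross-check such a system could perform: comparing the number of scan-like motions seen on video with the number of items registered by the POS. The counts, tolerance, and record layout are assumptions for illustration, not RetailWatch's implementation:

```python
# Hypothetical per-transaction counts: items registered by the POS versus
# scan-like motions detected by video analytics at the same lane.
transactions = [
    {"txn_id": "T001", "pos_items": 12, "video_items": 12},
    {"txn_id": "T002", "pos_items": 5,  "video_items": 9},   # items passed, not scanned?
    {"txn_id": "T003", "pos_items": 8,  "video_items": 7},   # within tolerance
]

def flag_mismatches(transactions, tolerance=1):
    """Flag transactions where video sees more items cross the scanner
    than the POS registered (possible pass-through / fake scanning)."""
    return [t["txn_id"] for t in transactions
            if t["video_items"] - t["pos_items"] > tolerance]

print(flag_mismatches(transactions))  # ['T002']
```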
{"title":"Video verification of point of sale transactions","authors":"P. L. Venetianer, Zhong Zhang, Andrew W. Scanlon, Yongtong Hu, A. Lipton","doi":"10.1109/AVSS.2007.4425346","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425346","url":null,"abstract":"Loss prevention is a significant challenge in retail enterprises. A significant percentage of this loss occurs at point of sale (POS) terminals. POS data mining tools known collectively as exception based reporting (EBR) are helping retailers, but they have limitations as they can only work statistically on trends and anomalies in digital POS data. By applying video analytics techniques to POS transactions, it is possible to detect fraudulent or anomalous activity at the level of individual transactions. Very specific fraudulent behaviors that cannot be detected via POS data alone become clear when combined with video-derived data. ObjectVideo, a provider of intelligent video software, has produced a system called RetailWatch that combines POS information with video data to create a unique loss prevention tool. This paper describes the system architecture, algorithmic approach, and capabilities of the system, together with a customer case-study illustrating the results and effectiveness of the system.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115844018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recognition through constructing the Eigenface classifiers using conjugation indices
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425355
V. Fursov, Nikita Kozin
Principal component analysis (PCA), known in face recognition as the eigenfaces method, is one of the most extensively used face image recognition techniques. The idea of the method is to decompose image vectors over the system of eigenvectors corresponding to the largest eigenvalues. The proximity measure used to compare vectors of principal components strongly influences recognition quality. In this paper, the use of different indices of conjugation with the subspace spanned by the training vectors is considered as a proximity measure. It is shown that this approach is very effective when the number of training examples is small. Results of experiments on the standard ORL face database are presented.
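One plausible reading of a conjugation index is the cosine of the angle between a probe vector and the subspace spanned by a class's training vectors. The sketch below implements that reading on top of an eigenface projection, with synthetic data; it illustrates the idea only and is not the paper's exact measure:

```python
import numpy as np

def pca_basis(X, k):
    """Top-k principal components (eigenfaces) of row-stacked image vectors."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def subspace_proximity(probe, class_vectors):
    """Cosine of the angle between a probe and the subspace spanned by a
    class's training vectors (one reading of a 'conjugation index')."""
    Q, _ = np.linalg.qr(class_vectors.T)      # orthonormal basis of the span
    proj = Q @ (Q.T @ probe)
    return float(np.linalg.norm(proj) / np.linalg.norm(probe))

# Synthetic demo: 10 "images" of 100 pixels, only 5 training examples per class.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(10, 100))
mean, basis = pca_basis(gallery, k=8)
project = lambda v: basis @ (v - mean)
classes = {"A": np.stack([project(g) for g in gallery[:5]]),
           "B": np.stack([project(g) for g in gallery[5:]])}
probe = project(gallery[0] + 0.05 * rng.normal(size=100))
print(max(classes, key=lambda c: subspace_proximity(probe, classes[c])))  # 'A'
```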
{"title":"Recognition through constructing the Eigenface classifiers using conjugation indices","authors":"V. Fursov, Nikita Kozin","doi":"10.1109/AVSS.2007.4425355","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425355","url":null,"abstract":"The principal component analysis (PCA), also called the eigenfaces analysis, is one of the most extensively used face image recognition techniques. The idea of the method is decomposition of image vectors into a system of eigenvectors matched to the maximum eigenvalues. The method of proximity assessment of vectors composed of principal components essentially influences the recognition quality. In the paper the use of different indices of conjugation with subspace stretched on training vectors is considered as a proximity measure. It is shown that this approach is very effective in the case of a small number of training examples. The results of experiments for a standard ORL-face database are presented.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116885296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recovering the linguistic components of the manual signs in American Sign Language
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425352
Liya Ding, Aleix M. Martinez
Manual signs in American Sign Language (ASL) are constructed using three building blocks: handshape, motion, and place of articulation. Only when all three are successfully estimated can a sign be uniquely identified; hence, pattern recognition techniques that use only a subset of these components are inappropriate. To achieve accurate classification, the motion, the handshape, and their three-dimensional position need to be recovered. In this paper, we define an algorithm to determine these three components from a single video sequence of two-dimensional pictures of a sign. We demonstrate the use of our algorithm in describing and recognizing a set of manual signs in ASL.
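The abstract's central claim, that a sign is uniquely determined only by all three components jointly, can be made concrete with a toy lexicon keyed on the full triple. The component labels and entries below are invented for illustration and are not real ASL glosses:

```python
# Toy lexicon keyed on the (handshape, motion, place) triple; note that any
# single component may be ambiguous while the triple is not.
LEXICON = {
    ("flat-B", "arc-down", "chin"):  "SIGN-A",
    ("flat-B", "arc-down", "torso"): "SIGN-B",   # same handshape and motion
    ("index-G", "tap", "temple"):    "SIGN-C",
}

def identify_sign(handshape, motion, place):
    """Identify a sign only when all three recovered components match."""
    return LEXICON.get((handshape, motion, place), "UNKNOWN")

print(identify_sign("flat-B", "arc-down", "chin"))   # SIGN-A
print(identify_sign("flat-B", "arc-down", "cheek"))  # UNKNOWN
```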
{"title":"Recovering the linguistic components of the manual signs in American Sign Language","authors":"Liya Ding, Aleix M. Martinez","doi":"10.1109/AVSS.2007.4425352","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425352","url":null,"abstract":"Manual signs in American sign language (ASL) are constructed using three building blocks -handshape, motion, and place of articulations. Only when these three are successfully estimated, can a sign by uniquely identified. Hence, the use of pattern recognition techniques that use only a subset of these is inappropriate. To achieve accurate classifications, the motion, the handshape and their three-dimensional position need to be recovered. In this paper, we define an algorithm to determine these three components form a single video sequence of two-dimensional pictures of a sign. We demonstrated the use of our algorithm in describing and recognizing a set of manual signs in ASL.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115734475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Searching surveillance video
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425289
A. Hampapur, L. Brown, R. Feris, A. Senior, Chiao-Fe Shu, Ying-li Tian, Y. Zhai, M. Lu
Surveillance video is used in two key modes: watching for known threats in real time and searching for events of interest after the fact. Typically, real-time alerting is a localized function, e.g., an airport security center receives and reacts to a "perimeter breach" alert, while investigations often encompass a large number of geographically distributed cameras, as in the London bombing or Washington sniper incidents. Enabling effective search of surveillance video for investigation and preemption involves indexing the video along multiple dimensions. This paper presents a framework for surveillance search that includes video parsing, indexing, and query mechanisms. It explores video parsing techniques that automatically extract index data from video, indexing that stores the data in relational tables, retrieval that uses SQL queries to find events of interest, and the software architecture that integrates these technologies.
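A minimal sketch of the relational indexing and SQL retrieval the paper describes, using an in-memory SQLite database; the table schema and event vocabulary are assumptions for illustration:

```python
import sqlite3

# Video parsing writes event rows into a relational table; investigators
# then retrieve events of interest with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE events (
    camera_id TEXT, object_id INTEGER, event_type TEXT,
    start_ts REAL, end_ts REAL, x REAL, y REAL)""")

# Rows a parsing stage might emit (values invented for illustration).
conn.executemany("INSERT INTO events VALUES (?,?,?,?,?,?,?)", [
    ("cam-03", 17, "enter_region", 1189000000.0, 1189000004.5, 12.0, 40.0),
    ("cam-07", 17, "loitering",    1189000300.0, 1189000420.0, 55.0, 10.0),
    ("cam-07", 21, "enter_region", 1189000950.0, 1189000953.0, 60.0, 12.0),
])

# Retrieval: all loitering events across cameras within a time window.
rows = conn.execute(
    "SELECT camera_id, object_id, start_ts FROM events "
    "WHERE event_type = ? AND start_ts BETWEEN ? AND ?",
    ("loitering", 1189000000.0, 1189001000.0)).fetchall()
print(rows)  # [('cam-07', 17, 1189000300.0)]
```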
{"title":"Searching surveillance video","authors":"A. Hampapur, L. Brown, R. Feris, A. Senior, Chiao-Fe Shu, Ying-li Tian, Y. Zhai, M. Lu","doi":"10.1109/AVSS.2007.4425289","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425289","url":null,"abstract":"Surveillance video is used in two key modes, watching for known threats in real-time and searching for events of interest after the fact. Typically, real-time alerting is a localized function, e.g. airport security center receives and reacts to a \"perimeter breach alert\", while investigations often tend to encompass a large number of geographically distributed cameras like the London bombing, or Washington sniper incidents. Enabling effective search of surveillance video for investigation & preemption, involves indexing the video along multiple dimensions. This paper presents a framework for surveillance search which includes, video parsing, indexing and query mechanisms. It explores video parsing techniques which automatically extract index data from video, indexing which stores data in relational tables, retrieval which uses SQL queries to retrieve events of interest and the software architecture that integrates these technologies.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114556775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stationary target detection using the objectvideo surveillance system
Pub Date: 2007-09-05 | DOI: 10.1109/AVSS.2007.4425317
P. L. Venetianer, Zhong Zhang, Weihong Yin, A. Lipton
Detecting stationary objects, such as abandoned baggage or a parked vehicle, is crucial in a wide range of video surveillance and monitoring applications. ObjectVideo, the leader in intelligent video software, has been deploying commercial products to address these problems for the last five years. The ObjectVideo VEW and OnBoard systems address these problems using an array of algorithms that are optimized for various scenario types and can be selected dynamically. This paper describes the key challenges and algorithms, and presents results on the standard i-LIDS dataset.
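A common basis for stationary-target detection is tracking how long each pixel stays foreground while no longer changing frame-to-frame. Below is a minimal per-pixel sketch under that assumption; the thresholds are invented and this is not ObjectVideo's algorithm:

```python
import numpy as np

def update_stationary_map(age, background, prev, curr, diff_thr=25, still_thr=5):
    """Per-pixel age of 'foreground that has stopped moving': increment
    where the frame differs from the background model but no longer changes
    frame-to-frame; reset everywhere else."""
    foreground = np.abs(curr.astype(int) - background.astype(int)) > diff_thr
    still = np.abs(curr.astype(int) - prev.astype(int)) <= still_thr
    return np.where(foreground & still, age + 1, 0)

# Toy sequence: a bright "bag" appears and stays put for 10 frames.
background = np.zeros((4, 4), dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200
age = np.zeros((4, 4), dtype=int)
for _ in range(10):
    age = update_stationary_map(age, background, frame, frame)
print(age.max())  # 10 -> alert once age exceeds the configured dwell time
```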
{"title":"Stationary target detection using the objectvideo surveillance system","authors":"P. L. Venetianer, Zhong Zhang, Weihong Yin, A. Lipton","doi":"10.1109/AVSS.2007.4425317","DOIUrl":"https://doi.org/10.1109/AVSS.2007.4425317","url":null,"abstract":"Detecting stationary objects, such as an abandoned baggage or a parked vehicle is crucial in a wide range of video surveillance and monitoring applications. ObjectVideo, the leader in intelligent video software has been deploying commercial products to address these problems for the last 5 years. The ObjectVideo VEW and OnBoard system addresses these problems using an array of algorithms optimized for various scenario types and can be selected dynamically. This paper describes the key challenges and algorithms, and presents results on the standard i-LIDS dataset.","PeriodicalId":371050,"journal":{"name":"2007 IEEE Conference on Advanced Video and Signal Based Surveillance","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128109491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}