Statistical Background Modeling: An Edge Segment Based Moving Object Detection Approach
M. Murshed, Adín Ramírez Rivera, O. Chae
Pub Date: 2010-10-07. DOI: 10.1109/AVSS.2010.18

We propose an edge-segment-based statistical background modeling algorithm and a moving edge detection framework for the detection of moving objects. We compare the performance of the proposed segment-based statistical background model against traditional pixel-based, edge-pixel-based, and edge-segment-based approaches. Existing edge-based moving object detection algorithms run into difficulty under background motion, changes in object shape, illumination variation, and noise. The proposed algorithm makes efficient use of a statistical background model built on the edge-segment structure. Experiments with natural image sequences show that our method can detect moving objects efficiently under these conditions.
Who, what, when, where, why and how in video analysis: an application centric view
S. Guler, Jason A. Silverstein, Ian A. Pushee, Xiang Ma, Ashutosh Morde
Pub Date: 2010-09-01. DOI: 10.1109/AVSS.2010.5767512

This paper presents an end-user, application-centric view of surveillance video analysis and describes a flexible, extensible, and modular approach to video content extraction. Various detection and extraction components, including tracking of moving objects, detection of text and faces, and face-based soft biometrics for gender, age, and ethnicity classification, are described within the general framework of the real-time and post-event analysis applications Panoptes and VideoRecall. Some end-user applications built on this framework are discussed.
SVM-Based Biometric Authentication Using Intra-Body Propagation Signals
I. Nakanishi, Yuuta Sodani
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.12

The use of intra-body propagation signals for biometric authentication has been proposed. These signals are hidden inside the human body and are therefore resistant to circumvention using artifacts. In addition, using signals inside the body enables liveness detection without any additional scheme. The problem, however, is that verification performance using intra-body propagation signals is not very high. In this paper, to improve that performance we propose using user-specific frequency bands for all users in verification, which improves the verification performance to 70%. Furthermore, we introduce the support vector machine (SVM) into the verification process and confirm that a verification rate of about 86% is achieved.
Intelligent Video Systems: A Review of Performance Evaluation Metrics that Use Mapping Procedures
X. Desurmont, C. Carincotte, F. Brémond
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.88

In Intelligent Video Systems, most recent advanced performance evaluation metrics include a stage that maps data between the system results and the ground truth. This paper reviews these metrics using a proposed framework, focusing on metrics for event detection, object detection, and object tracking systems.
Crowd Counting Using Group Tracking and Local Features
D. Ryan, S. Denman, C. Fookes, S. Sridharan
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.30

In public venues, crowd size is a key indicator of crowd safety and stability. In this paper we propose a crowd counting algorithm that uses tracking and local features to count the number of people in each group, as represented by a foreground blob segment, so that the total crowd estimate is the sum of the group sizes. Tracking is employed to improve the robustness of the estimate by analysing the history of each group, including splitting and merging events. A simplified ground truth annotation strategy results in an approach with minimal setup requirements that is highly accurate.
Local Directional Pattern (LDP) – A Robust Image Descriptor for Object Recognition
T. Jabid, M. H. Kabir, O. Chae
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.17

This paper presents a novel local feature descriptor, the Local Directional Pattern (LDP), for describing local image features. An LDP feature is obtained by computing the edge response values in all eight directions at each pixel position and generating a code from the relative strength magnitudes. Each bit of the code sequence is determined by considering a local neighborhood, and the code is hence robust in noisy situations. A rotation-invariant LDP code, which uses the direction of the most prominent edge response, is also introduced. Finally, an image descriptor is formed to describe the image (or image region) by accumulating the occurrences of LDP features over the whole input image (or image region). Experimental results on the Brodatz texture database show that LDP impressively outperforms other commonly used dense descriptors (e.g., Gabor-wavelet and LBP).
Pose Estimation of Interacting People using Pictorial Structures
P. Fihl, T. Moeslund
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.27

Pose estimation of people has made great progress in recent years, but so far research has dealt with single persons. In this paper we address some of the challenges that arise when estimating the pose of interacting people. We build on the pictorial structures framework and make important contributions by combining color-based appearance and edge information using a measure of the local quality of the appearance feature. In this way we not only combine the two types of features but dynamically find their optimal weighting. We further enable the method to handle occlusions by searching a foreground mask for possibly occluded body parts and then applying extra-strong kinematic constraints to find the truly occluded body parts. The effect of our two contributions is shown through both qualitative and quantitative tests, which demonstrate a clear improvement in the ability to correctly localize body parts.
Traffic Abnormality Detection through Directional Motion Behavior Map
Nan Dong, Zhen Jia, Jie Shao, Ziyou Xiong, Zhi-peng Li, Fuqiang Liu, Jianwei Zhao, Pei-Yuan Peng
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.61

Automatic traffic abnormality detection through visual surveillance is one of the critical requirements for Intelligent Transportation Systems (ITS). In this paper, we present a novel algorithm to detect abnormal traffic events in crowded scenes. Our algorithm can be deployed with few setup steps to automatically monitor traffic status. Unlike other approaches, we need neither to define regions of interest (ROI) or tripwires nor to configure object detection and tracking parameters. A novel object behavior descriptor, the directional motion behavior descriptor, is proposed. Directional motion behavior descriptors collect foreground objects' direction and speed information from a video sequence containing normal traffic events; these descriptors are then accumulated to generate a directional motion behavior map, which models the normal traffic status. During detection, we first extract the directional motion behavior map from the newly observed video and then measure the differences between the normal behavior map and the new map. If new directional motion behaviors are very different from the descriptors in the normal behavior map, the corresponding regions in the observed video contain traffic abnormalities. Our proposed algorithm has been tested on both synthesized and real surveillance videos. Experimental results demonstrate that our algorithm is effective and efficient for practical real-time traffic surveillance applications.
MuHAVi: A Multicamera Human Action Video Dataset for the Evaluation of Action Recognition Methods
Sanchit Singh, S. Velastín, Hossein Ragheb
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.63

This paper describes a body of multicamera human action video data, with manually annotated silhouette data, that has been generated for the purpose of evaluating silhouette-based human action recognition methods. It provides a realistic challenge to both the segmentation and human action recognition communities and can act as a benchmark to objectively compare proposed algorithms. The public multi-camera, multi-action dataset is an improvement over existing datasets (e.g. PETS, CAVIAR, the soccer dataset) that were not developed specifically for human action recognition, and it complements other action recognition datasets (KTH, Weizmann, IXMAS, HumanEva, CMU Motion). It consists of 17 action classes, 14 actors, and 8 cameras. Each actor performs an action several times in the action zone. The paper describes the dataset and illustrates a possible approach to algorithm evaluation using a previously published simple action recognition method. In addition to showing an evaluation methodology, these results establish a baseline for other researchers to improve upon.
Multi-Camera Analysis of Soccer Sequences
C. Poppe, S. D. Bruyne, S. Verstockt, R. Walle
Pub Date: 2010-08-29. DOI: 10.1109/AVSS.2010.64

The automatic detection of meaningful phases in a soccer game depends on the accurate localization of the players and the ball at each moment. However, the automatic analysis of soccer sequences is a challenging task due to the presence of multiple fast-moving objects. For this purpose, we present a multi-camera analysis system that yields the positions of the ball and players on a common ground plane. Detection in each camera is based on a codebook algorithm, and different features are used to classify the detected blobs. The detection results of each camera are transformed, using a homography, to a virtual top view of the playing field. Within this virtual top view we merge trajectory information from the different cameras, allowing the found positions to be refined. In this paper we evaluate the system on the public SOCCER dataset and end with a discussion of possible improvements to the dataset.