Jun Yang, Yang Wang, A. Sowmya, Bang Zhang, Jie Xu, Zhidong Li
In this paper, we investigate the applicability of the newly proposed data clustering method, affinity propagation, in feature points clustering and the task of vehicle detection and tracking in road traffic surveillance. We propose a model-based temporal association scheme and novel preprocessing and postprocessing operations which together with affinity propagation make a quite successful method for the given task. Our experiments demonstrate the effectiveness and efficiency of our method and its superiority over the state-of-the-art algorithm.
{"title":"Affinity Propagation Feature Clustering with Application to Vehicle Detection and Tracking in Road Traffic Surveillance","authors":"Jun Yang, Yang Wang, A. Sowmya, Bang Zhang, Jie Xu, Zhidong Li","doi":"10.1109/AVSS.2010.40","DOIUrl":"https://doi.org/10.1109/AVSS.2010.40","url":null,"abstract":"In this paper, we investigate the applicability of the newly proposed data clustering method, affinity propagation, in feature points clustering and the task of vehicle detection and tracking in road traffic surveillance. We propose a model-based temporal association scheme and novel preprocessing and postprocessing operations which together with affinity propagation make a quite successful method for the given task. Our experiments demonstrate the effectiveness and efficiency of our method and its superiority over the state-of-the-art algorithm.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131082374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
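The abstract above names affinity propagation as the clustering step but does not give its details. As a rough illustration only, the sketch below implements the generic message-passing updates of affinity propagation (responsibilities and availabilities with damping) on a handful of 2D points. The function names, the damping value, the iteration count, and the use of negative squared distance with a median preference are assumptions for illustration; the authors' preprocessing, postprocessing, and temporal association are not reproduced.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar it selects.

    S is an n x n similarity matrix (n >= 2); S[k][k] is the preference of k.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) <- s(i,k) - max over j != k of [a(i,j) + s(i,j)]
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(k,k) <- sum over i' != k of max(0, r(i',k))
        # a(i,k) <- min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # Each point picks the k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


def similarity_matrix(points):
    """Negative squared Euclidean distance; the median is the shared preference."""
    n = len(points)
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for i in range(n):
        S[i][i] = pref
    return S


# Hypothetical feature points: two well-separated groups.
points = [(0, 0), (1, 0), (3, 1), (30, 30), (32, 31), (33, 30)]
labels = affinity_propagation(similarity_matrix(points))
```

The number of clusters is not passed in; it follows from the preference values on the diagonal, which is one reason the method is attractive when the number of vehicles in a frame is unknown.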
We present a background subtraction approach aimed at efficiency and accuracy, also in the presence of common sources of disturbance such as illumination changes, camera gain and exposure variations, and noise. The novelty of the proposal relies on a-priori modeling of the local effect of disturbances on small neighborhoods of pixel intensities as a monotonic, homogeneous, second-degree polynomial transformation plus additive Gaussian noise. This allows for classifying pixels as changed or unchanged by an efficient inequality-constrained least-squares fitting procedure. Experiments prove that the approach is state-of-the-art in terms of efficiency-accuracy tradeoff on challenging sequences characterized by disturbances yielding sudden and strong variations of the background appearance.
{"title":"Accurate and Efficient Background Subtraction by Monotonic Second-Degree Polynomial Fitting","authors":"A. Lanza, Federico Tombari, L. D. Stefano","doi":"10.1109/AVSS.2010.45","DOIUrl":"https://doi.org/10.1109/AVSS.2010.45","url":null,"abstract":"We present a background subtraction approach aimed at efficiency and accuracy, also in the presence of common sources of disturbance such as illumination changes, camera gain and exposure variations, and noise. The novelty of the proposal relies on a-priori modeling of the local effect of disturbances on small neighborhoods of pixel intensities as a monotonic, homogeneous, second-degree polynomial transformation plus additive Gaussian noise. This allows for classifying pixels as changed or unchanged by an efficient inequality-constrained least-squares fitting procedure. Experiments prove that the approach is state-of-the-art in terms of efficiency-accuracy tradeoff on challenging sequences characterized by disturbances yielding sudden and strong variations of the background appearance.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134592237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
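To make the model in the abstract above concrete, the sketch below fits a homogeneous second-degree polynomial (no constant term) from background intensities to current intensities over a small patch, then flags the patch as changed when the fit is non-monotonic or leaves a large residual. This is a simplification and not the authors' procedure: the paper solves an inequality-constrained least-squares problem, whereas this sketch uses an unconstrained fit followed by a monotonicity check. The function names, the threshold, and the sample patches are hypothetical.

```python
import math


def fit_homogeneous_quadratic(bg, cur):
    """Unconstrained least-squares fit of cur ~ a*bg + b*bg**2 (no constant term)."""
    s11 = sum(x * x for x in bg)
    s12 = sum(x ** 3 for x in bg)
    s22 = sum(x ** 4 for x in bg)
    t1 = sum(x * y for x, y in zip(bg, cur))
    t2 = sum(x * x * y for x, y in zip(bg, cur))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    if abs(det) < 1e-12:
        raise ValueError("background patch has too little intensity variation")
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return a, b


def is_changed(bg_patch, cur_patch, threshold=0.05, levels=255.0):
    """Classify a patch as changed (True) or unchanged (False).

    threshold is a hypothetical RMS residual limit on intensities scaled to [0, 1].
    """
    bg = [v / levels for v in bg_patch]
    cur = [v / levels for v in cur_patch]
    a, b = fit_homogeneous_quadratic(bg, cur)
    # The derivative a + 2*b*x is linear in x, so checking x = 0 and x = 1 suffices.
    monotonic = a >= 0.0 and a + 2.0 * b >= 0.0
    rms = math.sqrt(
        sum((y - (a * x + b * x * x)) ** 2 for x, y in zip(bg, cur)) / len(bg)
    )
    return (not monotonic) or rms > threshold


background = [50, 60, 70, 80, 90, 100, 110, 120, 130]
brighter = [1.2 * v for v in background]              # pure gain change
occluded = [200, 30, 180, 20, 210, 40, 190, 25, 205]  # unrelated content
```

A pure gain change is explained exactly by the polynomial, so the residual stays near zero; content that does not preserve the intensity ordering of the background cannot be fitted and is reported as changed.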
Battery-powered wireless embedded smart cameras have limited processing power, memory and energy. Since video processing tasks consume a significant amount of power, the problem of limited resources becomes even more pronounced, and necessitates designing light-weight algorithms suitable for embedded platforms. In this paper, we present a resource-efficient salient foreground detection and tracking algorithm. Contrary to traditional methods that implement foreground object detection and tracking independently and in a sequential manner, the proposed method uses the feedback from the tracking stage in the foreground object detection. We compare the proposed method with a sequential method on the microprocessor of an embedded smart camera, and present the savings in the processing time and energy consumption and the gain in the lifetime of a battery-powered camera for different scenarios. The presented method provides significant savings in terms of the processing time of a frame. We take advantage of these savings by sending the microprocessor to idle state at the end of processing a frame, and when the scene is empty.
{"title":"Resource-Efficient Salient Foreground Detection for Embedded Smart Cameras by Tracking Feedback","authors":"Mauricio Casares, Senem Velipasalar","doi":"10.1109/AVSS.2010.50","DOIUrl":"https://doi.org/10.1109/AVSS.2010.50","url":null,"abstract":"Battery-powered wireless embedded smart cameras have limited processing power, memory and energy. Since video processing tasks consume a significant amount of power, the problem of limited resources becomes even more pronounced, and necessitates designing light-weight algorithms suitable for embedded platforms. In this paper, we present a resource-efficient salient foreground detection and tracking algorithm. Contrary to traditional methods that implement foreground object detection and tracking independently and in a sequential manner, the proposed method uses the feedback from the tracking stage in the foreground object detection. We compare the proposed method with a sequential method on the microprocessor of an embedded smart camera, and present the savings in the processing time and energy consumption and the gain in the lifetime of a battery-powered camera for different scenarios. The presented method provides significant savings in terms of the processing time of a frame. We take advantage of these savings by sending the microprocessor to idle state at the end of processing a frame, and when the scene is empty.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"26 30","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132273747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vision algorithms face many challenging issues when it comes to analyzing human activities in video surveillance applications. For instance, occlusions make the detection and tracking of people a hard task to perform. Hence advanced and adapted solutions are required to analyze the content of video sequences. We here present a people detection algorithm based on a hierarchical tree of Histogram of Oriented Gradients referred to as HOG. The detection is coupled with independently trained body part detectors to enhance the detection performance and to reach state-of-the-art performance. We adopt a person tracking scheme which calculates HOG dissimilarities between detected persons throughout a sequence. The algorithms are tested in videos with challenging situations such as occlusions. False alarms are further reduced by using 2D and 3D information of moving objects segmented from a background reference frame.
{"title":"Body Parts Detection for People Tracking Using Trees of Histogram of Oriented Gradient Descriptors","authors":"E. Corvée, F. Brémond","doi":"10.1109/AVSS.2010.51","DOIUrl":"https://doi.org/10.1109/AVSS.2010.51","url":null,"abstract":"Vision algorithms face many challenging issues when it comes to analyzing human activities in video surveillance applications. For instance, occlusions make the detection and tracking of people a hard task to perform. Hence advanced and adapted solutions are required to analyze the content of video sequences. We here present a people detection algorithm based on a hierarchical tree of Histogram of Oriented Gradients referred to as HOG. The detection is coupled with independently trained body part detectors to enhance the detection performance and to reach state-of-the-art performance. We adopt a person tracking scheme which calculates HOG dissimilarities between detected persons throughout a sequence. The algorithms are tested in videos with challenging situations such as occlusions. False alarms are further reduced by using 2D and 3D information of moving objects segmented from a background reference frame.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"255 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133426447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, an algorithm for multiple camera based person tracking is presented. Region covariance matrices are used to model the target appearance. The correspondence between multiple camera views is established via homography. It is utilized to improve the tracking of people under the assumption that they are on a common ground plane. If there is occlusion in one view, the homography to this view from another view is utilized to locate the object template. The information about the true location of the template helps the tracker to resume, even in case of substantial temporal occlusions or large object movements. The object template is represented by multiple non-overlapping patches. Owing to such an object representation the tracker is capable of both detecting the occlusion and handling considerable partial occlusions. The object tracking is achieved using particle swarm optimization. The objective function is based on the Log-Euclidean Riemannian metric. Experimental results that were obtained on surveillance videos show the feasibility of the presented approach.
{"title":"Multi Camera-Based Person Tracking Using Region Covariance and Homography Constraint","authors":"B. Kwolek","doi":"10.1109/AVSS.2010.20","DOIUrl":"https://doi.org/10.1109/AVSS.2010.20","url":null,"abstract":"In this paper, an algorithm for multiple camera based person tracking is presented. Region covariance matrices are used to model the target appearance. The correspondence between multiple camera views is established via homography. It is utilized to improve the tracking of people under the assumption that they are on a common ground plane. If there is occlusion in one view, the homography to this view from another view is utilized to locate the object template. The information about the true location of the template helps the tracker to resume, even in case of substantial temporal occlusions or large object movements. The object template is represented by multiple non-overlapping patches. Owing to such an object representation the tracker is capable of both detecting the occlusion and handling considerable partial occlusions. The object tracking is achieved using particle swarm optimization. The objective function is based on the Log-Euclidean Riemannian metric. Experimental results that were obtained on surveillance videos show the feasibility of the presented approach.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"520 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134485494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
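The abstract above relies on a ground-plane homography to carry a target location from one camera view to another when the target is occluded. As a minimal sketch of that single step, the code below maps a ground-plane image point through a 3x3 homography using homogeneous coordinates. The function name and the example matrices are hypothetical; how the paper estimates the homography and uses the transferred location inside the tracker is not shown here.

```python
def apply_homography(H, point):
    """Map an image point (x, y) through a 3x3 homography H given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical homography: a pure shift of 5 pixels right and 2 pixels up.
H_shift = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
foot_point_view_a = (3, 4)  # e.g. where the person touches the ground in view A
foot_point_view_b = apply_homography(H_shift, foot_point_view_a)
```

Because a homography is defined only up to scale, multiplying every entry of `H` by the same nonzero constant leaves the mapped point unchanged, which is why the division by `w` is needed.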
In this paper, we present a new and fast technique for background estimation from cluttered image sequences. Most of the background initialization approaches developed so far collect a number of initial frames and then require a slow estimation step which introduces a delay whenever it is applied. Conversely, the proposed technique redistributes the computational load among all the frames by means of a patch by patch preprocessing, which makes the overall algorithm more suitable for real-time applications. For each patch location a prototype set is created and maintained. The background is then iteratively estimated by choosing from each set the most appropriate candidate patch, which should verify a sort of frequency coherence with its neighbors. To this aim, the Hadamard transform has been adopted which requires less computation time than the commonly used DCT. Finally, a refinement step exploits spatial continuity constraints along the patch borders to prevent erroneous patch selections. The approach has been compared with the state of the art on videos from available datasets (ViSOR and CAVIAR), showing a speed up of about 10 times and an improved accuracy.
{"title":"Fast Background Initialization with Recursive Hadamard Transform","authors":"Davide Baltieri, R. Vezzani, R. Cucchiara","doi":"10.1109/AVSS.2010.43","DOIUrl":"https://doi.org/10.1109/AVSS.2010.43","url":null,"abstract":"In this paper, we present a new and fast technique for background estimation from cluttered image sequences. Most of the background initialization approaches developed so far collect a number of initial frames and then require a slow estimation step which introduces a delay whenever it is applied. Conversely, the proposed technique redistributes the computational load among all the frames by means of a patch by patch preprocessing, which makes the overall algorithm more suitable for real-time applications. For each patch location a prototype set is created and maintained. The background is then iteratively estimated by choosing from each set the most appropriate candidate patch, which should verify a sort of frequency coherence with its neighbors. To this aim, the Hadamard transform has been adopted which requires less computation time than the commonly used DCT. Finally, a refinement step exploits spatial continuity constraints along the patch borders to prevent erroneous patch selections. The approach has been compared with the state of the art on videos from available datasets (ViSOR and CAVIAR), showing a speed up of about 10 times and an improved accuracy.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132895956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
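The abstract above states that the Hadamard transform is cheaper to compute than the DCT. One way to see why is that the fast Walsh-Hadamard transform uses only additions and subtractions, with no multiplications or trigonometric constants. The sketch below is the generic, unnormalized one-dimensional butterfly version; it is not the paper's recursive patch-based variant, and the function name is an assumption for illustration.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine pairs that are h apart; each stage needs only adds and subtracts.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = out[i], out[i + h]
                out[i], out[i + h] = x + y, x - y
        h *= 2
    return out


row = [1, 0, 1, 0]          # hypothetical row of patch intensities
coefficients = fwht(row)
```

Applying the transform twice returns the input scaled by its length, so the same routine also serves as the inverse after dividing by `len(values)`.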
The utilization of multimedia devices is growing rapidly in surveillance and monitoring applications. These multimedia surveillance systems need to process large amounts of multimodal sensor data in order to detect events and objects. While processing this large amount of data, the system faces many processing and network bottlenecks. The design of an efficient multimedia surveillance system requires intelligent architectural decisions and performance evaluation to cope with these resource demands. One critical issue among all these architectures is task assignment among processing units. To study the effect of this task assignment on system performance with quantifiable performance measures is very useful and challenging. We define a Functionality Delegation Coefficient which abstracts the delegation of functionality among processing units of a distributed surveillance system and show its effect on event blocking probability and response time. Simulation and real implementation results are provided to validate the model.
{"title":"Functionality Delegation in Distributed Surveillance Systems","authors":"M. Saini, P. Atrey, S. Emmanuel, M. Kankanhalli","doi":"10.1109/AVSS.2010.58","DOIUrl":"https://doi.org/10.1109/AVSS.2010.58","url":null,"abstract":"The utilization of multimedia devices is growing rapidly in surveillance and monitoring applications. These multimedia surveillance systems need to process large amounts of multimodal sensor data in order to detect events and objects. While processing this large amount of data, the system faces many processing and network bottlenecks. The design of an efficient multimedia surveillance system requires intelligent architectural decisions and performance evaluation to cope with these resource demands. One critical issue among all these architectures is task assignment among processing units. To study the effect of this task assignment on system performance with quantifiable performance measures is very useful and challenging. We define a Functionality Delegation Coefficient which abstracts the delegation of functionality among processing units of a distributed surveillance system and show its effect on event blocking probability and response time. Simulation and real implementation results are provided to validate the model.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114152425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Benabbas, Nacim Ihaddadene, Tarek Yahiaoui, T. Urruty, C. Djeraba
In this paper, we present a new approach to count the number of people that cross a counting line from monocular video images. The proposed approach accumulates image slices and estimates the optical flow on them. Then, it performs an online blob detection on these slices in order to extract the crossing persons. The number of persons associated to each blob is determined using a linear regression model applied to blob features which are the position, velocity, orientation and size. The proposed approach is validated on several datasets captured using either a vertical overhead or an oblique mounted camera. The real-time performance and the high counting accuracy of this approach in indoor and outdoor environments are also demonstrated.
{"title":"Spatio-Temporal Optical Flow Analysis for People Counting","authors":"Y. Benabbas, Nacim Ihaddadene, Tarek Yahiaoui, T. Urruty, C. Djeraba","doi":"10.1109/AVSS.2010.29","DOIUrl":"https://doi.org/10.1109/AVSS.2010.29","url":null,"abstract":"In this paper, we present a new approach to count the number of people that cross a counting line from monocular video images. The proposed approach accumulates image slices and estimates the optical flow on them. Then, it performs an online blob detection on these slices in order to extract the crossing persons. The number of persons associated to each blob is determined using a linear regression model applied to blob features which are the position, velocity, orientation and size. The proposed approach is validated on several datasets captured using either a vertical overhead or an oblique mounted camera. The real-time performance and the high counting accuracy of this approach in indoor and outdoor environments are also demonstrated.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124371783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
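The abstract above maps blob features to a person count with a linear regression model. The sketch below shows the idea in its simplest form, using ordinary least squares on a single feature (blob size) for brevity; the paper uses several features (position, velocity, orientation and size). The function names and the training values are hypothetical and are not taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all feature values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def people_in_blob(size, slope, intercept):
    """Round the regression output to a non-negative whole number of people."""
    return max(0, round(slope * size + intercept))


# Hypothetical training data: blob size in pixels and hand-labelled person count.
sizes = [100, 200, 300, 400]
counts = [1, 2, 3, 4]
slope, intercept = fit_line(sizes, counts)
estimate = people_in_blob(310, slope, intercept)
```

Summing the per-blob estimates over all blobs that cross the counting line would give the running total; extending the fit to several features means solving the corresponding multivariate normal equations instead of the closed form used here.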
We present a novel framework for extracting “pathlets” from tracking data. A pathlet is defined as a motion region that contains tracks having the same origin and destination in the scene and that are temporally correlated. The proposed method requires only weak tracking data (multiple fragmented tracks per target). We employ a probabilistic state space representation to construct a Markovian transition model and estimate the scene entry/exit locations. The resulting model is treated as a set of vertices in a graph and a similarity matrix is built which describes broader nonlocal relationships between states. A Spectral Clustering approach is then used to automatically extract the pathlets of the scene. We present experimental results from scenes of varying difficulty and compare against other approaches.
{"title":"Extracting Pathlets From Weak Tracking Data","authors":"Kevin Streib, James W. Davis","doi":"10.1109/AVSS.2010.24","DOIUrl":"https://doi.org/10.1109/AVSS.2010.24","url":null,"abstract":"We present a novel framework for extracting “pathlets” from tracking data. A pathlet is defined as a motion region that contains tracks having the same origin and destination in the scene and that are temporally correlated. The proposed method requires only weak tracking data (multiple fragmented tracks per target). We employ a probabilistic state space representation to construct a Markovian transition model and estimate the scene entry/exit locations. The resulting model is treated as a set of vertices in a graph and a similarity matrix is built which describes broader nonlocal relationships between states. A Spectral Clustering approach is then used to automatically extract the pathlets of the scene. We present experimental results from scenes of varying difficulty and compare against other approaches.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122994654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a new approach that realizes an image-based fault tolerant distance computation for a multi-view camera system which conservatively approximates the shortest distance between unknown objects and 3D volumes. Our method addresses the industrial application of vision-based protective devices which are used to detect intrusions of humans into areas of dangerous machinery, in order to prevent injuries. This requires hardware redundancy for compensation of hardware failures without loss of functionality and safety. By taking sensor failures during the fusion process of distances from different cameras into account, this is realized implicitly, with the benefit of no additional hardware cost. In particular we employ multiple camera perspectives for safe and non-conservative occlusion handling of obstacles and formulate general system assumptions which are also appropriate for other applications like multi-view reconstruction methods.
{"title":"A Safe Fault Tolerant Multi-view Approach for Vision-Based Protective Devices","authors":"Antje Ober, D. Henrich","doi":"10.1109/AVSS.2010.69","DOIUrl":"https://doi.org/10.1109/AVSS.2010.69","url":null,"abstract":"We present a new approach that realizes an image-based fault tolerant distance computation for a multi-view camera system which conservatively approximates the shortest distance between unknown objects and 3D volumes. Our method addresses the industrial application of vision-based protective devices which are used to detect intrusions of humans into areas of dangerous machinery, in order to prevent injuries. This requires hardware redundancy for compensation of hardware failures without loss of functionality and safety. By taking sensor failures during the fusion process of distances from different cameras into account, this is realized implicitly, with the benefit of no additional hardware cost. In particular we employ multiple camera perspectives for safe and non-conservative occlusion handling of obstacles and formulate general system assumptions which are also appropriate for other applications like multi-view reconstruction methods.","PeriodicalId":415758,"journal":{"name":"2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131087068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}