Rule-based real-time detection of context-independent events in video shots
Aishy Amer, Eric Dubois, Amar Mitiche
Real-Time Imaging, vol. 11, no. 3, pp. 244-256, published 2005-06-01
DOI: 10.1016/j.rti.2004.12.001
https://www.sciencedirect.com/science/article/pii/S1077201405000021
Citations: 13
Abstract
This paper investigates a real-time system for detecting context-independent events in video shots. We test the system in video surveillance environments with a fixed camera. We assume that objects have been segmented (not necessarily perfectly) and reason with their low-level features, such as shape, and mid-level features, such as trajectory, to infer events related to moving objects.
Our goal is to detect generic events, i.e., events that are independent of the context of where or how they occur. Detection is based on formal definitions of the events and on approximate but efficient world models. The system continually monitors the changes and behavior of the features of video objects and reports an event when the conditions in its definition are met. We classify events into four types: primitive, action, interaction, and composite.
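The idea of monitoring object features and firing an event when conditions are met can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Observation` record, the "stops" event, the `window` and `max_motion` parameters, and the rule itself.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Observation:
    """One tracked object in one frame (hypothetical feature record)."""
    frame: int
    x: float  # centroid column, in pixels
    y: float  # centroid row, in pixels


def detect_stop_events(track, window=3, max_motion=1.0):
    """Return the frame numbers at which a hypothetical 'stops' event fires.

    Illustrative rule (not the paper's definition): the object is "still"
    when its centroid moved less than max_motion pixels in each of the
    last `window` frame-to-frame steps. The event is reported once, on
    the transition from moving to still.
    """
    events = []
    was_still = False
    for i in range(window, len(track)):
        # Frame-to-frame centroid displacements over the last `window` steps.
        steps = [
            hypot(track[j].x - track[j - 1].x, track[j].y - track[j - 1].y)
            for j in range(i - window + 1, i + 1)
        ]
        is_still = all(s < max_motion for s in steps)
        if is_still and not was_still:
            events.append(track[i].frame)
        was_still = is_still
    return events
```

Because the rule reads only the trajectory of one object, it does not depend on where the scene is or what the object is, which is the sense of "context-independent" used above.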
Our system includes three interacting video processing layers: enhancement to estimate and reduce additive noise, analysis to segment and track video objects, and interpretation to detect context-independent events. The contributions of this paper are the interpretation of spatio-temporal object features to detect context-independent events in real time, the adaptation to noise, and the correction and compensation of low-level processing errors at higher layers where more information is available.
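As a rough illustration of compensating for a low-level error at a higher layer, the sketch below bridges short gaps in an object's track, as might occur when segmentation misses the object for a frame or two. The function name `fill_short_gaps`, the `max_gap` parameter, and the use of linear interpolation are all assumptions for this example; the abstract does not say which correction techniques the system uses.

```python
def fill_short_gaps(positions, max_gap=2):
    """Linearly interpolate runs of None that are at most max_gap long.

    positions: list of (x, y) tuples, or None for frames in which the
    (hypothetical) segmentation stage lost the object.
    Gaps at the start or end of the track, or longer than max_gap,
    are left as None.
    """
    filled = list(positions)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # first missing index
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the gap (equals n if the gap runs to the end)
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue  # no known position on both sides, or gap too long
        (x0, y0), (x1, y1) = filled[start - 1], filled[end]
        for k in range(start, end):
            t = (k - start + 1) / (gap + 1)  # fraction of the way across the gap
            filled[k] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return filled
```

The correction is possible only at the track level, where positions before and after the gap are known; a per-frame segmentation step has no such information.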
The effectiveness and real-time response of our system are demonstrated by extensive experimentation on indoor and outdoor video shots in the presence of multi-object occlusion, different noise levels, and coding artifacts.