
2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance: Latest Publications

Human Action Recognition using a Hybrid NTLD Classifier
A. Rani, Sanjeev Kumar, C. Micheloni, G. Foresti
This work proposes a hybrid classifier to recognize human actions in different contexts. In particular, the proposed hybrid classifier, a neural tree with linear discriminant nodes (NTLD), is a neural tree whose nodes can be either simple perceptrons or recursive Fisher linear discriminant (RFLD) classifiers. A novel technique to substitute badly trained perceptrons with better-performing linear discriminators is introduced. For a given frame, geometrical features are extracted from the skeleton of the human blob (silhouette). These geometrical features are collected over a fixed number of consecutive frames to recognize the corresponding activity. The resulting feature vector is adopted as input to the NTLD classifier. The performance of the proposed classifier has been evaluated on two available databases.
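The linear discriminant nodes that replace badly trained perceptrons can be illustrated with a plain two-class Fisher linear discriminant (a generic sketch for illustration, not the authors' recursive RFLD):

```python
import numpy as np

def fld_node(X0, X1):
    """Two-class Fisher linear discriminant for one tree node.

    Generic FLD sketch (illustrative stand-in, not the authors' RFLD):
    project onto w = Sw^-1 (mu1 - mu0) and threshold at the midpoint
    of the projected class means.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices.
    Sw = np.cov(X0.T, bias=True) * len(X0) + np.cov(X1.T, bias=True) * len(X1)
    # Small ridge keeps the solve stable if Sw is near-singular.
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), m1 - m0)
    thresh = w @ (m0 + m1) / 2.0
    return w, thresh

rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 0.1, size=(100, 2))   # class-0 cluster
X1 = rng.normal([3.0, 3.0], 0.1, size=(100, 2))   # class-1 cluster
w, t = fld_node(X0, X1)
pred = (np.vstack([X0, X1]) @ w > t).astype(int)  # 0/1 node decision
```

In a neural tree, a node like this would route samples to child nodes by the sign of the projection.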
DOI: 10.1109/AVSS.2010.11 · Published: 2010-08-29
Citations: 2
Automatic Inter-image Homography Estimation from Person Detections
M. Thaler, R. Mörzinger
Inter-image homographies are essential for many different tasks involving projective geometry. This paper proposes an adaptive correspondence estimation approach between person detections in a planar scene that does not rely on correspondence features, as is the case in many other RANSAC-based approaches. The result is a planar inter-image homography calculated from estimated point correspondences. The approach is self-configurable and adaptive, and provides robustness over time by exploiting temporal and geometric information. We demonstrate the manifold applicability of the proposed approach on a variety of datasets. Improved results compared to a common baseline approach are shown, and the influence of error sources such as missed detections, false detections and non-overlapping fields of view is investigated.
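The end product described above, a planar homography fitted to point correspondences, can be sketched with the standard Direct Linear Transform. This is a generic illustration only; the paper's contribution is the adaptive correspondence estimation between person detections, not the DLT itself.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: fit H so that dst ~ H @ src (homogeneous)."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The flattened H spans the null space of A: take the smallest
    # right singular vector.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalise so H[2, 2] == 1

# Four correspondences generated by a known map: scale 2, translate (2, 3).
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 3), (4, 3), (4, 5), (2, 5)]
H = estimate_homography(src, dst)
```

With four exact correspondences in general position the null space is one-dimensional, so the recovered H matches the generating transform.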
DOI: 10.1109/AVSS.2010.35 · Published: 2010-08-29
Citations: 7
Learning Directed Intention-driven Activities using Co-Clustering
K. Sankaranarayanan, James W. Davis
We present a novel approach for discovering directed intention-driven pedestrian activities across large urban areas. The proposed approach is based on a mutual information co-clustering technique that simultaneously clusters trajectory start locations in the scene which have similar distributions across stop locations, and vice versa. The clustering assignments are obtained by minimizing the loss of mutual information between a trajectory start-stop association matrix and a compressed co-clustered matrix, after which the scene activities are inferred from the compressed matrix. We demonstrate our approach using a dataset of long-duration trajectories from multiple PTZ cameras covering a large area, and show improved results over two other popular trajectory clustering and entry-exit learning approaches.
DOI: 10.1109/AVSS.2010.41 · Published: 2010-08-29
Citations: 6
Multi-Modal Object Tracking using Dynamic Performance Metrics
S. Denman, C. Fookes, S. Sridharan, D. Ryan
Intelligent surveillance systems typically use a single visual spectrum modality for their input. These systems work well in controlled conditions, but often fail when lighting is poor, or environmental effects such as shadows, dust or smoke are present. Thermal spectrum imagery is not as susceptible to environmental effects; however, thermal imaging sensors are more sensitive to noise and they are only grayscale, making distinguishing between objects difficult. Several approaches to combining the visual and thermal modalities have been proposed; however, they are limited by assuming that both modalities are performing equally well. When one modality fails, existing approaches are unable to detect the drop in performance and disregard the underperforming modality. In this paper, a novel middle fusion approach for combining visual and thermal spectrum images for object tracking is proposed. Motion and object detection is performed on each modality, and the object detection results for each modality are fused based on the current performance of each modality. Modality performance is determined by comparing the number of objects tracked by the system with the number detected by each mode, with a small allowance made for objects entering and exiting the scene. The tracking performance of the proposed fusion scheme is compared with the performance of the visual and thermal modes individually, and a baseline middle fusion scheme. Improvement in tracking performance using the proposed fusion approach is demonstrated. The proposed approach is also shown to be able to detect the failure of an individual modality and disregard its results, ensuring performance is not degraded in such situations.
DOI: 10.1109/AVSS.2010.16 · Published: 2010-08-29
Citations: 3
Local Feature Based Person Reidentification in Infrared Image Sequences
K. Jüngling, Michael Arens
In this paper, we address the task of appearance-based person reidentification in infrared image sequences. While common approaches for appearance-based person reidentification in the visible spectrum acquire color histograms of a person, this technique is not applicable in infrared for obvious reasons. To tackle the more difficult problem of person reidentification in infrared, we introduce an approach that relies on local image features only and thus is completely independent of sensor-specific features which might be available only in the visible spectrum. Our approach fits into an Implicit Shape Model (ISM) based person detection and tracking strategy described in previous work. Local features collected during tracking are employed for person reidentification, while the generalizing appearance codebook used for person detection serves as structuring element to generate person signatures. By this, we gain an integrated approach that allows for fast online model generation, a compact representation, and fast model matching. Since the model allows for a joined representation of appearance and spatial information, no complex representation models like graph structures are needed. We evaluate our person reidentification approach on a subset of the CASIA infrared dataset.
DOI: 10.1109/AVSS.2010.75 · Published: 2010-08-29
Citations: 37
Subjective Logic Based Hybrid Approach to Conditional Evidence Fusion for Forensic Visual Surveillance
Seunghan Han, Bonjung Koo, A. Hutter, V. Shet, W. Stechele
In forensic analysis of visual surveillance data, conditional knowledge representation and inference under uncertainty play an important role for deriving new contextual cues by fusing relevant evidential patterns. To address this aspect, both rule-based (aka extensional) and state-based (aka intensional) approaches have been adopted for situation or visual event analysis. The former provides flexible expressive power and computational efficiency but typically allows only one-directional inference. The latter is computationally expensive but allows bidirectional interpretation of conditionals by treating the antecedent and consequent of conditionals as mutually relevant states. In visual surveillance, considering the varying semantics and potentially ambiguous causality in conditionals, it would be useful to combine the expressive power of a rule-based system with the ability of bidirectional interpretation. In this paper, we propose a hybrid approach that, while relying mainly on a rule-based architecture, also provides an intensional way of on-demand conditional modeling using conditional operators in subjective logic. We first show how conditionals can be assessed via explicit representation of ignorance in subjective logic. We then describe the proposed hybrid conditional handling framework. Finally, we present an experimental case study from a typical airport scene taken from visual surveillance data.
DOI: 10.1109/AVSS.2010.19 · Published: 2010-08-29
Citations: 7
Task-Oriented Object Tracking in Large Distributed Camera Networks
Eduardo Monari, K. Kroschel
In this paper, a task-oriented approach for object tracking in large distributed camera networks is presented. This work includes three main contributions. First, a generic process framework is presented, which has been designed for task-oriented video processing. Second, the system components of the task-oriented framework needed for the task of multi-camera person tracking are introduced in detail. Third, for efficient task-oriented processing in large camera networks, the capability of dynamic sensor scheduling by the multi-camera tracking processes is indispensable. For this purpose, an efficient sensor selection approach is proposed.
DOI: 10.1109/AVSS.2010.66 · Published: 2010-08-29
Citations: 9
Human Action Recognition and Localization in Video Using Structured Learning of Local Space-Time Features
Tuan Hue Thi, Jian Zhang, Li Cheng, Li Wang, S. Satoh
This paper presents a unified framework for human action classification and localization in video using structured learning of local space-time features. Each human action class is represented by its own compact set of local patches. In our approach, we first use a discriminative hierarchical Bayesian classifier to select those space-time interest points that are constructive for each particular action. Those concise local features are then passed to a Support Vector Machine with Principal Component Analysis projection for the classification task. Meanwhile, the action localization is done using Dynamic Conditional Random Fields developed to incorporate the spatial and temporal structure constraints of superpixels extracted around those features. Each superpixel in the video is defined by the shape and motion information of its corresponding feature region. Compelling results obtained from experiments on the KTH [22], Weizmann [1], HOHA [13] and TRECVid [23] datasets have proven the efficiency and robustness of our framework for the task of human action recognition and localization in video.
DOI: 10.1109/AVSS.2010.76 · Published: 2010-08-29
Citations: 32
Histogram-Based Training Initialisation of Hidden Markov Models for Human Action Recognition
Z. Moghaddam, M. Piccardi
Human action recognition is often addressed by use of latent-state models such as the hidden Markov model and similar graphical models. As such models require Expectation-Maximisation training, arbitrary choices must be made for training initialisation, with major impact on the final recognition accuracy. In this paper, we propose a histogram-based deterministic initialisation and compare it with both random and a time-based deterministic initialisation. Experiments on a human action dataset show that the accuracy of the proposed method proved higher than that of the other tested methods.
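One plausible reading of a histogram-based deterministic initialisation (an assumption for illustration, not necessarily the authors' exact scheme) is to seed each hidden state's emission parameters from an equal-mass histogram bin of the observations, so that EM always starts from the same data-driven point:

```python
import numpy as np

def histogram_init(obs, n_states):
    """Deterministic emission initialisation for a 1-D Gaussian HMM.

    Illustrative sketch (assumed scheme, not the paper's): quantile
    edges split the observations into n_states equal-mass bins; each
    bin's mean and variance seed one state's emission density.
    """
    obs = np.sort(np.asarray(obs, float))
    edges = np.quantile(obs, np.linspace(0.0, 1.0, n_states + 1))
    means, variances = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chunk = obs[(obs >= lo) & (obs <= hi)]
        means.append(chunk.mean())
        variances.append(chunk.var() + 1e-6)  # floor avoids zero variance
    return np.array(means), np.array(variances)

# Two well-separated observation clusters -> two clearly seeded states.
obs = np.concatenate([np.full(50, 0.0), np.full(50, 10.0)])
means, variances = histogram_init(obs, 2)
```

Unlike random initialisation, repeated runs of this seeding produce identical starting parameters, which is the point of a deterministic scheme.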
DOI: 10.1109/AVSS.2010.25 · Published: 2010-08-29
Citations: 8
Surveillance Camera Calibration from Observations of a Pedestrian
M. Evans, J. Ferryman
Calibrated cameras are an extremely useful resource for computer vision scenarios. Typically, cameras are calibrated through calibration targets, measurements of the observed scene, or self-calibrated through features matched between cameras with overlapping fields of view. This paper considers an approach to camera calibration based on observations of a pedestrian and compares the resulting calibration to a commonly used approach requiring that measurements be made of the scene.
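A standard ingredient of calibration from a walking pedestrian is the vertical vanishing point where the image lines through each head-foot pair meet; the sketch below is illustrative only and is not the authors' full method:

```python
import numpy as np

def vertical_vanishing_point(feet, heads):
    """Least-squares intersection of head-foot lines (homogeneous form).

    Illustrative sketch: each (foot, head) image pair defines a line via
    the cross product of the two homogeneous points; all such lines meet
    near the vertical vanishing point, found as the smallest right
    singular vector of the stacked line matrix.
    """
    lines = []
    for f, h in zip(feet, heads):
        p = np.array([f[0], f[1], 1.0])
        q = np.array([h[0], h[1], 1.0])
        lines.append(np.cross(p, q))  # line through foot and head
    _, _, Vt = np.linalg.svd(np.asarray(lines))
    v = Vt[-1]                        # minimises sum of (line . v)^2
    return v[:2] / v[2]

# Synthetic pedestrian at three positions; every head lies on the
# segment from its foot toward a common point, so all head-foot lines
# pass through the vanishing point (3, -40).
vp = np.array([3.0, -40.0])
feet = [(0.0, 10.0), (5.0, 12.0), (9.0, 15.0)]
heads = [tuple(f + 0.3 * (vp - f)) for f in map(np.array, feet)]
v_est = vertical_vanishing_point(feet, heads)
```

From the vertical vanishing point and the horizon line, focal length and camera tilt can then be recovered by standard single-view geometry.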
DOI: 10.1109/AVSS.2010.32 · Published: 2010-08-29
Citations: 6