Human Action Recognition and Localization in Video Using Structured Learning of Local Space-Time Features

Tuan Hue Thi, Jian Zhang, Li Cheng, Li Wang, S. Satoh
DOI: 10.1109/AVSS.2010.76
URL: https://doi.org/10.1109/AVSS.2010.76
Published in: 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance
Publication date: 2010-08-29
Citations: 32

Abstract

This paper presents a unified framework for human action classification and localization in video using structured learning of local space-time features. Each human action class is represented by its own compact set of local patches. In our approach, we first use a discriminative hierarchical Bayesian classifier to select those space-time interest points that are constructive for each particular action. Those concise local features are then passed to a Support Vector Machine with Principal Component Analysis projection for the classification task. Meanwhile, action localization is done using Dynamic Conditional Random Fields developed to incorporate the spatial and temporal structure constraints of superpixels extracted around those features. Each superpixel in the video is defined by the shape and motion information of its corresponding feature region. Results from experiments on the KTH [22], Weizmann [1], HOHA [13] and TRECVid [23] datasets demonstrate the efficiency and robustness of our framework for the task of human action recognition and localization in video.
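The abstract gives no implementation details, so the following is only a minimal sketch of the classification stage it names (PCA projection followed by an SVM). Everything specific here is an assumption made for illustration: the linear SVM trained by plain subgradient descent on the hinge loss, the power-iteration PCA, the toy 4-D descriptors, and all function names. The interest-point selection and Dynamic Conditional Random Field stages are not shown.

```python
import math
import random


def pca_fit(X, k, iters=200, seed=0):
    """Return (mean, components) for the top-k principal directions of X."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):  # power iteration towards the dominant eigenvector
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            v = [x / norm for x in w]
        cov_v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        eigval = sum(v[a] * cov_v[a] for a in range(d))  # Rayleigh quotient
        components.append(v)
        # Deflate so the next pass finds the next direction.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def pca_transform(X, mean, components):
    """Project rows of X onto the fitted principal directions."""
    d = len(mean)
    return [[sum((row[j] - mean[j]) * c[j] for j in range(d))
             for c in components] for row in X]


def svm_fit(Z, y, lam=0.01, eta=0.1, epochs=100):
    """Linear SVM via subgradient descent on the hinge loss; labels in {-1, +1}."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, label in zip(Z, y):
            margin = label * (sum(wi * zi for wi, zi in zip(w, z)) + b)
            w = [wi * (1 - eta * lam) for wi in w]  # L2 regularisation step
            if margin < 1:  # sample violates the margin
                w = [wi + eta * label * zi for wi, zi in zip(w, z)]
                b += eta * label
    return w, b


def svm_predict(Z, w, b):
    """Return +1 or -1 for each projected descriptor."""
    return [1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else -1
            for z in Z]


# Toy stand-ins for local feature descriptors of two action classes.
X_train = [
    [2.0, 1.8, 0.1, 0.0],
    [2.2, 2.1, -0.1, 0.1],
    [1.9, 2.0, 0.0, -0.1],
    [-2.0, -1.9, 0.1, 0.0],
    [-2.1, -2.2, 0.0, 0.1],
    [-1.8, -2.0, -0.1, -0.1],
]
y_train = [1, 1, 1, -1, -1, -1]

mean, components = pca_fit(X_train, k=2)
Z_train = pca_transform(X_train, mean, components)
w, b = svm_fit(Z_train, y_train)
print(svm_predict(Z_train, w, b))
```

In practice a library implementation of PCA and a kernel SVM would normally replace these hand-written routines; the pure-Python version is used here only to keep the sketch self-contained.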