One video is sufficient? Human activity recognition using active video composition

M. Ryoo, Wonpil Yu
{"title":"一个视频就够了?基于活动视频合成的人类活动识别","authors":"M. Ryoo, Wonpil Yu","doi":"10.1109/WACV.2011.5711564","DOIUrl":null,"url":null,"abstract":"In this paper, we present a novel human activity recognition approach that only requires a single video example per activity. We introduce the paradigm of active video composition, which enables one-example recognition of complex activities. The idea is to automatically create a large number of semi-artificial training videos called composed videos by manipulating an original human activity video. A methodology to automatically compose activity videos having different backgrounds, translations, scales, actors, and movement structures is described in this paper. Furthermore, an active learning algorithm to model the temporal structure of the human activity has been designed, preventing the generation of composed training videos violating the structural constraints of the activity. The intention is to generate composed videos having correct organizations, and take advantage of them for the training of the recognition system. In contrast to previous passive recognition systems relying only on given training videos, our methodology actively composes necessary training videos that the system is expected to observe in its environment. Experimental results illustrate that a single fully labeled video per activity is sufficient for our methodology to reliably recognize human activities by utilizing composed training videos.","PeriodicalId":424724,"journal":{"name":"2011 IEEE Workshop on Applications of Computer Vision (WACV)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"One video is sufficient? Human activity recognition using active video composition\",\"authors\":\"M. Ryoo, Wonpil Yu\",\"doi\":\"10.1109/WACV.2011.5711564\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we present a novel human activity recognition approach that only requires a single video example per activity. We introduce the paradigm of active video composition, which enables one-example recognition of complex activities. The idea is to automatically create a large number of semi-artificial training videos called composed videos by manipulating an original human activity video. A methodology to automatically compose activity videos having different backgrounds, translations, scales, actors, and movement structures is described in this paper. Furthermore, an active learning algorithm to model the temporal structure of the human activity has been designed, preventing the generation of composed training videos violating the structural constraints of the activity. The intention is to generate composed videos having correct organizations, and take advantage of them for the training of the recognition system. In contrast to previous passive recognition systems relying only on given training videos, our methodology actively composes necessary training videos that the system is expected to observe in its environment. 
Experimental results illustrate that a single fully labeled video per activity is sufficient for our methodology to reliably recognize human activities by utilizing composed training videos.\",\"PeriodicalId\":424724,\"journal\":{\"name\":\"2011 IEEE Workshop on Applications of Computer Vision (WACV)\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-01-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE Workshop on Applications of Computer Vision (WACV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WACV.2011.5711564\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Workshop on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV.2011.5711564","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 14

Abstract

In this paper, we present a novel human activity recognition approach that requires only a single video example per activity. We introduce the paradigm of active video composition, which enables one-example recognition of complex activities. The idea is to automatically create a large number of semi-artificial training videos, called composed videos, by manipulating an original human activity video. This paper describes a methodology for automatically composing activity videos with different backgrounds, translations, scales, actors, and movement structures. Furthermore, an active learning algorithm has been designed to model the temporal structure of the human activity, preventing the generation of composed training videos that violate the structural constraints of the activity. The intention is to generate composed videos with correct temporal organization and to take advantage of them for training the recognition system. In contrast to previous passive recognition systems that rely only on given training videos, our methodology actively composes the training videos that the system is expected to observe in its environment. Experimental results illustrate that a single fully labeled video per activity is sufficient for our methodology to reliably recognize human activities by utilizing composed training videos.
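The abstract does not specify the composition step in code; the following is a minimal sketch of the general idea, assuming pre-segmented actor patches and using hypothetical helper names (`compose_frame`, `compose_video`) built on NumPy and OpenCV. It illustrates only the background, translation, and scale variations mentioned above, not the paper's actual pipeline.

```python
import numpy as np
import cv2  # OpenCV, used here only for resizing


def compose_frame(actor, mask, background, dx, dy, scale):
    """Paste a segmented actor patch onto a new background frame.

    actor      -- HxWx3 uint8 foreground patch (assumed pre-segmented)
    mask       -- HxW boolean mask, True where the actor is visible
    background -- full-size uint8 frame providing the new background
    dx, dy     -- top-left placement offset (the translation variation)
    scale      -- resize factor for the actor (the scale variation)
    """
    h = int(actor.shape[0] * scale)
    w = int(actor.shape[1] * scale)
    actor = cv2.resize(actor, (w, h), interpolation=cv2.INTER_LINEAR)
    mask = cv2.resize(mask.astype(np.uint8), (w, h),
                      interpolation=cv2.INTER_NEAREST).astype(bool)
    out = background.copy()
    roi = out[dy:dy + h, dx:dx + w]   # view into the output frame
    roi[mask] = actor[mask]           # overwrite background pixels under the mask
    return out


def compose_video(frames, masks, background, rng):
    """Apply one random translation and scale to every frame of a clip."""
    scale = rng.uniform(0.7, 1.3)
    H, W = background.shape[:2]
    h = int(frames[0].shape[0] * scale)
    w = int(frames[0].shape[1] * scale)
    dx = int(rng.integers(0, max(1, W - w)))
    dy = int(rng.integers(0, max(1, H - h)))
    return [compose_frame(f, m, background, dx, dy, scale)
            for f, m in zip(frames, masks)]
```

Passing `rng = np.random.default_rng(0)` makes the sampled placement and scale reproducible; calling `compose_video` repeatedly with different backgrounds yields many composed clips from one original.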
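The "movement structure" variation reorders sub-events, and the abstract states that a learned model of the activity's temporal structure rejects invalid composed videos. Below is a minimal sketch of one way to encode such structural constraints, assuming the activity is decomposed into named sub-events and the constraints are pairwise "a must precede b" relations; both the decomposition and the constraint form are assumptions here, and the paper's actual model may differ.

```python
import itertools


def violates_structure(order, precedence):
    """True if a candidate sub-event ordering breaks any (a, b) = 'a before b' constraint."""
    pos = {event: i for i, event in enumerate(order)}
    return any(pos[a] >= pos[b] for a, b in precedence)


# Hypothetical sub-events of a hand-shake activity and their constraints.
events = ["approach", "extend_arm", "shake", "withdraw_arm"]
precedence = {("approach", "extend_arm"),
              ("extend_arm", "shake"),
              ("shake", "withdraw_arm")}

# Keep only reorderings that respect the activity's temporal structure;
# only these would be passed on to the video-composition step.
valid = [p for p in itertools.permutations(events)
         if not violates_structure(p, precedence)]
print(valid)  # the constraints above form a chain, so only the original order survives
```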