Saliency-context two-stream convnets for action recognition

Quan-Qi Chen, Feng Liu, Xue Li, Baodi Liu, Yujin Zhang
2016 IEEE International Conference on Image Processing (ICIP), pp. 3076-3080. Published 2016-09-01. DOI: 10.1109/ICIP.2016.7532925 (https://doi.org/10.1109/ICIP.2016.7532925)
Citations: 6

Abstract

Recently, very deep two-stream ConvNets have achieved great discriminative power for video classification, especially the temporal ConvNets when trained on multi-frame optical flow. However, action recognition in videos often falls prey to wild camera motion, which makes it difficult to extract reliable optical flow for the human body. In light of this, we propose a novel method to remove the global camera motion, which explicitly calculates a homography between two consecutive frames without human detection. Given the estimated homography due to camera motion, background motion can be canceled out from the warped optical flow. We take this a step further and design a new architecture called Saliency-Context two-stream ConvNets, where the context two-stream ConvNets are employed to recognize the entire scene in video frames, whilst the saliency streams are trained on salient human motion regions that are detected from the warped optical flow. Finally, the Saliency-Context two-stream ConvNets allow us to capture complementary information and achieve state-of-the-art performance on the UCF101 dataset.
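The step in which background motion is canceled given a homography can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the homography is estimated or how salient regions are detected, so the sketch assumes the 3x3 matrix `H` is already known and uses a hypothetical magnitude threshold to mark salient pixels. All function names here are illustrative.

```python
import math

def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    xp = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    yp = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return xp, yp

def camera_flow(H, width, height):
    """Flow induced by camera motion alone: H(p) - p at every pixel p."""
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            xp, yp = apply_homography(H, x, y)
            row.append((xp - x, yp - y))
        field.append(row)
    return field

def cancel_camera_motion(flow, H):
    """Subtract the homography-induced flow from the observed flow."""
    height, width = len(flow), len(flow[0])
    cam = camera_flow(H, width, height)
    return [
        [(flow[y][x][0] - cam[y][x][0], flow[y][x][1] - cam[y][x][1])
         for x in range(width)]
        for y in range(height)
    ]

def salient_mask(residual, threshold=1.0):
    """Mark pixels whose residual motion exceeds a (hypothetical) threshold."""
    return [[math.hypot(u, v) > threshold for (u, v) in row] for row in residual]

# Toy example: the camera pans so the whole 4x4 frame shifts 2 px to the right,
# and one pixel (row 1, column 2) carries additional foreground motion.
H = [[1, 0, 2], [0, 1, 0], [0, 0, 1]]
flow = [[(2.0, 0.0) for _ in range(4)] for _ in range(4)]
flow[1][2] = (5.0, 1.0)

residual = cancel_camera_motion(flow, H)
mask = salient_mask(residual)
```

In this toy case the residual is zero everywhere except at the foreground pixel, where it is `(3.0, 1.0)`, so only that pixel is marked salient. In practice the homography would have to be estimated from the two frames, for example with a robust fit over matched points, which this sketch leaves out.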