Video Summarization Via Actionness Ranking

Mohamed Elfeki, A. Borji
Published in: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), 2019-01-01
DOI: 10.1109/WACV.2019.00085 (https://doi.org/10.1109/WACV.2019.00085)
Citations: 35

Abstract

To automatically produce a brief yet expressive summary of a long video, an algorithm should start by resembling the human process of summary generation. Prior work proposed supervised and unsupervised algorithms that train models to learn the underlying behavior of humans, either by increasing modeling complexity or by hand-crafting better heuristics to simulate the human summary generation process. In this work, we take a different approach by analyzing a major cue that humans exploit for summary generation: the nature and intensity of actions. We empirically observed that a frame is more likely to be included in human-generated summaries if it contains a substantial amount of deliberate motion performed by an agent, which is referred to as actionness. Therefore, we hypothesize that learning to automatically generate summaries involves implicit knowledge of actionness estimation and ranking. We validate our hypothesis by running a user study that explores the correlation between human-generated summaries and actionness ranks. We also run a consensus and behavioral analysis among human subjects to ensure reliable and consistent results. The analysis exhibits a considerable degree of agreement among subjects within the obtained data and verifies our initial hypothesis. Based on the study findings, we develop a method that incorporates actionness data to explicitly regulate a learning algorithm trained for summary generation. We assess the performance of our approach on four summarization benchmark datasets and demonstrate an evident advantage over state-of-the-art summarization methods.
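The correlation between human-generated summaries and actionness ranks can be illustrated with a small rank-correlation computation. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say which correlation measure the user study used, so Spearman's rank correlation is swapped in here as one common choice. The function names (`average_ranks`, `pearson`, `spearman`) and the toy per-frame values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every entry tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical, not from the paper): one value per video frame.
# actionness: assumed per-frame actionness score in [0, 1].
# selection_freq: assumed fraction of annotators who kept the frame.
actionness = [0.1, 0.8, 0.4, 0.9, 0.2]
selection_freq = [0.0, 0.7, 0.5, 1.0, 0.6]

rho = spearman(actionness, selection_freq)
print(f"Spearman rank correlation: {rho:.3f}")
```

A value of `rho` near 1 would mean that frames ranked high in actionness also tend to be the frames people keep. Working on ranks rather than raw scores makes the comparison insensitive to how the actionness estimator scales its output.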