Motion Based Video Skimming

I. Alam, Devesh Jalan, Priti Shaw, Partha Pratim Mohanta
Published in: 2020 IEEE Calcutta Conference (CALCON), February 2020
DOI: 10.1109/CALCON49167.2020.9106488
Citations: 2

Abstract

Automatic video summarization provides an efficient browsing and searching mechanism for long videos. Video skimming is one of the popular ways to represent a summary of a full-length video. This work describes an unsupervised technique that automatically extracts the important clips from an input video and generates a summarized version of that video. The proposed video-skimming scheme consists of three parts: extraction of motion-based features, selection of important clips, and detection and removal of any shot boundary within a clip. Each frame is represented by a 32-dimensional feature vector generated from the slope and magnitude of its motion vectors. A set of representative frames for the entire video is obtained using k-means clustering followed by a Maximal Spanning Tree (MxST). These representative frames serve as the center points of the clips to be generated: a window is placed around each representative frame to form a clip. A shot boundary may exist within a clip. To detect such a boundary, a method is proposed that considers the variation in the pixel intensities of the frames of a clip; this variation is captured by the standard deviation of the pixel-intensity distribution. The clips are re-formed when a boundary is detected. Finally, the skim is generated by concatenating the extracted clips in sequential order. The resulting summaries are concise and properly represent the input videos. Experiments on two benchmark datasets, SumMe and TVSum, show that the proposed method outperforms state-of-the-art methods.
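The clip-level shot-boundary check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: each frame is summarized by the standard deviation of its pixel-intensity distribution, and an abrupt jump in that statistic between consecutive frames is flagged as a shot boundary. The jump test and the `threshold` value are assumptions for illustration; the paper does not specify them.

```python
import numpy as np

def frame_stat(frame: np.ndarray) -> float:
    """Standard deviation of the frame's grayscale pixel-intensity distribution."""
    return float(np.std(frame))

def find_shot_boundary(clip, threshold: float = 20.0):
    """Return the index of the first frame whose intensity-spread difference
    from the previous frame exceeds `threshold` (an assumed jump test),
    or None if the clip appears to contain a single shot."""
    stats = [frame_stat(f) for f in clip]
    for i in range(1, len(stats)):
        if abs(stats[i] - stats[i - 1]) > threshold:
            return i  # boundary lies between frame i-1 and frame i
    return None

# Toy clip: 5 flat (zero-variance) frames followed by 5 high-contrast frames,
# mimicking a hard cut between two visually different shots.
rng = np.random.default_rng(0)
flat = [np.full((48, 64), 128, dtype=np.uint8) for _ in range(5)]
busy = [rng.integers(0, 256, size=(48, 64), dtype=np.uint8) for _ in range(5)]
clip = flat + busy
print(find_shot_boundary(clip))  # -> 5, the cut position
```

When a boundary is found, the scheme re-forms the clip around the representative frame so that the final skim does not include frames from two different shots in one clip.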