Video Summarization with Global and Local Features
Genliang Guan, Zhiyong Wang, Kaimin Yu, Shaohui Mei, Mingyi He, D. Feng
2012 IEEE International Conference on Multimedia and Expo Workshops, 2012-07-09
DOI: 10.1109/ICMEW.2012.105 (https://doi.org/10.1109/ICMEW.2012.105)
Citations: 33
Abstract
Video summarization has become crucial for effective and efficient access to video content, given the ever-increasing amount of video data. Most existing key-frame-based summarization approaches represent individual frames with global features, neglecting the local details of visual content. Considering that a video generally depicts a story through a number of scenes captured in different temporal orders and from different shooting angles, we formulate scene summarization as identifying the set of frames that best covers the key point pool constructed from the scene. Our approach is therefore a two-step process: identifying scenes and selecting representative content for each scene. Global features are utilized to identify scenes through clustering, exploiting the visual similarity among video frames of the same scene, while local features are used to summarize each scene. We develop a key-point-based key frame selection method to identify the representative content of a scene, which allows users to flexibly tune the summary length. Our preliminary results indicate that the proposed approach is very promising and potentially robust to errors in clustering-based scene identification.
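The second step described above, selecting frames whose local key points best cover a scene's key point pool, can be read as a set-cover problem. The sketch below illustrates one plausible greedy realization under assumptions not spelled out in the abstract: each frame is represented as a set of quantized local key point IDs (e.g. matched SIFT descriptors), and frames are added one at a time by largest coverage gain until the user's summary-length budget is reached. The function name and data layout are hypothetical, not the authors' actual implementation.

```python
def select_key_frames(frame_keypoints, max_frames):
    """Greedily pick up to max_frames frames covering the scene's key point pool.

    frame_keypoints: list of sets, one set of quantized key point IDs per frame
                     (a hypothetical representation of matched local features).
    Returns the indices of selected frames, in selection order.
    """
    covered = set()    # key points covered by the summary so far
    selected = []
    for _ in range(max_frames):
        # Pick the frame contributing the most not-yet-covered key points.
        best = max(range(len(frame_keypoints)),
                   key=lambda i: len(frame_keypoints[i] - covered))
        gain = frame_keypoints[best] - covered
        if not gain:
            break      # the key point pool is fully covered; stop early
        selected.append(best)
        covered |= gain
    return selected
```

The `max_frames` budget is what gives the user the flexible control over summary length mentioned in the abstract: a larger budget simply lets the greedy loop run longer before stopping.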