Video Summarization with Global and Local Features
Genliang Guan, Zhiyong Wang, Kaimin Yu, Shaohui Mei, Mingyi He, D. Feng
2012 IEEE International Conference on Multimedia and Expo Workshops, 2012-07-09
DOI: 10.1109/ICMEW.2012.105 (https://doi.org/10.1109/ICMEW.2012.105)
Citations: 33
Abstract
Video summarization has become crucial for effective and efficient access to video content, given the ever-increasing amount of video data. Most existing key-frame-based summarization approaches represent individual frames with global features, neglecting the local details of visual content. Considering that a video generally depicts a story through a number of scenes captured in different temporal orders and from different shooting angles, we formulate scene summarization as identifying the set of frames that best covers the key point pool constructed from the scene. Our approach is therefore a two-step process: identifying scenes and selecting representative content for each scene. Global features are utilized to identify scenes through clustering, exploiting the visual similarity among video frames of the same scene, while local features are used to summarize each scene. We develop a key-point-based key frame selection method to identify the representative content of a scene, which allows users to flexibly tune the summary length. Our preliminary results indicate that the proposed approach is very promising and potentially robust to errors in clustering-based scene identification.
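The second step described above, selecting frames whose local key points best cover a scene's key point pool, can be read as a set-cover problem. The sketch below illustrates one plausible greedy realization under assumptions not spelled out in the abstract: each frame is represented as a set of quantized local key point IDs (e.g. matched SIFT descriptors), and frames are added one at a time by largest coverage gain until the user's summary-length budget is reached. The function name and data layout are hypothetical, not the authors' actual implementation.

```python
def select_key_frames(frame_keypoints, max_frames):
    """Greedily pick up to max_frames frames covering the scene's key point pool.

    frame_keypoints: list of sets, one set of quantized key point IDs per frame
                     (a hypothetical representation of matched local features).
    Returns the indices of selected frames, in selection order.
    """
    covered = set()    # key points covered by the summary so far
    selected = []
    for _ in range(max_frames):
        # Pick the frame contributing the most not-yet-covered key points.
        best = max(range(len(frame_keypoints)),
                   key=lambda i: len(frame_keypoints[i] - covered))
        gain = frame_keypoints[best] - covered
        if not gain:
            break      # the key point pool is fully covered; stop early
        selected.append(best)
        covered |= gain
    return selected
```

The `max_frames` budget is what gives the user the flexible control over summary length mentioned in the abstract: a larger budget simply lets the greedy loop run longer before stopping.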