Automatic video summarization by graph modeling
C. Ngo, Yu-Fei Ma, HongJiang Zhang
Proceedings Ninth IEEE International Conference on Computer Vision, 2003-10-13
DOI: https://doi.org/10.1109/ICCV.2003.1238320
Citations: 155
Abstract
We propose a unified approach to video summarization based on the analysis of video structure and video highlights. The approach emphasizes both the content balance and the perceptual quality of a summary. The normalized cut algorithm is employed to partition a video into clusters globally and optimally. A motion attention model based on human perception is employed to compute the perceptual quality of shots and clusters. The clusters, together with the computed attention values, form a temporal graph, similar to a Markov chain, that inherently describes the evolution and perceptual importance of the video clusters. In our application, the flow of the temporal graph is used to group similar clusters into scenes, while the attention values serve as guidelines for selecting appropriate subshots in scenes for summarization.
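The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify the shot similarity measure, the motion attention model, or the exact graph construction. The code below assumes a two-way split using the standard spectral relaxation of normalized cut, a temporal graph built from transition counts between consecutive shots, and a simple "highest attention per cluster" selection rule. The function names, the toy one-dimensional shot features, the Gaussian bandwidth, and the attention scores are all hypothetical.

```python
import numpy as np


def normalized_cut_bipartition(W):
    """Two-way split via the spectral relaxation of normalized cut.

    W: symmetric, non-negative similarity matrix with positive row sums.
    Returns an array of 0/1 labels, one per node (shot).
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    # Map the second-smallest eigenvector back to (D - W) y = lambda * D y
    y = d_inv_sqrt * eigvecs[:, 1]
    return (y > 0).astype(int)  # threshold at zero (a common, simple choice)


def temporal_transition_graph(labels, n_clusters):
    """Row-normalized counts of cluster changes between consecutive shots."""
    counts = np.zeros((n_clusters, n_clusters))
    for a, b in zip(labels[:-1], labels[1:]):
        if a != b:  # only record moves from one cluster to a different one
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transitions stay all-zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def pick_highlights(labels, attention):
    """For each cluster, return the index of its highest-attention shot."""
    picks = {}
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        picks[int(c)] = int(members[np.argmax(attention[members])])
    return picks


# Hypothetical toy data: six shots in temporal order, one feature value each.
features = np.array([0.0, 0.1, 1.0, 1.1, 0.05, 0.95])
diff = features[:, None] - features[None, :]
W = np.exp(-(diff ** 2) / (2 * 0.2 ** 2))  # Gaussian similarity, assumed bandwidth 0.2

labels = normalized_cut_bipartition(W)
P = temporal_transition_graph(labels, n_clusters=2)

# Hypothetical per-shot attention scores standing in for the motion attention model.
attention = np.array([0.2, 0.9, 0.4, 0.1, 0.5, 0.8])
picks = pick_highlights(labels, attention)
```

In this toy setup the shots with features near 0 and the shots with features near 1 fall into separate clusters, `P` holds the transition weights between the two clusters, and `picks` maps each cluster to the shot that would be kept in a summary. Which cluster receives label 0 depends on the sign of the eigenvector, so downstream code should not rely on specific label values.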