Extraction of Semantic Keyframes Based on Visual Attention and Affective Models
Zhicheng Zhao, A. Cai
2007 International Conference on Computational Intelligence and Security (CIS 2007), 2007-12-15
DOI: 10.1109/CIS.2007.9 (https://doi.org/10.1109/CIS.2007.9)
Citations: 6
Abstract
Extracting video keyframes makes video content easier to browse and retrieve. However, because a "keyframe" is a subjective concept that involves both vision and psychology, it is difficult to describe with low-level video features. In this paper, we propose a keyframe extraction method based on visual attention and affective models. Specifically, film elements that are crucial to human attention, such as character, lighting, and camera motion, are fused into a visual attention model, and the film is segmented into scenes according to a short-term memory model. The "scene importance" is then computed from the affective arousal, which determines the audience's excitability in the 2D emotion space. Finally, scene keyframes are extracted according to the attention model and the scene importance. Experimental results indicate that the keyframes extracted by our approach are consistent with human perception and would benefit further semantic analysis.
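The abstract does not give formulas, so the following is only a minimal sketch of how such a pipeline could be wired together, not the authors' method. It assumes that per-frame cue scores, scene boundaries, and per-scene arousal values are already available, that the cues are fused by a simple weighted sum, and that keyframes are allotted to scenes in proportion to arousal. The function names `fuse_attention` and `extract_keyframes`, the `weights` and `budget` parameters, and the proportional allocation rule are all hypothetical.

```python
from typing import List, Sequence, Tuple


def fuse_attention(
    character: Sequence[float],
    lighting: Sequence[float],
    motion: Sequence[float],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> List[float]:
    """Fuse per-frame cue scores into one attention score per frame.

    Assumption: a linear weighted sum. The paper's actual fusion rule
    is not stated in the abstract.
    """
    wc, wl, wm = weights
    return [wc * c + wl * l + wm * m for c, l, m in zip(character, lighting, motion)]


def extract_keyframes(
    attention: Sequence[float],
    scenes: Sequence[Tuple[int, int]],
    arousal: Sequence[float],
    budget: int,
) -> List[int]:
    """Pick keyframe indices scene by scene.

    scenes  : half-open (start, end) frame ranges, assumed precomputed
    arousal : one importance value per scene (hypothetical input)
    budget  : approximate total number of keyframes to distribute
    """
    total = sum(arousal)
    keyframes: List[int] = []
    for (start, end), a in zip(scenes, arousal):
        # Hypothetical rule: share the budget in proportion to arousal,
        # with at least one keyframe per scene and no more than its length.
        share = a / total if total > 0 else 1 / len(scenes)
        k = min(max(1, round(budget * share)), end - start)
        # Within the scene, keep the k frames with the highest attention.
        ranked = sorted(range(start, end), key=lambda i: attention[i], reverse=True)
        keyframes.extend(sorted(ranked[:k]))
    return keyframes


if __name__ == "__main__":
    attention = [0.1, 0.9, 0.2, 0.3, 0.8, 0.7]
    scenes = [(0, 3), (3, 6)]
    arousal = [1.0, 3.0]
    print(extract_keyframes(attention, scenes, arousal, budget=4))
```

In this toy run the second scene has three times the arousal of the first, so it receives three of the four keyframes, while the first scene keeps only its single most attention-grabbing frame.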