Kaiming Li, Lei Guo, C. Faraco, Dajiang Zhu, Fan Deng, Tuo Zhang, Xi Jiang, Degang Zhang, Hanbo Chen, Xintao Hu, L. Miller, Tianming Liu
Human-centered attention models for video summarization
ICMI-MLMI '10, published 2010-11-08
DOI: 10.1145/1891903.1891938 (https://doi.org/10.1145/1891903.1891938)
Citation count: 6
Abstract
A variety of user attention models for video/audio streams have been developed for video summarization and abstraction, in order to facilitate efficient video browsing and indexing. Essentially, the human brain is the end user and evaluator of multimedia content and representation, and its responses can provide meaningful guidelines for multimedia stream summarization. For example, video/audio segments that significantly activate the visual, auditory, language and working memory systems of the human brain should be considered more important than others. User experience studies can be useful for such evaluations, but they are suboptimal at accurately capturing the full-length dynamics and interactions of the brain's response. This paper presents our preliminary efforts in applying functional magnetic resonance imaging (fMRI) to quantify and model the dynamics and interactions between multimedia streams and brain responses while human subjects are presented with multimedia clips. The goal is to develop human-centered attention models that can guide and facilitate more effective and efficient multimedia summarization. Our initial results are encouraging.