{"title":"基于显著性的视频视觉场景语义相似度计算","authors":"Wei Wei, Tianyun Yan, Yuan-Mao Zhang","doi":"10.1109/ICMLC.2010.5580490","DOIUrl":null,"url":null,"abstract":"Based on saliency region representation of visual scene, a framework for quantifying the semantic similarity of two video scenes is proposed in this paper. Frame-segment key-frame strategy is used to concisely represent video content in temporal domain. Spatio-temporal conspicuity model for basic visual semantics, a neuromorphic model that simulates human visual system, is used to select dynamic and static spatial salient areas. With pattern classification technique, the basic visual semantics are recognized. Then, the similarity of two visual scenes is calculated according to information theoretic similarity principles and Tversky's set-theoretic similarity. Experiment results demonstrate the framework could compute quantitative semantic similarity of two video scenes.","PeriodicalId":126080,"journal":{"name":"2010 International Conference on Machine Learning and Cybernetics","volume":"47 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Conspicuity-based visual scene semantic similarity computing for video\",\"authors\":\"Wei Wei, Tianyun Yan, Yuan-Mao Zhang\",\"doi\":\"10.1109/ICMLC.2010.5580490\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Based on saliency region representation of visual scene, a framework for quantifying the semantic similarity of two video scenes is proposed in this paper. Frame-segment key-frame strategy is used to concisely represent video content in temporal domain. Spatio-temporal conspicuity model for basic visual semantics, a neuromorphic model that simulates human visual system, is used to select dynamic and static spatial salient areas. With pattern classification technique, the basic visual semantics are recognized. Then, the similarity of two visual scenes is calculated according to information theoretic similarity principles and Tversky's set-theoretic similarity. Experiment results demonstrate the framework could compute quantitative semantic similarity of two video scenes.\",\"PeriodicalId\":126080,\"journal\":{\"name\":\"2010 International Conference on Machine Learning and Cybernetics\",\"volume\":\"47 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-07-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 International Conference on Machine Learning and Cybernetics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLC.2010.5580490\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 International Conference on Machine Learning and Cybernetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC.2010.5580490","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Conspicuity-based visual scene semantic similarity computing for video
Based on a saliency-region representation of the visual scene, this paper proposes a framework for quantifying the semantic similarity of two video scenes. A frame-segment key-frame strategy is used to concisely represent video content in the temporal domain. A spatio-temporal conspicuity model for basic visual semantics, a neuromorphic model that simulates the human visual system, is used to select dynamic and static spatially salient areas. The basic visual semantics are then recognized with a pattern classification technique, and the similarity of the two visual scenes is calculated according to information-theoretic similarity principles and Tversky's set-theoretic similarity. Experimental results demonstrate that the framework can compute a quantitative semantic similarity between two video scenes.
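The abstract names Tversky's set-theoretic similarity but does not give the exact formulation or weights the authors use. As a minimal sketch, assuming each scene is reduced to a set of recognized semantic labels (the labels, weights, and function name below are illustrative, not the paper's), the standard Tversky index might be computed like this:

```python
# Hedged sketch, not the authors' implementation: the standard Tversky index
#   S(A, B) = |A ∩ B| / (|A ∩ B| + alpha*|A - B| + beta*|B - A|)
# applied to the sets of basic visual semantics recognized in two scenes.

def tversky_similarity(scene_a: set, scene_b: set,
                       alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky similarity of two sets of recognized semantic labels.

    alpha = beta = 0.5 reduces to the Dice coefficient; the weights the
    paper actually uses are not stated in the abstract.
    """
    common = len(scene_a & scene_b)      # semantics shared by both scenes
    only_a = len(scene_a - scene_b)      # semantics unique to scene A
    only_b = len(scene_b - scene_a)      # semantics unique to scene B
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0

# Example: hypothetical labels recognized from salient regions of two scenes
print(tversky_similarity({"car", "person", "road"}, {"car", "road", "tree"}))
```

Asymmetric weights (alpha != beta) would make the measure directional, which fits Tversky's original feature-contrast account; the symmetric default shown here is only a convenient baseline.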