{"title":"Interclass visual similarity based visual vocabulary learning","authors":"Guangming Chang, Chunfen Yuan, Weiming Hu","doi":"10.1109/ACPR.2011.6166597","DOIUrl":null,"url":null,"abstract":"Visual vocabulary is now widely used in many video analysis tasks, such as event detection, video retrieval and video classification. In most approaches the vocabularies are solely based on statistics of visual features and generated by clustering. Little attention has been paid to the interclass similarity among different events or actions. In this paper, we present a novel approach to mine the interclass visual similarity statistically and then use it to supervise the generation of visual vocabulary. We construct a measurement of interclass similarity, embed the similarity to the Euclidean distance and use the refined distance to generate visual vocabulary iteratively. The experiments in Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary based approach by about 5%.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The First Asian Conference on Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACPR.2011.6166597","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Visual vocabularies are now widely used in many video analysis tasks, such as event detection, video retrieval, and video classification. In most approaches, the vocabulary is built solely from statistics of visual features and generated by clustering; little attention has been paid to the interclass similarity among different events or actions. In this paper, we present a novel approach that statistically mines interclass visual similarity and then uses it to supervise the generation of the visual vocabulary. We construct a measure of interclass similarity, embed it into the Euclidean distance, and use the refined distance to generate the visual vocabulary iteratively. Experiments on the Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary-based approach by about 5%.
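The abstract describes the pipeline only at a high level (measure interclass similarity, fold it into the Euclidean distance, cluster iteratively) and gives no formulas. The sketch below is a minimal illustration of that idea, not the paper's actual method: the scaling form `d * (1 - alpha * sim)`, the weight `alpha`, the association of each visual word with a majority class, and all function names are assumptions introduced here for illustration.

```python
import numpy as np

def refined_distance(x, centers, center_labels, x_label, interclass_sim, alpha=0.5):
    """Euclidean distance from feature x to each cluster center, shrunk when
    x's class is similar to the class associated with that center.
    NOTE: this refinement rule is a hypothetical stand-in for the paper's measure."""
    d = np.linalg.norm(centers - x, axis=1)       # plain Euclidean distance to each center
    sim = interclass_sim[x_label, center_labels]  # assumed similarity values in [0, 1]
    return d * (1.0 - alpha * sim)                # refined distance (assumed form)

def build_vocabulary(features, labels, interclass_sim, k=200, iters=10, seed=0):
    """Iterative k-means-style visual-vocabulary generation under the refined
    distance. `features` is (n, d) float, `labels` is (n,) int class labels,
    `interclass_sim` is a (num_classes, num_classes) similarity matrix."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(features), k, replace=False)
    centers = features[idx].copy()        # initial visual words
    center_labels = labels[idx].copy()    # class tentatively associated with each word
    for _ in range(iters):
        # assign each feature to its nearest visual word under the refined distance
        assign = np.array([
            np.argmin(refined_distance(x, centers, center_labels, l, interclass_sim))
            for x, l in zip(features, labels)
        ])
        for j in range(k):
            members = assign == j
            if members.any():
                centers[j] = features[members].mean(axis=0)            # update the word
                center_labels[j] = np.bincount(labels[members]).argmax()  # majority class
    return centers
```

With `alpha = 0`, the sketch reduces to ordinary k-means vocabulary generation, which matches the traditional baseline the abstract compares against.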