Holistic context modeling using semantic co-occurrences

Nikhil Rasiwasia, N. Vasconcelos
DOI: 10.1109/CVPR.2009.5206826
Published in: 2009 IEEE Conference on Computer Vision and Pattern Recognition
Publication date: 2009-06-20
Citations: 49

Abstract

We present a simple framework to model contextual relationships between visual concepts. The new framework combines ideas from previous object-centric methods (which model contextual relationships between objects in an image, such as their co-occurrence patterns) and scene-centric methods (which learn a holistic context model from the entire image, known as its “gist”). This is accomplished without demarcating individual concepts or regions in the image. First, using the output of a generic appearance based concept detection system, a semantic space is formulated, where each axis represents a semantic feature. Next, context models are learned for each of the concepts in the semantic space, using mixtures of Dirichlet distributions. Finally, an image is represented as a vector of posterior concept probabilities under these contextual concept models. It is shown that these posterior probabilities are remarkably noise-free, and an effective model of the contextual relationships between semantic concepts in natural images. This is further demonstrated through an experimental evaluation with respect to two vision tasks, viz. scene classification and image annotation, on benchmark datasets. The results show that, besides quite simple to compute, the proposed context models attain superior performance than state of the art systems in both tasks.
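The pipeline described above — a semantic multinomial from concept-detector outputs, per-concept context models given by mixtures of Dirichlet distributions, and an image represented by its vector of posterior concept probabilities — can be sketched as follows. This is only an illustrative toy, not the authors' implementation: the concept names, mixture parameters, and detector output are all hypothetical, and uniform concept priors are assumed.

```python
import numpy as np
from scipy.stats import dirichlet

# Hypothetical semantic multinomial for one image: the appearance-based
# concept detector's outputs, normalized over K = 3 semantic features.
pi = np.array([0.6, 0.3, 0.1])

# Hypothetical context models: for each concept, a mixture of Dirichlet
# distributions over the semantic simplex, as (weights, alpha vectors).
context_models = {
    "street": ([0.5, 0.5], [np.array([8.0, 2.0, 1.0]),
                            np.array([5.0, 4.0, 1.0])]),
    "beach":  ([1.0],      [np.array([1.0, 2.0, 9.0])]),
}

def mixture_likelihood(x, weights, alphas):
    """Likelihood of a semantic multinomial x under a Dirichlet mixture."""
    return sum(w * dirichlet.pdf(x, a) for w, a in zip(weights, alphas))

# Posterior concept probabilities under the contextual models
# (uniform priors assumed, so likelihoods are simply renormalized).
likelihoods = {c: mixture_likelihood(pi, w, a)
               for c, (w, a) in context_models.items()}
total = sum(likelihoods.values())
posterior = {c: lk / total for c, lk in likelihoods.items()}

# The image is then represented by this vector of posteriors, which the
# paper reports to be far less noisy than the raw detector outputs.
```

Since `pi` concentrates mass on the first semantic feature, which the hypothetical "street" mixture favors, its posterior dominates that of "beach" in this toy setup.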