Understanding scenes on many levels

2011 International Conference on Computer Vision
Pub Date: 2011-11-06
DOI: 10.1109/ICCV.2011.6126260

Joseph Tighe, S. Lazebnik

Citations: 40

Abstract

This paper presents a framework for image parsing with multiple label sets. For example, we may want to simultaneously label every image region according to its basic-level object category (car, building, road, tree, etc.), superordinate category (animal, vehicle, manmade object, natural object, etc.), geometric orientation (horizontal, vertical, etc.), and material (metal, glass, wood, etc.). Some object regions may also be given part names (a car can have wheels, doors, windshield, etc.). We compute co-occurrence statistics between different label types of the same region to capture relationships such as “roads are horizontal,” “cars are made of metal,” “cars have wheels” but “horses have legs,” and so on. By incorporating these constraints into a Markov Random Field inference framework and jointly solving for all the label sets, we are able to improve the classification accuracy for all the label sets at once, achieving a richer form of image understanding.
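The co-occurrence idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the penalty is the negative log of a smoothed conditional probability, works on a single region, and leaves out the spatial smoothing that a full Markov Random Field would include. The function names (`cooccurrence_penalty`, `best_joint_labels`), the `eps` and `weight` parameters, and the toy annotations and costs are all hypothetical, invented for illustration.

```python
import math
from collections import Counter


def cooccurrence_penalty(pairs, eps=1e-3):
    """Build a penalty function from (label_a, label_b) pairs, one pair per region.

    Assumption: penalty(a, b) = -log(P(b | a) + eps), so label pairs that
    rarely co-occur on the same region receive a high cost.
    """
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)

    def penalty(a, b):
        # Counter returns 0 for unseen keys; eps keeps the log finite.
        total = first_counts[a]
        p = pair_counts[(a, b)] / total if total else 0.0
        return -math.log(p + eps)

    return penalty


def best_joint_labels(cost_a, cost_b, penalty, weight=1.0):
    """Pick the (a, b) pair minimizing both unary costs plus the cross-label penalty.

    Exhaustive search over all pairs for one region; a stand-in for joint
    inference, not a full MRF solver.
    """
    return min(
        ((a, b) for a in cost_a for b in cost_b),
        key=lambda ab: cost_a[ab[0]] + cost_b[ab[1]] + weight * penalty(*ab),
    )


# Toy training annotations (invented): object label and geometric label per region.
pairs = (
    [("road", "horizontal")] * 9
    + [("road", "vertical")]
    + [("building", "vertical")] * 5
)
penalty = cooccurrence_penalty(pairs)

# Hypothetical per-region classifier costs (lower is better).
cost_object = {"road": 0.2, "building": 0.9}
cost_geometry = {"horizontal": 0.7, "vertical": 0.5}

independent = (min(cost_object, key=cost_object.get),
               min(cost_geometry, key=cost_geometry.get))
joint = best_joint_labels(cost_object, cost_geometry, penalty)
```

In this toy setup, labeling each set independently picks "road" and "vertical", while the joint choice moves to "road" and "horizontal" because the learned penalty discourages vertical roads. That is the kind of correction the abstract attributes to solving for all label sets together.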
