Title: Visual grouping and object recognition
Author: Jitendra Malik
Published in: Proceedings 11th International Conference on Image Analysis and Processing
Publication date: 2001-09-26
DOI: 10.1109/ICIAP.2001.957078 (https://doi.org/10.1109/ICIAP.2001.957078)
Citations: 8
Abstract
We develop a two-stage framework for parsing and understanding images. The first stage is image segmentation, which groups pixels into regions of coherent color and texture. The second is recognition, which compares assemblies of such regions, hypothesized to correspond to a single object, with views of stored prototypes. We treat segmentation as an optimization problem: partition the image into regions such that similarity is high within a region and low across regions. This is formalized as minimizing the normalized cut between regions. Using ideas from spectral graph theory, the minimization can be posed as an eigenvalue problem. Visual attributes such as color, texture, contour, and motion are encoded in this framework through a suitable specification of the graph edge weights. The recognition problem requires us to compare assemblies of image regions with previously stored prototypical views of known objects. We have devised a novel shape matching algorithm based on a relationship descriptor called the shape context. It lets us compute similarity measures between shapes which, together with similarity measures for texture and color, can be used for object recognition. The shape matching algorithm has yielded excellent results on a variety of 2D and 3D recognition problems.
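The segmentation step described above can be illustrated with a small sketch. It is not the author's implementation. It assumes the standard spectral relaxation of the normalized cut, in which the second-smallest generalized eigenvector of (D - W) y = lambda D y is thresholded to split the graph in two, where W holds the edge weights and D is the diagonal degree matrix. The function names, the Gaussian intensity weighting, the value of sigma, and the toy six-pixel "image" are all assumptions made for illustration.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Split a weighted graph in two via the relaxed normalized cut.

    W: symmetric (n, n) array of non-negative similarities.
    Returns a boolean array marking one side of the partition.
    Assumes every node has positive degree.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                       # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    # Map the second eigenvector back to the generalized problem:
    # y = D^{-1/2} z solves (D - W) y = lambda D y
    y = d_inv_sqrt * eigvecs[:, 1]
    return y > 0                            # simple threshold at zero (an assumption)

def ncut_value(W, mask):
    """Normalized cut cost: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    W = np.asarray(W, dtype=float)
    cut = W[mask][:, ~mask].sum()
    return cut / W[mask].sum() + cut / W[~mask].sum()

# Toy example (hypothetical data): six "pixels" with two intensity groups.
intensity = np.array([0.10, 0.12, 0.09, 0.90, 0.92, 0.88])
sigma = 0.5                                 # assumed similarity scale
diff = intensity[:, None] - intensity[None, :]
W = np.exp(-(diff ** 2) / sigma ** 2)       # Gaussian similarity on intensity
np.fill_diagonal(W, 0.0)                    # no self-loops

mask = normalized_cut_bipartition(W)
print(mask, ncut_value(W, mask))
```

In this sketch only intensity contributes to the edge weights. Other cues named in the abstract, such as texture, contour, or motion, would enter through a different definition of W, while the eigenvector computation stays the same. The sign of an eigenvector is arbitrary, so which group is marked True can vary.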