{"title":"A multi-scale generative model for animate shapes and parts","authors":"A. Dubinskiy, Song-Chun Zhu","doi":"10.1109/ICCV.2003.1238350","DOIUrl":null,"url":null,"abstract":"We present a multiscale generative model for representing animate shapes and extracting meaningful parts of objects. The model assumes that animate shapes (2D simple dosed curves) are formed by a linear superposition of a number of shape bases. These shape bases resemble the multiscale Gabor bases in image pyramid representation, are well localized in both spatial and frequency domains, and form an over-complete dictionary. This model is simpler than the popular B-spline representation since it does not engage a domain partition. Thus it eliminates the interference between adjacent B-spline bases, and becomes a true linear additive model. We pursue the bases by reconstructing the shape in a coarse-to-fine procedure through curve evolution. These shape bases are further organized in a tree-structure, where the bases in each subtree sum up to an intuitive part of the object. To build probabilistic model for a class of objects, we propose a Markov random field model at each level of the tree representation to account for the spatial relationship between bases. Thus the final model integrates a Markov tree (generative) model over scales and a Markov random field over space. We adopt EM-type algorithm for learning the meaningful parts for a shape class, and show some results on shape synthesis.","PeriodicalId":131580,"journal":{"name":"Proceedings Ninth IEEE International Conference on Computer Vision","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Ninth IEEE International Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2003.1238350","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 22
Abstract
We present a multiscale generative model for representing animate shapes and extracting meaningful parts of objects. The model assumes that animate shapes (2D simple closed curves) are formed by a linear superposition of a number of shape bases. These shape bases resemble the multiscale Gabor bases in image pyramid representation, are well localized in both the spatial and frequency domains, and form an over-complete dictionary. This model is simpler than the popular B-spline representation since it does not require a domain partition. It thus eliminates the interference between adjacent B-spline bases and becomes a true linear additive model. We pursue the bases by reconstructing the shape in a coarse-to-fine procedure through curve evolution. These shape bases are further organized in a tree structure, where the bases in each subtree sum up to an intuitive part of the object. To build a probabilistic model for a class of objects, we propose a Markov random field model at each level of the tree representation to account for the spatial relationships between bases. The final model thus integrates a Markov tree (generative) model over scales and a Markov random field over space. We adopt an EM-type algorithm for learning the meaningful parts of a shape class, and show some results on shape synthesis.
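For intuition, the following is a minimal sketch (not the authors' implementation) of the linear additive idea in the abstract: a closed 2D curve is written as a superposition of Gabor-like bases localized along the curve parameter. The basis pursuit here is plain greedy matching pursuit, a stand-in for the paper's coarse-to-fine pursuit via curve evolution; the dictionary parameters and the toy target shape are invented for illustration.

```python
# Illustrative sketch only: a closed 2D curve approximated as a linear
# superposition of Gabor-like bases localized along the arclength
# parameter s in [0, 1).  Greedy matching pursuit is used here as a
# simple stand-in for the paper's coarse-to-fine pursuit.
import numpy as np

N = 256                              # samples along the curve parameter
s = np.arange(N) / N                 # normalized arclength parameter

def gabor_basis(center, scale, freq):
    """A periodic, windowed sinusoid: localized in s and in frequency."""
    d = (s - center + 0.5) % 1.0 - 0.5          # circular distance to center
    window = np.exp(-0.5 * (d / scale) ** 2)    # Gaussian window (spatial locality)
    g = window * np.cos(2 * np.pi * freq * d)   # oscillation (frequency tuning)
    return g / np.linalg.norm(g)

# Over-complete dictionary: bases at several scales, positions, and frequencies
# (these particular scale/frequency pairs are assumptions, not from the paper).
dictionary = [gabor_basis(c, sc, f)
              for sc, f in [(0.25, 1), (0.12, 2), (0.06, 4), (0.03, 8)]
              for c in np.linspace(0, 1, int(1 / sc), endpoint=False)]
D = np.stack(dictionary)                         # (K, N)

# Toy "animate" shape: a closed curve stored as x(s), y(s).
target = np.stack([np.cos(2*np.pi*s) * (1 + 0.3*np.sin(6*np.pi*s)),
                   np.sin(2*np.pi*s) * (1 + 0.3*np.sin(6*np.pi*s))])  # (2, N)

# Greedy pursuit: repeatedly pick the basis that best explains the residual.
residual = target - target.mean(axis=1, keepdims=True)
chosen = []                                      # (basis index, 2D coefficient) pairs
for _ in range(20):
    coeffs = residual @ D.T                      # (2, K) projections onto unit-norm bases
    k = int(np.argmax(np.linalg.norm(coeffs, axis=0)))
    a_k = coeffs[:, k]                           # 2D coefficient vector for basis k
    residual = residual - np.outer(a_k, D[k])    # subtract that basis's contribution
    chosen.append((k, a_k))

print(f"{len(chosen)} bases selected, residual norm {np.linalg.norm(residual):.4f}")
```

Because each basis is well localized in s, a subtree of selected bases covering a contiguous stretch of the parameter naturally sums to a local deformation of the curve, which is the intuition behind treating subtrees as object parts.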