Estimating mixture models of images and inferring spatial transformations using the EM algorithm

B. Frey, N. Jojic
Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149), Vol. 1, pp. 416-422
Published: 1999-06-23
DOI: 10.1109/CVPR.1999.786972 (https://doi.org/10.1109/CVPR.1999.786972)
Citations: 94

Abstract

Mixture modeling and clustering algorithms are effective, simple ways to represent images using a set of data centers. However, in situations where the images include background clutter and transformations such as translation, rotation, shearing and warping, these methods extract data centers that include clutter and represent different transformations of essentially the same data. Taking face images as an example, it would be more useful for the different clusters to represent different poses and expressions, instead of cluttered versions of different translations, scales and rotations. By including clutter and transformation as unobserved, latent variables in a mixture model, we obtain a new "transformed mixture of Gaussians", which is invariant to a specified set of transformations. We show how a linear-time EM algorithm can be used to fit this model by jointly estimating a mixture model for the data and inferring the transformation for each image. We show that this algorithm can jointly align images of a human head and learn different poses. We also find that the algorithm performs better than k-nearest neighbors and mixtures of Gaussians on handwritten digit recognition.
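
The idea of treating the transformation as a latent variable inside EM can be illustrated with a deliberately simplified sketch. It is not the paper's full model: it assumes 1-D signals instead of images, cyclic shifts as the only transformation set, a uniform prior over shifts, a fixed isotropic noise level `sigma`, and no clutter or per-cluster covariance. The function names `e_step`, `m_step`, and `fit`, and the toy data, are hypothetical choices made for this illustration.

```python
import numpy as np


def e_step(X, mu, pi, sigma):
    """Posterior over (cluster, shift) for each signal, plus the log-likelihood."""
    N, D = X.shape
    # shifted[c, l] is prototype c cyclically shifted by l positions.
    shifted = np.stack([np.roll(mu, l, axis=1) for l in range(D)], axis=1)
    # Squared distance of every signal to every shifted prototype: (N, C, D).
    d2 = ((X[:, None, None, :] - shifted[None]) ** 2).sum(-1)
    # Log joint of signal, cluster and shift (uniform shift prior 1/D).
    log_joint = (np.log(pi)[None, :, None] - np.log(D)
                 - 0.5 * d2 / sigma**2
                 - 0.5 * D * np.log(2 * np.pi * sigma**2))
    # Log-sum-exp over (cluster, shift) for numerical stability.
    m = log_joint.max(axis=(1, 2), keepdims=True)
    log_norm = m + np.log(np.exp(log_joint - m).sum(axis=(1, 2), keepdims=True))
    resp = np.exp(log_joint - log_norm)
    return resp, float(log_norm.sum())


def m_step(X, resp):
    """Re-estimate prototypes and mixing proportions from the posteriors."""
    N, C, D = resp.shape
    weight = resp.sum(axis=(0, 2))  # expected number of signals per cluster
    pi = weight / N
    mu = np.zeros((C, D))
    for l in range(D):
        # Undo shift l so each signal is aligned with the prototype frame.
        mu += resp[:, :, l].T @ np.roll(X, -l, axis=1)
    mu /= weight[:, None]
    return mu, pi


def fit(X, n_clusters, sigma=0.5, n_iter=20, seed=0):
    """Run EM; sigma is held fixed (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    # Initialise prototypes from randomly chosen signals.
    mu = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    pi = np.full(n_clusters, 1.0 / n_clusters)
    history = []
    for _ in range(n_iter):
        resp, ll = e_step(X, mu, pi, sigma)
        history.append(ll)
        mu, pi = m_step(X, resp)
    return mu, pi, resp, history


# Toy data: two prototypes, each observed under a random cyclic shift plus noise.
rng = np.random.default_rng(0)
protos = np.array([[3, 1, 0, 0, 0, 0, 0, 0],
                   [0, 0, 2, 2, 2, 0, 0, 0]], dtype=float)
labels = rng.integers(0, 2, size=60)
shifts = rng.integers(0, 8, size=60)
X = np.stack([np.roll(protos[c], s) for c, s in zip(labels, shifts)])
X += 0.1 * rng.normal(size=X.shape)

mu, pi, resp, history = fit(X, n_clusters=2)
```

Here `resp[n, c, l]` is the posterior probability that signal `n` came from cluster `c` under shift `l`, so `resp[n].sum(axis=0)` gives the inferred shift distribution and `resp[n].sum(axis=1)` the cluster assignment. Because shifts are averaged out after alignment in `m_step`, the learned prototypes are recovered only up to a global shift, and a poor initialisation can still leave EM in a local optimum.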