Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling

Andrew Kae, Kihyuk Sohn, Honglak Lee, E. Learned-Miller
2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2019-2026. Published 2013-06-23.
DOI: 10.1109/CVPR.2013.263 (https://doi.org/10.1109/CVPR.2013.263)
Citations: 186

Abstract

Conditional random fields (CRFs) provide powerful tools for building models to label image segments. They are particularly well-suited to modeling local interactions among adjacent regions (e.g., super pixels). However, CRFs are limited in dealing with complex, global (long-range) interactions between regions. Complementary to this, restricted Boltzmann machines (RBMs) can be used to model global shapes produced by segmentation models. In this work, we present a new model that uses the combined power of these two network types to build a state-of-the-art labeler. Although the CRF is a good baseline labeler, we show how an RBM can be added to the architecture to provide a global shape bias that complements the local modeling provided by the CRF. We demonstrate its labeling performance for the parts of complex face images from the Labeled Faces in the Wild data set. This hybrid model produces results that are both quantitatively and qualitatively better than the CRF alone. In addition to high-quality labeling results, we demonstrate that the hidden units in the RBM portion of our model can be interpreted as face attributes that have been learned without any attribute-level supervision.
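The idea of adding a global shape bias to a local labeler can be illustrated with a toy energy function. The sketch below is not the authors' implementation: it assumes a Potts-style pairwise CRF term, a binary RBM applied directly to one-hot region labels, and hypothetical names (`crf_energy`, `rbm_free_energy`, `hybrid_energy`) and parameter values chosen only for illustration. Lower energy means a more probable labeling.

```python
import math


def crf_energy(labels, unary, edges, pairwise_weight):
    """Local terms: per-region unary costs plus a Potts penalty
    for each pair of adjacent regions with different labels."""
    energy = sum(unary[i][labels[i]] for i in range(len(labels)))
    energy += pairwise_weight * sum(1 for i, j in edges if labels[i] != labels[j])
    return energy


def rbm_free_energy(visible, weights, hidden_bias, visible_bias):
    """Free energy of a binary RBM with the hidden units summed out:
    F(v) = -sum_i c_i v_i - sum_k log(1 + exp(b_k + sum_i W[i][k] v_i))."""
    free = -sum(c * v for c, v in zip(visible_bias, visible))
    for k, b in enumerate(hidden_bias):
        activation = b + sum(weights[i][k] * v for i, v in enumerate(visible))
        free -= math.log1p(math.exp(activation))
    return free


def hybrid_energy(labels, num_labels, unary, edges, pairwise_weight,
                  weights, hidden_bias, visible_bias):
    """CRF energy plus an RBM shape term over the one-hot encoded labeling."""
    # Region i with label l maps to visible unit i * num_labels + l.
    visible = [1.0 if labels[i] == l else 0.0
               for i in range(len(labels)) for l in range(num_labels)]
    return (crf_energy(labels, unary, edges, pairwise_weight)
            + rbm_free_energy(visible, weights, hidden_bias, visible_bias))


# Toy problem (all values are assumptions): two adjacent regions, two labels.
unary = [[0.0, 1.0], [1.0, 0.0]]   # local evidence prefers labels [0, 1]
edges = [(0, 1)]
pairwise_weight = 0.5
# One hidden unit that responds when both regions take label 0.
weights = [[2.0], [0.0], [2.0], [0.0]]
hidden_bias = [-2.0]
visible_bias = [0.0, 0.0, 0.0, 0.0]

for candidate in ([0, 1], [0, 0]):
    local = crf_energy(candidate, unary, edges, pairwise_weight)
    total = hybrid_energy(candidate, 2, unary, edges, pairwise_weight,
                          weights, hidden_bias, visible_bias)
    print(candidate, "CRF only:", round(local, 4), "CRF + RBM:", round(total, 4))
```

In this toy setup the CRF terms alone favor the labeling `[0, 1]`, while the added RBM term shifts the preference to `[0, 0]`, which is the kind of global correction the abstract attributes to the shape prior. A real model would learn the parameters from data and would need an inference procedure rather than enumeration of candidates.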