Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling

2013 IEEE Conference on Computer Vision and Pattern Recognition | Pub Date: 2013-06-23 | DOI: 10.1109/CVPR.2013.263

Andrew Kae, Kihyuk Sohn, Honglak Lee, E. Learned-Miller


Citations: 186

Abstract



Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling

Conditional random fields (CRFs) provide powerful tools for building models to label image segments. They are particularly well-suited to modeling local interactions among adjacent regions (e.g., superpixels). However, CRFs are limited in dealing with complex, global (long-range) interactions between regions. Complementary to this, restricted Boltzmann machines (RBMs) can be used to model global shapes produced by segmentation models. In this work, we present a new model that uses the combined power of these two network types to build a state-of-the-art labeler. Although the CRF is a good baseline labeler, we show how an RBM can be added to the architecture to provide a global shape bias that complements the local modeling provided by the CRF. We demonstrate its labeling performance for the parts of complex face images from the Labeled Faces in the Wild data set. This hybrid model produces results that are both quantitatively and qualitatively better than the CRF alone. In addition to high-quality labeling results, we demonstrate that the hidden units in the RBM portion of our model can be interpreted as face attributes that have been learned without any attribute-level supervision.
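The abstract gives no formulas, so the following is only a rough sketch of how a local CRF score and a global RBM shape score could be added into one energy over a labeling. It should not be read as the authors' formulation. Every name here (`crf_energy`, `rbm_free_energy`, `hybrid_energy`, and the parameters `unary`, `pairwise`, `W`, `b`, `c`) is hypothetical, the parameters are random rather than learned, and the sketch ignores how a variable number of superpixels would be mapped onto a fixed-size RBM input.

```python
import itertools
import numpy as np

def crf_energy(labels, unary, edges, pairwise):
    """Local terms: per-region label costs plus costs for adjacent label pairs."""
    node_term = unary[np.arange(len(labels)), labels].sum()
    edge_term = sum(pairwise[labels[i], labels[j]] for i, j in edges)
    return node_term + edge_term

def rbm_free_energy(labels, num_classes, W, b, c):
    """Global term: RBM free energy of the one-hot labeling (binary hidden units summed out)."""
    v = np.eye(num_classes)[labels].ravel()   # one-hot visible vector
    hidden_input = c + v @ W                  # one entry per hidden unit
    # F(v) = -b.v - sum_j log(1 + exp(c_j + v.W[:, j]))
    return -(b @ v) - np.logaddexp(0.0, hidden_input).sum()

def hybrid_energy(labels, unary, edges, pairwise, W, b, c):
    """Assumed combination: simply add the local CRF energy and the global RBM term."""
    num_classes = unary.shape[1]
    return (crf_energy(labels, unary, edges, pairwise)
            + rbm_free_energy(labels, num_classes, W, b, c))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_regions, num_classes, num_hidden = 4, 3, 5   # toy sizes, chosen arbitrarily
    unary = rng.normal(size=(num_regions, num_classes))
    pairwise = 1.0 - np.eye(num_classes)             # penalize differing neighbor labels
    edges = [(0, 1), (1, 2), (2, 3)]                 # a chain of adjacent regions
    W = rng.normal(scale=0.5, size=(num_regions * num_classes, num_hidden))
    b = np.zeros(num_regions * num_classes)
    c = np.zeros(num_hidden)

    # Brute-force search is only feasible for toy sizes like this one.
    candidates = (np.array(l) for l in
                  itertools.product(range(num_classes), repeat=num_regions))
    best = min(candidates,
               key=lambda l: hybrid_energy(l, unary, edges, pairwise, W, b, c))
    print("lowest-energy labeling:", best)
```

In this toy setup the pairwise term only sees neighboring regions, while the RBM term depends on the whole labeling at once, which is the kind of division of labor the abstract describes. Exhaustive search stands in for real inference here; a practical system would need an approximate inference procedure.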
