人体属性、部位和姿势联合估计的属性语法

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI:10.1109/ICCV.2015.273

Seyoung Park, Song-Chun Zhu

{"title":"人体属性、部位和姿势联合估计的属性语法","authors":"Seyoung Park, Song-Chun Zhu","doi":"10.1109/ICCV.2015.273","DOIUrl":null,"url":null,"abstract":"In this paper, we are interested in developing compositional models to explicit representing pose, parts and attributes and tackling the tasks of attribute recognition, pose estimation and part localization jointly. This is different from the recent trend of using CNN-based approaches for training and testing on these tasks separately with a large amount of data. Conventional attribute models typically use a large number of region-based attribute classifiers on parts of pre-trained pose estimator without explicitly detecting the object or its parts, or considering the correlations between attributes. In contrast, our approach jointly represents both the object parts and their semantic attributes within a unified compositional hierarchy. We apply our attributed grammar model to the task of human parsing by simultaneously performing part localization and attribute recognition. We show our modeling helps performance improvements on pose-estimation task and also outperforms on other existing methods on attribute prediction task.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"43 1","pages":"2372-2380"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose\",\"authors\":\"Seyoung Park, Song-Chun Zhu\",\"doi\":\"10.1109/ICCV.2015.273\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we are interested in developing compositional models to explicit representing pose, parts and attributes and tackling the tasks of attribute recognition, pose estimation and part localization jointly. This is different from the recent trend of using CNN-based approaches for training and testing on these tasks separately with a large amount of data. Conventional attribute models typically use a large number of region-based attribute classifiers on parts of pre-trained pose estimator without explicitly detecting the object or its parts, or considering the correlations between attributes. In contrast, our approach jointly represents both the object parts and their semantic attributes within a unified compositional hierarchy. We apply our attributed grammar model to the task of human parsing by simultaneously performing part localization and attribute recognition. We show our modeling helps performance improvements on pose-estimation task and also outperforms on other existing methods on attribute prediction task.\",\"PeriodicalId\":6633,\"journal\":{\"name\":\"2015 IEEE International Conference on Computer Vision (ICCV)\",\"volume\":\"43 1\",\"pages\":\"2372-2380\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE International Conference on Computer Vision (ICCV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCV.2015.273\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Computer Vision (ICCV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2015.273","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 23

摘要

在本文中，我们感兴趣的是开发组合模型来显式表示姿态、部件和属性，并共同解决属性识别、姿态估计和部件定位的任务。这与最近使用基于cnn的方法对这些任务分别进行大量数据的训练和测试的趋势不同。传统的属性模型通常在预训练姿态估计器的部分上使用大量基于区域的属性分类器，而没有明确检测物体或其部分，也没有考虑属性之间的相关性。相反，我们的方法在一个统一的组合层次结构中联合表示对象部分及其语义属性。我们通过同时进行部分定位和属性识别，将我们的属性语法模型应用于人工解析任务。我们的模型有助于姿态估计任务的性能改进，并且在属性预测任务上优于其他现有方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose

In this paper, we are interested in developing compositional models to explicit representing pose, parts and attributes and tackling the tasks of attribute recognition, pose estimation and part localization jointly. This is different from the recent trend of using CNN-based approaches for training and testing on these tasks separately with a large amount of data. Conventional attribute models typically use a large number of region-based attribute classifiers on parts of pre-trained pose estimator without explicitly detecting the object or its parts, or considering the correlations between attributes. In contrast, our approach jointly represents both the object parts and their semantic attributes within a unified compositional hierarchy. We apply our attributed grammar model to the task of human parsing by simultaneously performing part localization and attribute recognition. We show our modeling helps performance improvements on pose-estimation task and also outperforms on other existing methods on attribute prediction task.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 IEEE International Conference on Computer Vision (ICCV)

自引率

0.00%

发文量

期刊最新文献

Listening with Your Eyes: Towards a Practical Visual Speech Recognition System Using Deep Boltzmann Machines Self-Calibration of Optical Lenses Single Image Pop-Up from Discriminatively Learned Parts Multi-task Recurrent Neural Network for Immediacy Prediction Low-Rank Tensor Approximation with Laplacian Scale Mixture Modeling for Multiframe Image Denoising