{"title":"Multi-task Recurrent Neural Network for Immediacy Prediction","authors":"Xiao Chu, Wanli Ouyang, Wei Yang, Xiaogang Wang","doi":"10.1109/ICCV.2015.383","DOIUrl":null,"url":null,"abstract":"In this paper, we propose to predict immediacy for interacting persons from still images. A complete immediacy set includes interactions, relative distance, body leaning direction and standing orientation. These measures are found to be related to the attitude, social relationship, social interaction, action, nationality, and religion of the communicators. A large-scale dataset with 10,000 images is constructed, in which all the immediacy measures and the human poses are annotated. We propose a rich set of immediacy representations that help to predict immediacy from imperfect 1-person and 2-person pose estimation results. A multi-task deep recurrent neural network is constructed to take the proposed rich immediacy representation as input and learn the complex relationship among immediacy predictions multiple steps of refinement. The effectiveness of the proposed approach is proved through extensive experiments on the large scale dataset.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"93 1","pages":"3352-3360"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"47","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Computer Vision (ICCV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2015.383","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 47
Abstract
In this paper, we propose to predict immediacy for interacting persons from still images. A complete immediacy set includes interactions, relative distance, body leaning direction and standing orientation. These measures are found to be related to the attitude, social relationship, social interaction, action, nationality, and religion of the communicators. A large-scale dataset with 10,000 images is constructed, in which all the immediacy measures and the human poses are annotated. We propose a rich set of immediacy representations that help to predict immediacy from imperfect 1-person and 2-person pose estimation results. A multi-task deep recurrent neural network is constructed to take the proposed rich immediacy representation as input and learn the complex relationship among immediacy predictions multiple steps of refinement. The effectiveness of the proposed approach is proved through extensive experiments on the large scale dataset.