Demo2Vec: Reasoning Object Affordances from Online Videos

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Pub Date: 2018-06-01
DOI: 10.1109/CVPR.2018.00228

Kuan Fang, Te-Lin Wu, Daniel Yang, S. Savarese, Joseph J. Lim

Citations: 69

Abstract

Watching expert demonstrations is an important way for humans and robots to reason about affordances of unseen objects. In this paper, we consider the problem of reasoning object affordances through the feature embedding of demonstration videos. We design the Demo2Vec model which learns to extract embedded vectors of demonstration videos and predicts the interaction region and the action label on a target image of the same object. We introduce the Online Product Review dataset for Affordance (OPRA) by collecting and labeling diverse YouTube product review videos. Our Demo2Vec model outperforms various recurrent neural network baselines on the collected dataset.
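The pipeline described in the abstract (embed a demonstration video, then predict an interaction region and an action label on a target image) can be illustrated with a minimal sketch. This is not the Demo2Vec architecture: mean pooling stands in for the learned video encoder, dot-product scoring stands in for the learned heatmap and action decoders, and every function name, feature value, and action label below is a hypothetical chosen for illustration only.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_demo(frame_features):
    """Mean-pool per-frame feature vectors into one demo embedding.

    Assumption: a simple stand-in for a learned video encoder.
    """
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[d] for f in frame_features) / n for d in range(dim)]


def predict_heatmap(embedding, target_features):
    """Score each location of an H x W feature grid against the embedding.

    Returns an H x W grid of probabilities that sums to 1.
    """
    h, w = len(target_features), len(target_features[0])
    scores = [dot(embedding, target_features[i][j])
              for i in range(h) for j in range(w)]
    probs = softmax(scores)
    return [probs[i * w:(i + 1) * w] for i in range(h)]


def predict_action(embedding, class_weights, labels):
    """Linear classifier over the embedding; returns (label, probabilities)."""
    probs = softmax([dot(embedding, w) for w in class_weights])
    best = max(range(len(labels)), key=lambda k: probs[k])
    return labels[best], probs


# Toy inputs (hypothetical): three frames with 2-D features,
# a 2 x 2 target feature grid, and three made-up action labels.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[[0.0, 0.0], [1.0, 0.0]],
          [[0.0, 1.0], [2.0, 2.0]]]
weights = [[1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
labels = ["hold", "rotate", "push"]

embedding = embed_demo(frames)
heatmap = predict_heatmap(embedding, target)
action, action_probs = predict_action(embedding, weights, labels)
print(action, heatmap)
```

In a trained model the embedding and both predictors would be learned jointly from labeled demonstration videos; the sketch only shows how a single video embedding can condition both outputs on the target image.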


