Semantic Scene Manipulation Based on 3D Spatial Object Relations and Language Instructions

Rainer Kartmann, Danqing Liu, T. Asfour
DOI: 10.1109/HUMANOIDS47582.2021.9555802
Published in: 2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids)
Publication date: 2021-07-19
Citations: 10

Abstract

Robot understanding of spatial object relations is key for a symbiotic human-robot interaction. Understanding the meaning of such relations between objects in a current scene and target relations specified in natural language commands is essential for the generation of robot manipulation action goals to change the scene by relocating objects relative to each other to fulfill the desired spatial relations. This ability requires a representation of spatial relations, which maps spatial relation symbols extracted from language instructions to subsymbolic object goal locations in the world. We present a generative model of static and dynamic 3D spatial relations between multiple reference objects. The model is based on a parametric probability distribution defined in cylindrical coordinates and is learned from examples provided by humans manipulating a scene in the real world. We demonstrate the ability of our representation to generate suitable object goal positions for a pick-and-place task on a humanoid robot, where object relations specified in natural language commands are extracted, object goal positions are determined and used for parametrizing the actions needed to transfer a given scene into a new one that fulfills the specified relations.
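The core idea — representing a spatial relation symbol (e.g. "right of") as a parametric probability distribution over cylindrical coordinates relative to a reference object, fitting it to human-provided placement examples, and sampling object goal positions from it — can be sketched as follows. This is an illustrative assumption-laden sketch, not the paper's actual model: the function names, the per-dimension Gaussian choice, and the example offsets are all hypothetical.

```python
import numpy as np

def to_cylindrical(offsets):
    """Convert Cartesian offsets (N, 3) to cylindrical (r, phi, z)."""
    x, y, z = offsets[:, 0], offsets[:, 1], offsets[:, 2]
    r = np.hypot(x, y)          # radial distance in the horizontal plane
    phi = np.arctan2(y, x)      # azimuth angle around the reference object
    return np.stack([r, phi, z], axis=1)

def fit_relation(example_offsets):
    """Fit per-dimension mean/std from human demonstration offsets.

    A diagonal Gaussian in (r, phi, z) is an illustrative stand-in for
    the parametric distribution described in the abstract.
    """
    cyl = to_cylindrical(np.asarray(example_offsets, dtype=float))
    return cyl.mean(axis=0), cyl.std(axis=0) + 1e-6

def sample_goal(reference_pos, params, rng=None):
    """Sample a subsymbolic object goal position relative to a reference."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mean, std = params
    r, phi, z = rng.normal(mean, std)
    r = max(r, 0.0)  # keep the sampled radius non-negative
    offset = np.array([r * np.cos(phi), r * np.sin(phi), z])
    return np.asarray(reference_pos, dtype=float) + offset

# Hypothetical demonstrations: object placed ~20 cm in the +y direction
# ("right of" the reference), offsets in meters.
demos = [[0.0, 0.20, 0.0], [0.02, 0.22, 0.0], [-0.01, 0.19, 0.01]]
params = fit_relation(demos)
goal = sample_goal([0.5, 0.0, 0.75], params)
```

A sampled `goal` then parametrizes a pick-and-place action: the robot relocates the target object to that position to satisfy the spatial relation named in the language command.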