基于密集标注双松弛对比学习的开放词汇移动操作

IF 4.6 2区 计算机科学 Q2 ROBOTICS IEEE Robotics and Automation Letters Pub Date : 2024-12-26 DOI:10.1109/LRA.2024.3522841
Daichi Yashima;Ryosuke Korekata;Komei Sugiura
{"title":"基于密集标注双松弛对比学习的开放词汇移动操作","authors":"Daichi Yashima;Ryosuke Korekata;Komei Sugiura","doi":"10.1109/LRA.2024.3522841","DOIUrl":null,"url":null,"abstract":"Growing labor shortages are increasing the demand for domestic service robots (DSRs) to assist in various settings. In this study, we develop a DSR that transports everyday objects to specified pieces of furniture based on open-vocabulary instructions. Our approach focuses on retrieving images of target objects and receptacles from pre-collected images of indoor environments. For example, given an instruction “Please get the right red towel hanging on the metal towel rack and put it in the white washing machine on the left,” the DSR is expected to carry the red towel to the washing machine based on the retrieved images. This is challenging because the correct images should be retrieved from thousands of collected images, which may include many images of similar towels and appliances. To address this, we propose RelaX-Former, which learns diverse and robust representations from among positive, unlabeled positive, and negative samples. We evaluated RelaX-Former on a dataset containing real-world indoor images and human annotated instructions including complex referring expressions. The experimental results demonstrate that RelaX-Former outperformed existing baseline models across standard image retrieval metrics. Moreover, we performed physical experiments using a DSR to evaluate the performance of our approach in a zero-shot transfer setting. The experiments involved the DSR to carry objects to specific receptacles based on open-vocabulary instructions, achieving an overall success rate of 75%.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 2","pages":"1728-1735"},"PeriodicalIF":4.6000,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning With Dense Labeling\",\"authors\":\"Daichi Yashima;Ryosuke Korekata;Komei Sugiura\",\"doi\":\"10.1109/LRA.2024.3522841\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Growing labor shortages are increasing the demand for domestic service robots (DSRs) to assist in various settings. In this study, we develop a DSR that transports everyday objects to specified pieces of furniture based on open-vocabulary instructions. Our approach focuses on retrieving images of target objects and receptacles from pre-collected images of indoor environments. For example, given an instruction “Please get the right red towel hanging on the metal towel rack and put it in the white washing machine on the left,” the DSR is expected to carry the red towel to the washing machine based on the retrieved images. This is challenging because the correct images should be retrieved from thousands of collected images, which may include many images of similar towels and appliances. To address this, we propose RelaX-Former, which learns diverse and robust representations from among positive, unlabeled positive, and negative samples. We evaluated RelaX-Former on a dataset containing real-world indoor images and human annotated instructions including complex referring expressions. The experimental results demonstrate that RelaX-Former outperformed existing baseline models across standard image retrieval metrics. Moreover, we performed physical experiments using a DSR to evaluate the performance of our approach in a zero-shot transfer setting. The experiments involved the DSR to carry objects to specific receptacles based on open-vocabulary instructions, achieving an overall success rate of 75%.\",\"PeriodicalId\":13241,\"journal\":{\"name\":\"IEEE Robotics and Automation Letters\",\"volume\":\"10 2\",\"pages\":\"1728-1735\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-12-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Robotics and Automation Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10816380/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10816380/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
引用次数: 0

摘要

日益严重的劳动力短缺增加了对家庭服务机器人(dsr)的需求,以协助各种设置。在这项研究中,我们开发了一个DSR,可以根据开放词汇指令将日常物品运送到指定的家具上。我们的方法侧重于从预先收集的室内环境图像中检索目标物体和容器的图像。例如,给DSR“请把右边的红色毛巾挂在金属毛巾架上,放到左边的白色洗衣机里”的指令,DSR就会根据检索到的图像把红色毛巾送到洗衣机里。这是具有挑战性的,因为正确的图像应该从数千个收集的图像中检索,其中可能包括许多类似毛巾和电器的图像。为了解决这个问题,我们提出了RelaX-Former,它从正样本、未标记的正样本和负样本中学习不同的鲁棒表示。我们在包含真实世界室内图像和包含复杂引用表达式的人类注释指令的数据集上评估了RelaX-Former。实验结果表明,RelaX-Former在标准图像检索指标上优于现有的基线模型。此外,我们使用DSR进行了物理实验,以评估我们的方法在零射转移设置中的性能。在实验中,DSR根据开放词汇指令将物体搬运到特定的容器中,总体成功率为75%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning With Dense Labeling
Growing labor shortages are increasing the demand for domestic service robots (DSRs) to assist in various settings. In this study, we develop a DSR that transports everyday objects to specified pieces of furniture based on open-vocabulary instructions. Our approach focuses on retrieving images of target objects and receptacles from pre-collected images of indoor environments. For example, given an instruction “Please get the right red towel hanging on the metal towel rack and put it in the white washing machine on the left,” the DSR is expected to carry the red towel to the washing machine based on the retrieved images. This is challenging because the correct images should be retrieved from thousands of collected images, which may include many images of similar towels and appliances. To address this, we propose RelaX-Former, which learns diverse and robust representations from among positive, unlabeled positive, and negative samples. We evaluated RelaX-Former on a dataset containing real-world indoor images and human annotated instructions including complex referring expressions. The experimental results demonstrate that RelaX-Former outperformed existing baseline models across standard image retrieval metrics. Moreover, we performed physical experiments using a DSR to evaluate the performance of our approach in a zero-shot transfer setting. The experiments involved the DSR to carry objects to specific receptacles based on open-vocabulary instructions, achieving an overall success rate of 75%.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Robotics and Automation Letters
IEEE Robotics and Automation Letters Computer Science-Computer Science Applications
CiteScore
9.60
自引率
15.40%
发文量
1428
期刊介绍: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.
期刊最新文献
RA-RRTV*: Risk-Averse RRT* With Local Vine Expansion for Path Planning in Narrow Passages Under Localization Uncertainty Controlling Pneumatic Bending Actuator With Gain-Scheduled Feedforward and Physical Reservoir Computing State Estimation Funabot-Sleeve: A Wearable Device Employing McKibben Artificial Muscles for Haptic Sensation in the Forearm 3D Guidance Law for Flexible Target Enclosing With Inherent Safety Learning Agile Swimming: An End-to-End Approach Without CPGs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1