ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments With Vision Foundation Models

IEEE Transactions on Robotics · IF 10.5 · CAS Tier 1 (Computer Science) · JCR Q1 (Robotics) · Published: 2025-02-05 · DOI: 10.1109/TRO.2025.3539198
Ying Zhang;Maoliang Yin;Wenfu Bi;Haibao Yan;Shaohan Bian;Cui-Hua Zhang;Changchun Hua
{"title":"ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments With Vision Foundation Models","authors":"Ying Zhang;Maoliang Yin;Wenfu Bi;Haibao Yan;Shaohan Bian;Cui-Hua Zhang;Changchun Hua","doi":"10.1109/TRO.2025.3539198","DOIUrl":null,"url":null,"abstract":"Service robots operating in unstructured environments must effectively recognize and segment unknown objects to enhance their functionality. Traditional supervised learning-based segmentation techniques require extensive annotated datasets, which are impractical for the diversity of objects encountered in real-world scenarios. Unseen object instance segmentation (UOIS) methods aim to address this by training models on synthetic data to generalize to novel objects, but they often suffer from the simulation-to-reality gap. This article proposes a novel approach (ZISVFM) for solving UOIS by leveraging the powerful zero-shot capability of the segment anything model (SAM) and explicit visual representations from a self-supervised vision transformer (ViT). The proposed framework operates in the following three stages: generating object-agnostic mask proposals from colorized depth images using SAM, refining these proposals using attention-based features from the self-supervised ViT to filter nonobject masks, and applying K-Medoids clustering to generate point prompts that guide SAM toward precise object segmentation. Experimental validation on two benchmark datasets and a self-collected dataset demonstrates the superior performance of ZISVFM in complex environments, including hierarchical settings such as cabinets, drawers, and handheld objects.","PeriodicalId":50388,"journal":{"name":"IEEE Transactions on Robotics","volume":"41 ","pages":"1568-1580"},"PeriodicalIF":10.5000,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Robotics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10874172/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
引用次数: 0

Abstract

Service robots operating in unstructured environments must effectively recognize and segment unknown objects to enhance their functionality. Traditional supervised learning-based segmentation techniques require extensive annotated datasets, which are impractical for the diversity of objects encountered in real-world scenarios. Unseen object instance segmentation (UOIS) methods aim to address this by training models on synthetic data to generalize to novel objects, but they often suffer from the simulation-to-reality gap. This article proposes a novel approach (ZISVFM) for solving UOIS by leveraging the powerful zero-shot capability of the segment anything model (SAM) and explicit visual representations from a self-supervised vision transformer (ViT). The proposed framework operates in the following three stages: generating object-agnostic mask proposals from colorized depth images using SAM, refining these proposals using attention-based features from the self-supervised ViT to filter nonobject masks, and applying K-Medoids clustering to generate point prompts that guide SAM toward precise object segmentation. Experimental validation on two benchmark datasets and a self-collected dataset demonstrates the superior performance of ZISVFM in complex environments, including hierarchical settings such as cabinets, drawers, and handheld objects.
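The three-stage pipeline described above can be summarized in a short sketch. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the public segment-anything package, a DINO ViT-S/8 loaded via torch.hub, and scikit-learn-extra for K-Medoids; the depth colormap, the attention-based objectness score, and the prompt count are placeholder choices.

```python
# Minimal sketch of the three-stage ZISVFM pipeline described in the abstract.
# NOT the authors' code: the colormap, objectness score, and thresholds are
# illustrative assumptions; only the high-level flow follows the paper.
import cv2
import numpy as np
import torch
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor
from sklearn_extra.cluster import KMedoids  # pip install scikit-learn-extra

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)                           # stage 1
predictor = SamPredictor(sam)                                             # stage 3
dino = torch.hub.load("facebookresearch/dino:main", "dino_vits8").eval()  # stage 2

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Normalize a metric depth map and render it as a 3-channel image for SAM."""
    d = (depth - depth.min()) / (np.ptp(depth) + 1e-8)
    return cv2.applyColorMap((d * 255).astype(np.uint8), cv2.COLORMAP_JET)

def cls_attention_map(rgb: np.ndarray) -> np.ndarray:
    """CLS-to-patch self-attention of the last DINO block, upsampled to image size."""
    img = cv2.resize(rgb, (480, 480))  # 480 is divisible by the ViT-S/8 patch size
    # ImageNet normalization omitted for brevity.
    x = torch.from_numpy(img).permute(2, 0, 1).float()[None] / 255.0
    with torch.no_grad():
        attn = dino.get_last_selfattention(x)  # (1, heads, tokens, tokens)
    cls_attn = attn[0, :, 0, 1:].mean(0)       # average heads, drop the CLS token
    side = int(cls_attn.numel() ** 0.5)
    amap = cls_attn.reshape(side, side).numpy()
    return cv2.resize(amap, (rgb.shape[1], rgb.shape[0]))

def segment(rgb: np.ndarray, depth: np.ndarray, n_prompts: int = 3) -> list[np.ndarray]:
    # Stage 1: object-agnostic mask proposals from the colorized depth image.
    proposals = mask_generator.generate(colorize_depth(depth))

    # Stage 2: keep proposals whose mean ViT attention exceeds the image mean
    # (a stand-in for the paper's attention-based objectness filter).
    amap = cls_attention_map(rgb)
    kept = [p for p in proposals if amap[p["segmentation"]].mean() > amap.mean()]

    # Stage 3: K-Medoids picks representative in-mask pixels as point prompts
    # that steer SAM toward a refined segmentation on the RGB image.
    # (Subsample mask pixels before clustering for speed in practice.)
    predictor.set_image(rgb)
    masks = []
    for p in kept:
        ys, xs = np.nonzero(p["segmentation"])
        pts = np.stack([xs, ys], axis=1).astype(np.float64)
        k = min(n_prompts, len(pts))
        prompts = KMedoids(n_clusters=k, random_state=0).fit(pts).cluster_centers_
        m, _, _ = predictor.predict(
            point_coords=prompts,
            point_labels=np.ones(k),
            multimask_output=False,
        )
        masks.append(m[0])
    return masks
```

Note the division of labor this sketch tries to mirror: SAM supplies class-agnostic masks but no notion of "object vs. background", while the self-supervised ViT's attention supplies objectness without precise boundaries; the point prompts hand the filtered evidence back to SAM for the final masks.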
Source Journal

IEEE Transactions on Robotics (Engineering & Technology - Robotics)
CiteScore: 14.90 · Self-citation rate: 5.10% · Articles per year: 259 · Review time: 6.0 months
Journal Description: The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistants, surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across various domains, including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles. Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.
Latest Articles in This Journal

Robot Tracking Control With Natural Task-Space Decoupling
Behavior-Controllable Stable Dynamics Models on Riemannian Configuration Manifolds
Correspondence-Free, Function-Based Sim-to-Real Learning for Deformable Surface Control
PushingBots: Collaborative Pushing Via Neural Accelerated Combinatorial Hybrid Optimization
Optimal Virtual Model Control for Robotics: Design and Tuning of Passivity-Based Controllers