以全局对象为中心的合成场景建模

IF 4.3 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Machine Learning Pub Date : 2024-01-11 DOI:10.1007/s10994-023-06419-5
{"title":"以全局对象为中心的合成场景建模","authors":"","doi":"10.1007/s10994-023-06419-5","DOIUrl":null,"url":null,"abstract":"<h3>Abstract</h3> <p>The appearance of the same object may vary in different scene images due to occlusions between objects. Humans can quickly identify the same object, even if occlusions exist, by completing the occluded parts based on its complete canonical image in the memory. Achieving this ability is still challenging for existing models, especially in the unsupervised learning setting. Inspired by such an ability of humans, we propose a novel object-centric representation learning method to identify the same object in different scenes that may be occluded by learning global object-centric representations of complete canonical objects without supervision. The representation of each object is divided into an extrinsic part, which characterizes scene-dependent information (i.e., position and size), and an intrinsic part, which characterizes globally invariant information (i.e., appearance and shape). The former can be inferred with an improved IC-SBP module. The latter is extracted by combining rectangular and arbitrary-shaped attention and is used to infer the identity representation via a proposed patch-matching strategy with a set of learnable global object-centric representations of complete canonical objects. In the experiment, three 2D scene datasets are used to verify the proposed method’s ability to recognize the identity of the same object in different scenes. A complex 3D scene dataset and a real-world dataset are used to evaluate the performance of scene decomposition. Our experimental results demonstrate that the proposed method outperforms the comparison methods in terms of same object recognition and scene decomposition.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"14 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Compositional scene modeling with global object-centric representations\",\"authors\":\"\",\"doi\":\"10.1007/s10994-023-06419-5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3>Abstract</h3> <p>The appearance of the same object may vary in different scene images due to occlusions between objects. Humans can quickly identify the same object, even if occlusions exist, by completing the occluded parts based on its complete canonical image in the memory. Achieving this ability is still challenging for existing models, especially in the unsupervised learning setting. Inspired by such an ability of humans, we propose a novel object-centric representation learning method to identify the same object in different scenes that may be occluded by learning global object-centric representations of complete canonical objects without supervision. The representation of each object is divided into an extrinsic part, which characterizes scene-dependent information (i.e., position and size), and an intrinsic part, which characterizes globally invariant information (i.e., appearance and shape). The former can be inferred with an improved IC-SBP module. The latter is extracted by combining rectangular and arbitrary-shaped attention and is used to infer the identity representation via a proposed patch-matching strategy with a set of learnable global object-centric representations of complete canonical objects. In the experiment, three 2D scene datasets are used to verify the proposed method’s ability to recognize the identity of the same object in different scenes. A complex 3D scene dataset and a real-world dataset are used to evaluate the performance of scene decomposition. Our experimental results demonstrate that the proposed method outperforms the comparison methods in terms of same object recognition and scene decomposition.</p>\",\"PeriodicalId\":49900,\"journal\":{\"name\":\"Machine Learning\",\"volume\":\"14 1\",\"pages\":\"\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-01-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s10994-023-06419-5\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-023-06419-5","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

摘要 在不同的场景图像中,由于物体之间的遮挡,同一物体的外观可能会有所不同。即使存在遮挡,人类也能根据记忆中的完整典型图像补全遮挡部分,从而快速识别同一物体。对于现有模型来说,尤其是在无监督学习环境下,实现这种能力仍是一项挑战。受人类这种能力的启发,我们提出了一种新颖的以对象为中心的表征学习方法,通过学习完整的典型对象的全局对象为中心的表征,在没有监督的情况下识别可能被遮挡的不同场景中的同一对象。每个物体的表征分为外在部分和内在部分,外在部分表征与场景相关的信息(即位置和大小),内在部分表征全局不变的信息(即外观和形状)。前者可以通过改进的 IC-SBP 模块来推断。后者通过结合矩形和任意形状的注意力来提取,并通过建议的补丁匹配策略与一套可学习的以完整典型对象为中心的全局对象表征来推断身份表征。在实验中,使用了三个二维场景数据集来验证所提出的方法在不同场景中识别同一物体身份的能力。一个复杂的三维场景数据集和一个真实世界数据集用于评估场景分解的性能。实验结果表明,在同一物体识别和场景分解方面,所提出的方法优于对比方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Compositional scene modeling with global object-centric representations

Abstract

The appearance of the same object may vary in different scene images due to occlusions between objects. Humans can quickly identify the same object, even if occlusions exist, by completing the occluded parts based on its complete canonical image in the memory. Achieving this ability is still challenging for existing models, especially in the unsupervised learning setting. Inspired by such an ability of humans, we propose a novel object-centric representation learning method to identify the same object in different scenes that may be occluded by learning global object-centric representations of complete canonical objects without supervision. The representation of each object is divided into an extrinsic part, which characterizes scene-dependent information (i.e., position and size), and an intrinsic part, which characterizes globally invariant information (i.e., appearance and shape). The former can be inferred with an improved IC-SBP module. The latter is extracted by combining rectangular and arbitrary-shaped attention and is used to infer the identity representation via a proposed patch-matching strategy with a set of learnable global object-centric representations of complete canonical objects. In the experiment, three 2D scene datasets are used to verify the proposed method’s ability to recognize the identity of the same object in different scenes. A complex 3D scene dataset and a real-world dataset are used to evaluate the performance of scene decomposition. Our experimental results demonstrate that the proposed method outperforms the comparison methods in terms of same object recognition and scene decomposition.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Machine Learning
Machine Learning 工程技术-计算机:人工智能
CiteScore
11.00
自引率
2.70%
发文量
162
审稿时长
3 months
期刊介绍: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.
期刊最新文献
On metafeatures’ ability of implicit concept identification Persistent Laplacian-enhanced algorithm for scarcely labeled data classification Towards a foundation large events model for soccer Conformal prediction for regression models with asymmetrically distributed errors: application to aircraft navigation during landing maneuver In-game soccer outcome prediction with offline reinforcement learning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1