Metric-Semantic Factor Graph Generation based on Graph Neural Networks

Jose Andres Millan-Romera, Hriday Bavle, Muhammad Shaheer, Holger Voos, Jose Luis Sanchez-Lopez
arXiv - CS - Robotics, published 2024-09-18. DOI: arxiv-2409.11972 (https://doi.org/arxiv-2409.11972)

Abstract

Understanding the relationships between geometric structures and semantic concepts is crucial for building accurate models of complex environments. Indoors, certain spatial constraints, such as the relative positioning of planes, remain consistent despite variations in layout. This paper explores how these invariant relationships can be captured in a graph SLAM framework by representing high-level concepts like rooms and walls and linking them to geometric elements like planes through an optimizable factor graph. Several efforts have tackled this issue with ad hoc solutions for the generation of each concept and with manually defined factors. This paper proposes a novel method for metric-semantic factor graph generation that includes defining a semantic scene graph, integrating geometric information, and learning the interconnecting factors, all based on Graph Neural Networks (GNNs). An edge classification network (G-GNN) classifies the edges between planes as same room, same wall, or none. The resulting relations are clustered, generating a room or wall for each cluster. A second family of networks (F-GNN) infers the geometrical origin of the new nodes. The definition of the factors employs the same F-GNN used for the metric attribute of the generated nodes. Furthermore, the new factor graph is shared with the S-Graphs+ algorithm, extending its graph expressiveness and scene representation with the ultimate goal of improving SLAM performance. The complexity of the environments is increased to N-plane rooms by training the networks on L-shaped rooms. The framework is evaluated in synthetic and simulated scenarios, as no real datasets of the required complex layouts are available.
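The classify-then-cluster step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the edge labels that G-GNN would predict are hard-coded here, the clustering rule (connected components via union-find) is an assumption because the abstract does not name the clustering method, and the learned F-GNN origin is replaced by a plain centroid of plane centers. All names (`cluster_planes`, `centroid`, `plane_centers`, `labeled_edges`) are hypothetical.

```python
from collections import defaultdict


def cluster_planes(num_planes, labeled_edges, relation):
    """Group plane indices linked by edges of one relation type (union-find)."""
    parent = list(range(num_planes))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, label in labeled_edges:
        if label == relation:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for i in range(num_planes):
        groups[find(i)].append(i)
    # Assumption: a single unconnected plane does not generate a room or wall.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


def centroid(points):
    """Mean of 2D points; a stand-in for the origin that F-GNN would infer."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


# Hypothetical plane centers (x, y) in meters.
plane_centers = [(0.0, 2.0), (2.0, 4.0), (4.0, 2.0), (2.0, 0.0), (4.0, 6.0), (6.0, 4.0)]

# Hypothetical edge labels, standing in for G-GNN predictions.
labeled_edges = [
    (0, 1, "same_room"), (1, 2, "same_room"), (2, 3, "same_room"),
    (2, 5, "same_wall"),
    (3, 4, "none"),
]

rooms = cluster_planes(len(plane_centers), labeled_edges, "same_room")
walls = cluster_planes(len(plane_centers), labeled_edges, "same_wall")
room_origins = [centroid([plane_centers[i] for i in room]) for room in rooms]

print("rooms:", rooms)
print("walls:", walls)
print("room origins:", room_origins)
```

Each resulting cluster corresponds to one new room or wall node. In a factor graph, that node would then be connected to its member planes by factors, which the paper learns with F-GNN rather than defining by hand.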