Jose Andres Millan-Romera, Hriday Bavle, Muhammad Shaheer, Holger Voos, Jose Luis Sanchez-Lopez
Understanding the relationships between geometric structures and semantic concepts is crucial for building accurate models of complex environments. Indoors, certain spatial constraints, such as the relative positioning of planes, remain consistent despite variations in layout. This paper explores how these invariant relationships can be captured in a graph SLAM framework by representing high-level concepts like rooms and walls and linking them to geometric elements like planes through an optimizable factor graph. Previous efforts have tackled this issue with ad-hoc solutions for each concept generation and with manually defined factors. This paper proposes a novel method for metric-semantic factor graph generation that includes defining a semantic scene graph, integrating geometric information, and learning the interconnecting factors, all based on Graph Neural Networks (GNNs). An edge classification network (G-GNN) classifies the edges between planes as same-room, same-wall, or none. The resulting relations are clustered, generating a room or wall for each cluster. A second family of networks (F-GNN) infers the geometric origin of the new nodes. The definition of the factors employs the same F-GNN used for the metric attribute of the generated nodes. Furthermore, the new factor graph is shared with the S-Graphs+ algorithm, extending its graph expressiveness and scene representation with the ultimate goal of improving SLAM performance. The complexity of the environments is increased to N-plane rooms by training the networks on L-shaped rooms. The framework is evaluated in synthetic and simulated scenarios, as no real datasets of the required complex layouts are available.
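The clustering step described above can be sketched without the learned components: given plane-pair edges already labeled by a classifier, connected planes are grouped, and each multi-plane group becomes a candidate room or wall node. The labels and helper names below are illustrative, not the paper's API.

```python
# Hypothetical sketch: after an edge-classification GNN labels plane-plane
# edges as "same_room", "same_wall", or "none", connected components over the
# positively-labeled edges become candidate room/wall nodes. Union-find groups
# the planes.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def cluster_planes(num_planes, labeled_edges, label):
    """Group plane indices connected by edges carrying `label`."""
    uf = UnionFind(num_planes)
    for i, j, lab in labeled_edges:
        if lab == label:
            uf.union(i, j)
    clusters = {}
    for p in range(num_planes):
        clusters.setdefault(uf.find(p), []).append(p)
    # keep only multi-plane clusters: each one would spawn a room/wall node
    return [c for c in clusters.values() if len(c) > 1]
```

For example, edges `[(0, 1, "same_room"), (1, 2, "same_room"), (3, 4, "same_wall")]` over five planes yield one three-plane room cluster and one two-plane wall cluster.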
{"title":"Metric-Semantic Factor Graph Generation based on Graph Neural Networks","authors":"Jose Andres Millan-Romera, Hriday Bavle, Muhammad Shaheer, Holger Voos, Jose Luis Sanchez-Lopez","doi":"arxiv-2409.11972","DOIUrl":"https://doi.org/arxiv-2409.11972","url":null,"abstract":"Understanding the relationships between geometric structures and semantic\u0000concepts is crucial for building accurate models of complex environments. In\u0000indoors, certain spatial constraints, such as the relative positioning of\u0000planes, remain consistent despite variations in layout. This paper explores how\u0000these invariant relationships can be captured in a graph SLAM framework by\u0000representing high-level concepts like rooms and walls, linking them to\u0000geometric elements like planes through an optimizable factor graph. Several\u0000efforts have tackled this issue with add-hoc solutions for each concept\u0000generation and with manually-defined factors. This paper proposes a novel method for metric-semantic factor graph\u0000generation which includes defining a semantic scene graph, integrating\u0000geometric information, and learning the interconnecting factors, all based on\u0000Graph Neural Networks (GNNs). An edge classification network (G-GNN) sorts the\u0000edges between planes into same room, same wall or none types. The resulting\u0000relations are clustered, generating a room or wall for each cluster. A second\u0000family of networks (F-GNN) infers the geometrical origin of the new nodes. The\u0000definition of the factors employs the same F-GNN used for the metric attribute\u0000of the generated nodes. Furthermore, share the new factor graph with the\u0000S-Graphs+ algorithm, extending its graph expressiveness and scene\u0000representation with the ultimate goal of improving the SLAM performance. The\u0000complexity of the environments is increased to N-plane rooms by training the\u0000networks on L-shaped rooms. 
The framework is evaluated in synthetic and\u0000simulated scenarios as no real datasets of the required complex layouts are\u0000available.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142267034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Safety is a critical concern for urban flights of autonomous Unmanned Aerial Vehicles. In populated environments, risk must be accounted for to produce an effective and safe path, a task known as risk-aware path planning. Risk-aware path planning can be modeled as a Constrained Shortest Path (CSP) problem, which aims to identify the shortest route that adheres to specified safety thresholds. CSP is NP-hard and poses significant computational challenges. Although many traditional methods can solve it accurately, they are computationally expensive. Our method introduces an additional safety dimension to the traditional A* algorithm (called ASD A*), enabling A* to handle CSP. Furthermore, we develop a custom learning-based heuristic using transformer-based neural networks, which significantly reduces the computational load and improves the performance of the ASD A* algorithm. The proposed method is validated in both random and realistic simulation scenarios.
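The core idea of adding a safety dimension to A* can be sketched as a search over (node, accumulated-risk) states that prunes any extension exceeding a risk budget. The graph, costs, and risks below are invented for illustration, and the zero heuristic stands in for the paper's learned transformer heuristic.

```python
import heapq

def asd_astar(graph, risk, start, goal, risk_budget, h=lambda n: 0.0):
    """A* over augmented states (node, accumulated risk).

    graph[n] -> list of (neighbor, edge_length); risk[(n, m)] -> edge risk.
    Returns (path_length, path_risk), or None if no path fits the budget.
    """
    # frontier entries: (f = g + h, g, node, accumulated risk)
    frontier = [(h(start), 0.0, start, 0.0)]
    best = {}  # (node, risk) -> lowest cost seen for that augmented state
    while frontier:
        f, g, node, r = heapq.heappop(frontier)
        if node == goal:
            return g, r
        key = (node, round(r, 9))
        if key in best and best[key] <= g:
            continue
        best[key] = g
        for nxt, length in graph.get(node, []):
            nr = r + risk[(node, nxt)]
            if nr > risk_budget:  # the safety dimension: prune unsafe extensions
                continue
            heapq.heappush(frontier, (g + length + h(nxt), g + length, nxt, nr))
    return None
```

On a toy graph where the geometrically shortest path is too risky, a tight budget forces the safer detour, while a loose budget recovers the ordinary shortest path.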
{"title":"Learning-accelerated A* Search for Risk-aware Path Planning","authors":"Jun Xiang, Junfei Xie, Jun Chen","doi":"arxiv-2409.11634","DOIUrl":"https://doi.org/arxiv-2409.11634","url":null,"abstract":"Safety is a critical concern for urban flights of autonomous Unmanned Aerial\u0000Vehicles. In populated environments, risk should be accounted for to produce an\u0000effective and safe path, known as risk-aware path planning. Risk-aware path\u0000planning can be modeled as a Constrained Shortest Path (CSP) problem, aiming to\u0000identify the shortest possible route that adheres to specified safety\u0000thresholds. CSP is NP-hard and poses significant computational challenges.\u0000Although many traditional methods can solve it accurately, all of them are very\u0000slow. Our method introduces an additional safety dimension to the traditional\u0000A* (called ASD A*), enabling A* to handle CSP. Furthermore, we develop a custom\u0000learning-based heuristic using transformer-based neural networks, which\u0000significantly reduces the computational load and improves the performance of\u0000the ASD A* algorithm. The proposed method is well-validated with both random\u0000and realistic simulation scenarios.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present Residual Descent Differential Dynamic Game (RD3G), a Newton-based solver for constrained multi-agent game-control problems. The proposed solver seeks a local Nash equilibrium for problems where agents are coupled through their rewards and state constraints. We compare the proposed method against competing state-of-the-art techniques and showcase the computational benefits of the RD3G algorithm on several example problems.
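As a much simpler stand-in for the Newton-based RD3G solver, the notion of a local Nash equilibrium in a game with coupled rewards can be illustrated with iterated best response on a two-player quadratic game, where each player's best response has closed form. The costs below are invented for illustration.

```python
# Illustrative sketch only (not the RD3G solver): seeking a Nash equilibrium
# of a two-player game with coupled quadratic costs via iterated best response.
#   J1(x, y) = x**2 + x*y - 2*x   ->  best response x = (2 - y) / 2
#   J2(x, y) = y**2 + x*y - 4*y   ->  best response y = (4 - x) / 2

def nash_best_response(x0=1.0, y0=1.0, iters=50):
    x, y = x0, y0
    for _ in range(iters):
        x = (2 - y) / 2   # player 1 minimizes J1 given y
        y = (4 - x) / 2   # player 2 minimizes J2 given x
    return x, y
```

The iteration contracts to the fixed point (0, 2), where neither player can lower their own cost by unilaterally deviating, which is the defining property of a Nash equilibrium.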
{"title":"Residual Descent Differential Dynamic Game (RD3G) -- A Fast Newton Solver for Constrained General Sum Games","authors":"Zhiyuan Zhang, Panagiotis Tsiotras","doi":"arxiv-2409.12152","DOIUrl":"https://doi.org/arxiv-2409.12152","url":null,"abstract":"We present Residual Descent Differential Dynamic Game (RD3G), a Newton-based\u0000solver for constrained multi-agent game-control problems. The proposed solver\u0000seeks a local Nash equilibrium for problems where agents are coupled through\u0000their rewards and state constraints. We compare the proposed method against\u0000competing state-of-the-art techniques and showcase the computational benefits\u0000of the RD3G algorithm on several example problems.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142267026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The problem of safety for robotic systems has been extensively studied. However, little attention has been given to security issues for three-dimensional systems, such as quadrotors. Malicious adversaries can compromise robot sensors and communication networks, causing incidents, achieving illegal objectives, or even injuring people. This study first designs an intelligent control system for autonomous quadrotors. Then, it investigates the problems of optimal false data injection attack scheduling and countermeasure design for unmanned aerial vehicles. Using a state-of-the-art deep learning-based approach, an optimal false data injection attack scheme is proposed to deteriorate a quadrotor's tracking performance with limited attack energy. Subsequently, an optimal tracking control strategy is learned to mitigate attacks and recover the quadrotor's tracking performance. We base our work on Agilicious, a state-of-the-art quadrotor recently deployed for autonomous settings. This paper is the first in the United Kingdom to deploy this quadrotor and implement reinforcement learning on its platform. Therefore, to promote easy reproducibility with minimal engineering overhead, we further provide (1) a comprehensive breakdown of this quadrotor, including software stacks and hardware alternatives; (2) a detailed reinforcement-learning framework to train autonomous controllers on Agilicious agents; and (3) a new open-source environment that builds upon PyFlyt for future reinforcement learning research on Agilicious platforms. Both simulated and real-world experiments are conducted to show the effectiveness of the proposed frameworks in section 5.2.
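The effect of a constant false-data-injection attack on tracking can be illustrated with a toy first-order plant and proportional controller: a bias added to the measurement shifts the steady state away from the reference by exactly the bias. The dynamics and gains here are illustrative, not the paper's quadrotor model.

```python
# Hedged sketch of the false-data-injection idea: a proportional tracking
# controller acts on a corrupted measurement y = x + bias, so the true state
# settles offset from the reference by the injected bias.

def track(ref=1.0, bias=0.0, steps=200, k=0.5):
    x = 0.0
    for _ in range(steps):
        y = x + bias          # attacker injects a constant measurement bias
        u = k * (ref - y)     # controller trusts the corrupted reading
        x = x + u             # simple first-order plant
    return x
```

Without an attack the state converges to the reference; with a 0.3 bias it settles 0.3 below it, which is the kind of tracking degradation the optimal-attack and countermeasure design target.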
{"title":"Secure Control Systems for Autonomous Quadrotors against Cyber-Attacks","authors":"Samuel Belkadi","doi":"arxiv-2409.11897","DOIUrl":"https://doi.org/arxiv-2409.11897","url":null,"abstract":"The problem of safety for robotic systems has been extensively studied.\u0000However, little attention has been given to security issues for\u0000three-dimensional systems, such as quadrotors. Malicious adversaries can\u0000compromise robot sensors and communication networks, causing incidents,\u0000achieving illegal objectives, or even injuring people. This study first designs\u0000an intelligent control system for autonomous quadrotors. Then, it investigates\u0000the problems of optimal false data injection attack scheduling and\u0000countermeasure design for unmanned aerial vehicles. Using a state-of-the-art\u0000deep learning-based approach, an optimal false data injection attack scheme is\u0000proposed to deteriorate a quadrotor's tracking performance with limited attack\u0000energy. Subsequently, an optimal tracking control strategy is learned to\u0000mitigate attacks and recover the quadrotor's tracking performance. We base our\u0000work on Agilicious, a state-of-the-art quadrotor recently deployed for\u0000autonomous settings. This paper is the first in the United Kingdom to deploy\u0000this quadrotor and implement reinforcement learning on its platform. Therefore,\u0000to promote easy reproducibility with minimal engineering overhead, we further\u0000provide (1) a comprehensive breakdown of this quadrotor, including software\u0000stacks and hardware alternatives; (2) a detailed reinforcement-learning\u0000framework to train autonomous controllers on Agilicious agents; and (3) a new\u0000open-source environment that builds upon PyFlyt for future reinforcement\u0000learning research on Agilicious platforms. 
Both simulated and real-world\u0000experiments are conducted to show the effectiveness of the proposed frameworks\u0000in section 5.2.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Sai Rajeswar
Object manipulation capabilities are essential skills that set apart embodied agents engaging with the world, especially in the realm of robotics. The ability to predict the outcomes of interactions with objects is paramount in this setting. While model-based control methods have started to be employed for tackling manipulation tasks, they have faced challenges in accurately manipulating objects. Analyzing the causes of this limitation, we trace the underperformance to the way current world models represent crucial positional information, especially the target's goal specification for object-positioning tasks. We introduce a general approach that empowers world model-based agents to effectively solve object-positioning tasks. We propose two variants of this approach for generative world models: position-conditioned (PCP) and latent-conditioned (LCP) policy learning. In particular, LCP employs object-centric latent representations that explicitly capture object positional information for goal specification. This naturally leads to the emergence of multimodal capabilities, enabling goals to be specified through spatial coordinates or a visual goal. Our methods are rigorously evaluated across several manipulation environments, showing favorable performance compared to current model-based control approaches.
{"title":"Representing Positional Information in Generative World Models for Object Manipulation","authors":"Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Sai Rajeswar","doi":"arxiv-2409.12005","DOIUrl":"https://doi.org/arxiv-2409.12005","url":null,"abstract":"Object manipulation capabilities are essential skills that set apart embodied\u0000agents engaging with the world, especially in the realm of robotics. The\u0000ability to predict outcomes of interactions with objects is paramount in this\u0000setting. While model-based control methods have started to be employed for\u0000tackling manipulation tasks, they have faced challenges in accurately\u0000manipulating objects. As we analyze the causes of this limitation, we identify\u0000the cause of underperformance in the way current world models represent crucial\u0000positional information, especially about the target's goal specification for\u0000object positioning tasks. We introduce a general approach that empowers world\u0000model-based agents to effectively solve object-positioning tasks. We propose\u0000two declinations of this approach for generative world models:\u0000position-conditioned (PCP) and latent-conditioned (LCP) policy learning. In\u0000particular, LCP employs object-centric latent representations that explicitly\u0000capture object positional information for goal specification. This naturally\u0000leads to the emergence of multimodal capabilities, enabling the specification\u0000of goals through spatial coordinates or a visual goal. 
Our methods are\u0000rigorously evaluated across several manipulation environments, showing\u0000favorable performance compared to current model-based control approaches.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142267033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the emergence of new flapping-wing micro aerial vehicle (FWMAV) designs, a need for extensive and advanced mission capabilities arises. FWMAVs try to adapt and emulate the flight features of birds and flying insects. While current designs already achieve high manoeuvrability, they still almost entirely lack perching and take-off abilities. These capabilities could, for instance, enable long-term monitoring and surveillance missions, and operations in cluttered environments or in proximity to humans and animals. We present the development and testing of a framework that enables repeatable perching and take-off for small to medium-sized FWMAVs, utilising soft, non-damaging grippers. Thanks to its novel active-passive actuation system, an energy-conserving state can be achieved and indefinitely maintained while the vehicle is perched. A prototype of the proposed system weighing under 39 g was manufactured and extensively tested on a 110 g flapping-wing robot. Successful free-flight tests demonstrated the full mission cycle of landing, perching and subsequent take-off. The telemetry data recorded during the flights yields extensive insight into the system's behaviour and is a valuable step towards full automation and optimisation of the entire take-off and landing cycle.
{"title":"Repeatable Energy-Efficient Perching for Flapping-Wing Robots Using Soft Grippers","authors":"Krispin C. V. Broers, Sophie F. Armanini","doi":"arxiv-2409.11921","DOIUrl":"https://doi.org/arxiv-2409.11921","url":null,"abstract":"With the emergence of new flapping-wing micro aerial vehicle (FWMAV) designs,\u0000a need for extensive and advanced mission capabilities arises. FWMAVs try to\u0000adapt and emulate the flight features of birds and flying insects. While\u0000current designs already achieve high manoeuvrability, they still almost\u0000entirely lack perching and take-off abilities. These capabilities could, for\u0000instance, enable long-term monitoring and surveillance missions, and operations\u0000in cluttered environments or in proximity to humans and animals. We present the\u0000development and testing of a framework that enables repeatable perching and\u0000take-off for small to medium-sized FWMAVs, utilising soft, non-damaging\u0000grippers. Thanks to its novel active-passive actuation system, an\u0000energy-conserving state can be achieved and indefinitely maintained while the\u0000vehicle is perched. A prototype of the proposed system weighing under 39 g was\u0000manufactured and extensively tested on a 110 g flapping-wing robot. Successful\u0000free-flight tests demonstrated the full mission cycle of landing, perching and\u0000subsequent take-off. 
The telemetry data recorded during the flights yields\u0000extensive insight into the system's behaviour and is a valuable step towards\u0000full automation and optimisation of the entire take-off and landing cycle.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gang Chen, Zhaoying Wang, Wei Dong, Javier Alonso-Mora
Representing the 3D environment with instance-aware semantic and geometric information is crucial for interaction-aware robots in dynamic environments. Nonetheless, creating such a representation poses challenges due to sensor noise, instance segmentation and tracking errors, and the objects' dynamic motion. This paper introduces a novel particle-based instance-aware semantic occupancy map to tackle these challenges. Particles with an augmented instance state are used to estimate the Probability Hypothesis Density (PHD) of the objects and implicitly model the environment. Utilizing a State-augmented Sequential Monte Carlo PHD (S$^2$MC-PHD) filter, these particles are updated to jointly estimate occupancy status, semantic labels, and instance IDs, mitigating noise. Additionally, a memory module is adopted to enhance the map's responsiveness to previously observed objects. Experimental results on the Virtual KITTI 2 dataset demonstrate that the proposed approach surpasses state-of-the-art methods across multiple metrics under different noise conditions. Subsequent tests using real-world data further validate the effectiveness of the proposed approach.
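A minimal sketch of the underlying sequential Monte Carlo step (not the S$^2$MC-PHD filter itself): particles carrying a semantic label are reweighted by an observation likelihood and resampled, the basic mechanism the augmented-state particles build on. The labels and likelihood values are invented for illustration.

```python
import random

def smc_step(particles, likelihood, rng):
    """One reweight-and-resample step over labeled particles."""
    weights = [likelihood(p) for p in particles]
    total = sum(weights)
    weights = [w / total for w in weights]      # normalize to a distribution
    # multinomial resampling: high-weight particles tend to be duplicated,
    # low-weight ones tend to die out
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return resampled, weights
```

For instance, with three "chair" and three "table" particles and an observation likelihood favoring "chair" (0.9 vs 0.1), each chair particle carries normalized weight 0.3 and dominates the resampled set on average.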
{"title":"Particle-based Instance-aware Semantic Occupancy Mapping in Dynamic Environments","authors":"Gang Chen, Zhaoying Wang, Wei Dong, Javier Alonso-Mora","doi":"arxiv-2409.11975","DOIUrl":"https://doi.org/arxiv-2409.11975","url":null,"abstract":"Representing the 3D environment with instance-aware semantic and geometric\u0000information is crucial for interaction-aware robots in dynamic environments.\u0000Nonetheless, creating such a representation poses challenges due to sensor\u0000noise, instance segmentation and tracking errors, and the objects' dynamic\u0000motion. This paper introduces a novel particle-based instance-aware semantic\u0000occupancy map to tackle these challenges. Particles with an augmented instance\u0000state are used to estimate the Probability Hypothesis Density (PHD) of the\u0000objects and implicitly model the environment. Utilizing a State-augmented\u0000Sequential Monte Carlo PHD (S$^2$MC-PHD) filter, these particles are updated to\u0000jointly estimate occupancy status, semantic, and instance IDs, mitigating\u0000noise. Additionally, a memory module is adopted to enhance the map's\u0000responsiveness to previously observed objects. Experimental results on the\u0000Virtual KITTI 2 dataset demonstrate that the proposed approach surpasses\u0000state-of-the-art methods across multiple metrics under different noise\u0000conditions. 
Subsequent tests using real-world data further validate the\u0000effectiveness of the proposed approach.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142267037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environments with large terrain height variations present great challenges for legged robot locomotion. Drawing inspiration from fire ants' collective assembly behavior, we study strategies that enable two ``connectable'' robots to collectively navigate over bumpy terrains with height variations larger than the robot leg length. Each robot was designed to be extremely simple, with a cubical body and one rotary motor actuating four vertical peg legs that move in pairs. Two or more robots can physically connect to one another to enhance collective mobility. We performed locomotion experiments with a two-robot group across an obstacle field filled with uniformly distributed semi-spherical ``boulders''. Experimentally measured robot speed suggested that the connection length between the robots has a significant effect on collective mobility: connection lengths C in [0.86, 0.9] robot unit body lengths (UBL) produced sustainable movement across the obstacle field, whereas connection lengths C in [0.63, 0.84] and [0.92, 1.1] UBL resulted in low traversability. An energy-landscape-based model revealed the underlying mechanism of how connection length modulates collective mobility through the system's potential energy landscape, and informed strategies for the two-robot system to adapt its connection length for traversing obstacle fields with varying spatial frequencies. Our results demonstrated that by varying the connection configuration between the robots, the two-robot system could leverage mechanical intelligence to better utilize obstacle interaction forces and produce improved locomotion. Going forward, we envision that generalized principles of robot-environment coupling can inform design and control strategies for large groups of small robots to achieve ant-like collective environment negotiation.
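The energy-landscape mechanism can be illustrated with a toy model: approximating the terrain as a sinusoid and the pair's potential energy as the sum of the two body heights, the connection length C sets the phase offset between the two height profiles and thereby the energy barrier the pair must overcome. The sinusoidal terrain and unit parameters are simplifying assumptions, not the paper's boulder-field model.

```python
import math

def energy_barrier(C, wavelength=1.0, samples=1000):
    """Barrier (max - min) of the pair's potential energy over one period."""
    def h(x):  # terrain height, simplified to a sinusoid
        return math.sin(2 * math.pi * x / wavelength)
    # rigid connection: body 2 sits a fixed distance C behind body 1,
    # so system potential energy is the sum of the two body heights
    energies = [h(i * wavelength / samples) + h(i * wavelength / samples + C)
                for i in range(samples)]
    return max(energies) - min(energies)
```

With C equal to half the terrain wavelength the two bodies' height variations cancel and the barrier vanishes, while C equal to a full wavelength doubles it, echoing the finding that particular connection lengths sustain movement while others do not.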
{"title":"Multi-robot connection towards collective obstacle field traversal","authors":"Haodi Hu, Xingjue Liao, Wuhao Du, Feifei Qian","doi":"arxiv-2409.11709","DOIUrl":"https://doi.org/arxiv-2409.11709","url":null,"abstract":"Environments with large terrain height variations present great challenges\u0000for legged robot locomotion. Drawing inspiration from fire ants' collective\u0000assembly behavior, we study strategies that can enable two ``connectable''\u0000robots to collectively navigate over bumpy terrains with height variations\u0000larger than robot leg length. Each robot was designed to be extremely simple,\u0000with a cubical body and one rotary motor actuating four vertical peg legs that\u0000move in pairs. Two or more robots could physically connect to one another to\u0000enhance collective mobility. We performed locomotion experiments with a\u0000two-robot group, across an obstacle field filled with uniformly-distributed\u0000semi-spherical ``boulders''. Experimentally-measured robot speed suggested that\u0000the connection length between the robots has a significant effect on collective\u0000mobility: connection length C in [0.86, 0.9] robot unit body length (UBL) were\u0000able to produce sustainable movements across the obstacle field, whereas\u0000connection length C in [0.63, 0.84] and [0.92, 1.1] UBL resulted in low\u0000traversability. An energy landscape based model revealed the underlying\u0000mechanism of how connection length modulated collective mobility through the\u0000system's potential energy landscape, and informed adaptation strategies for the\u0000two-robot system to adapt their connection length for traversing obstacle\u0000fields with varying spatial frequencies. Our results demonstrated that by\u0000varying the connection configuration between the robots, the two-robot system\u0000could leverage mechanical intelligence to better utilize obstacle interaction\u0000forces and produce improved locomotion. 
Going forward, we envision that\u0000generalized principles of robot-environment coupling can inform design and\u0000control strategies for a large group of small robots to achieve ant-like\u0000collective environment negotiation.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142266857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Imitation-based robot learning has recently gained significant attention in the robotics field due to its theoretical potential for transferability and generalizability. However, it remains notoriously costly in terms of both hardware and data collection, and deploying it in real-world environments demands meticulous setup of robots and precise experimental conditions. In this paper, we present a low-cost robot learning framework that is both easily reproducible and transferable to various robots and environments. We demonstrate that deployable imitation learning can be successfully applied even to industrial-grade robots, not just expensive collaborative robotic arms. Furthermore, our results show that multi-task robot learning is achievable with simple network architectures and fewer demonstrations than previously thought necessary. As current evaluation methods are largely subjective when it comes to real-world manipulation tasks, we propose Voting Positive Rate (VPR), a novel evaluation strategy that provides a more objective assessment of performance. We conduct an extensive comparison of success rates across various self-designed tasks to validate our approach. To foster collaboration and support the robot learning community, we have open-sourced all relevant datasets and model checkpoints, available at huggingface.co/ZhiChengAI.
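The abstract names Voting Positive Rate but does not give its formula; a plausible minimal reading, assumed here, scores a trial positive when a strict majority of evaluators votes it a success, and reports the fraction of positive trials.

```python
# Hedged sketch of a Voting Positive Rate style metric (the exact definition
# is not in the abstract; strict-majority voting per trial is an assumption).

def voting_positive_rate(trial_votes):
    """trial_votes: list of per-trial vote lists, each vote True/False."""
    positives = sum(
        1 for votes in trial_votes
        if sum(votes) * 2 > len(votes)   # strict majority votes "success"
    )
    return positives / len(trial_votes)
```

For example, four trials judged by three evaluators, of which two reach a success majority, give a VPR of 0.5.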
{"title":"Generalized Robot Learning Framework","authors":"Jiahuan Yan, Zhouyang Hong, Yu Zhao, Yu Tian, Yunxin Liu, Travis Davies, Luhui Hu","doi":"arxiv-2409.12061","DOIUrl":"https://doi.org/arxiv-2409.12061","url":null,"abstract":"Imitation based robot learning has recently gained significant attention in\u0000the robotics field due to its theoretical potential for transferability and\u0000generalizability. However, it remains notoriously costly, both in terms of\u0000hardware and data collection, and deploying it in real-world environments\u0000demands meticulous setup of robots and precise experimental conditions. In this\u0000paper, we present a low-cost robot learning framework that is both easily\u0000reproducible and transferable to various robots and environments. We\u0000demonstrate that deployable imitation learning can be successfully applied even\u0000to industrial-grade robots, not just expensive collaborative robotic arms.\u0000Furthermore, our results show that multi-task robot learning is achievable with\u0000simple network architectures and fewer demonstrations than previously thought\u0000necessary. As the current evaluating method is almost subjective when it comes\u0000to real-world manipulation tasks, we propose Voting Positive Rate (VPR) - a\u0000novel evaluation strategy that provides a more objective assessment of\u0000performance. We conduct an extensive comparison of success rates across various\u0000self-designed tasks to validate our approach. 
To foster collaboration and\u0000support the robot learning community, we have open-sourced all relevant\u0000datasets and model checkpoints, available at huggingface.co/ZhiChengAI.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142267031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiawei Sun, Jiahui Li, Tingchen Liu, Chengran Yuan, Shuo Sun, Zefan Huang, Anthony Wong, Keng Peng Tee, Marcelo H. Ang Jr
We introduce RMP-YOLO, a unified framework designed to provide robust motion predictions even with incomplete input data. Our key insight stems from the observation that complete and reliable historical trajectory data plays a pivotal role in ensuring accurate motion prediction. Therefore, we propose a new paradigm that prioritizes the reconstruction of intact historical trajectories before feeding them into the prediction modules. Our approach introduces a novel scene tokenization module to enhance the extraction and fusion of spatial and temporal features. Following this, our proposed recovery module reconstructs agents' incomplete historical trajectories by leveraging local map topology and interactions with nearby agents. The reconstructed, clean historical data is then integrated into the downstream prediction modules. Our framework is able to effectively handle missing data of varying lengths and remains robust against observation noise, while maintaining high prediction accuracy. Furthermore, our recovery module is compatible with existing prediction models, ensuring seamless integration. Extensive experiments validate the effectiveness of our approach, and deployment in real-world autonomous vehicles confirms its practical utility. In the 2024 Waymo Motion Prediction Competition, our method, RMP-YOLO, achieves state-of-the-art performance, securing third place.
RMP-YOLO: A Robust Motion Predictor for Partially Observable Scenarios even if You Only Look Once
(arXiv - CS - Robotics, arxiv-2409.11696, published 2024-09-18)
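The recover-then-predict paradigm above can be illustrated with a toy stand-in: fill the missing steps of a partially observed 2D trajectory before handing it to any downstream predictor. The linear interpolation below is only a placeholder for RMP-YOLO's learned, map-aware recovery module, and every name in it is illustrative:

```python
# Toy illustration of the recover-then-predict paradigm: missing steps
# (None) in an agent's 2D history are filled before prediction. Linear
# interpolation stands in for the paper's learned recovery module that
# also uses map topology and nearby agents; this is not their method.
from typing import List, Optional, Tuple

Point = Tuple[float, float]

def recover_history(history: List[Optional[Point]]) -> List[Point]:
    """Fill None gaps by interpolating between observed neighbours,
    clamping to the nearest observation at either end."""
    known = [(i, p) for i, p in enumerate(history) if p is not None]
    if not known:
        raise ValueError("no observed points to recover from")
    out: List[Point] = []
    for i, p in enumerate(history):
        if p is not None:
            out.append(p)
            continue
        # nearest observed neighbour on each side (clamped at the ends)
        left = max((k for k in known if k[0] < i), default=known[0],
                   key=lambda k: k[0])
        right = min((k for k in known if k[0] > i), default=known[-1],
                    key=lambda k: k[0])
        if left[0] == right[0]:
            out.append(left[1])  # gap outside observed span: clamp
            continue
        t = (i - left[0]) / (right[0] - left[0])
        out.append((left[1][0] + t * (right[1][0] - left[1][0]),
                    left[1][1] + t * (right[1][1] - left[1][1])))
    return out

hist = [(0.0, 0.0), None, (2.0, 2.0), None]
print(recover_history(hist))  # -> [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 2.0)]
```

The point of the paradigm is the separation of concerns: once the history is made whole, an unmodified prediction model can consume it, which is why the paper's recovery module can be bolted onto existing predictors.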