Underwater object-level mapping requires visual foundation models to handle the uncommon and often previously unseen object classes encountered in marine scenarios. In this work, a metric of semantic uncertainty for open-set object detections produced by visual foundation models is computed and incorporated into an object-level uncertainty tracking framework. Object-level uncertainties and geometric relationships between objects then enable robust object-level loop closure detection for unknown object classes. This loop closure detection problem is formulated as a graph-matching problem. Although graph matching is NP-complete in general, a solver for an equivalent formulation of the proposed problem as a graph editing problem is tested on multiple challenging underwater scenes. Results for this solver, as well as three other solvers, demonstrate that the proposed methods are feasible for real-time use in marine environments for robust, open-set, multi-object, semantic-uncertainty-aware loop closure detection. Further experimental results on the KITTI dataset demonstrate that the method generalizes to large-scale terrestrial scenes.
Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching. Kurran Singh, John J. Leonard. arXiv:2409.11555, 2024-09-17.
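The geometric-consistency idea behind the object-level loop closure above can be illustrated with a toy matcher. This is not the authors' graph-edit solver: it simply scores a candidate correspondence set by how many pairwise inter-object distances agree across two maps, and searches assignments by brute force. All names and the distance tolerance are illustrative assumptions.

```python
import itertools
import math

def pairwise_consistency(map_a, map_b, assignment, tol=0.5):
    """Count object pairs whose inter-object distance agrees across maps."""
    score = 0
    for (i, j), (k, l) in itertools.combinations(assignment, 2):
        d_a = math.dist(map_a[i], map_a[k])  # distance in the first map
        d_b = math.dist(map_b[j], map_b[l])  # distance between the matched objects
        if abs(d_a - d_b) < tol:
            score += 1
    return score

def best_matching(map_a, map_b, tol=0.5):
    """Brute-force search over assignments; only feasible for a handful of objects."""
    best, best_score = None, -1
    for perm in itertools.permutations(range(len(map_b)), len(map_a)):
        assignment = list(zip(range(len(map_a)), perm))
        score = pairwise_consistency(map_a, map_b, assignment, tol)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score
```

A real system would replace the factorial search with an efficient solver (the paper uses a graph-editing formulation) and would weight each correspondence by the semantic uncertainty of the detections.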
Embodied vision-based real-world systems, such as mobile robots, require a careful balance between energy consumption, compute latency, and safety constraints to optimize operation across dynamic tasks and contexts. As local computation tends to be restricted, offloading computation to a remote server can save local resources while providing access to high-quality predictions from powerful, large models. However, the resulting communication and latency overhead has limited the usability of cloud models in dynamic, safety-critical, real-time settings. To effectively address this trade-off, we introduce UniLCD, a novel hybrid inference framework for enabling flexible local-cloud collaboration. By efficiently optimizing a flexible routing module via reinforcement learning and a suitable multi-task objective, UniLCD is specifically designed to support the multiple constraints of safety-critical end-to-end mobile systems. We validate the proposed approach using a challenging, crowded navigation task requiring frequent and timely switching between local and cloud operations. UniLCD improves overall performance and efficiency by over 35% compared to state-of-the-art baselines based on various split-computing and early-exit strategies.
UniLCD: Unified Local-Cloud Decision-Making via Reinforcement Learning. Kathakoli Sengupta, Zhongkai Shagguan, Sandesh Bharadwaj, Sanjay Arora, Eshed Ohn-Bar, Renato Mancuso. arXiv:2409.11403, 2024-09-17.
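The local-versus-cloud trade-off that UniLCD learns can be caricatured with a fixed heuristic router. The features, weights, and cost model below are invented for illustration; the paper's point is precisely that this decision is learned with reinforcement learning rather than hand-tuned.

```python
def route(local_confidence, cloud_latency_ms, battery,
          w_risk=1.0, w_energy=0.5, w_latency=0.01):
    """Pick where to run inference by comparing heuristic costs.
    Purely illustrative -- UniLCD learns this routing policy with RL."""
    # Local: risk of a weaker prediction, plus compute energy that
    # matters more as the battery drains.
    local_cost = w_risk * (1.0 - local_confidence) + w_energy * (1.0 - battery)
    # Cloud: strong model assumed, but pay the round-trip latency.
    cloud_cost = w_latency * cloud_latency_ms
    return "local" if local_cost <= cloud_cost else "cloud"
```

Even this crude rule shows why routing must be state-dependent: the same robot should answer differently as confidence, network conditions, and battery evolve.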
Daniel Butterfield, Sandilya Sai Garimella, Nai-Jen Cheng, Lu Gan
We present a Morphology-Informed Heterogeneous Graph Neural Network (MI-HGNN) for learning-based contact perception. The architecture and connectivity of the MI-HGNN are constructed from the robot morphology, in which nodes and edges are robot joints and links, respectively. By incorporating the morphology-informed constraints into a neural network, we improve a learning-based approach using model-based knowledge. We apply the proposed MI-HGNN to two contact perception problems, and conduct extensive experiments using both real-world and simulated data collected using two quadruped robots. Our experiments demonstrate the superiority of our method in terms of effectiveness, generalization ability, model efficiency, and sample efficiency. Our MI-HGNN improved the performance of a state-of-the-art model that leverages robot morphological symmetry by 8.4% with only 0.21% of its parameters. Although MI-HGNN is applied to contact perception problems for legged robots in this work, it can be seamlessly applied to other types of multi-body dynamical systems and has the potential to improve other robot learning frameworks. Our code is made publicly available at https://github.com/lunarlab-gatech/Morphology-Informed-HGNN.
MI-HGNN: Morphology-Informed Heterogeneous Graph Neural Network for Legged Robot Contact Perception. arXiv:2409.11146, 2024-09-17.
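The morphology-to-graph construction above (joints as nodes, links as edges) can be sketched with a plain adjacency list. The tuple format and joint names are assumptions for illustration, not the paper's data format.

```python
from collections import defaultdict

def morphology_graph(links):
    """Build a joint-node / link-edge adjacency map from a list of
    (parent_joint, child_joint, link_name) tuples describing the robot."""
    adj = defaultdict(list)
    for parent, child, link in links:
        # Links are undirected edges between the two joints they connect.
        adj[parent].append((child, link))
        adj[child].append((parent, link))
    return dict(adj)
```

In an HGNN, each joint node would additionally carry sensor features and each edge a type (e.g., thigh vs. shank), so message passing respects the kinematic structure.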
Zhixing Hou, Maoxu Gao, Hang Yu, Mengyu Yang, Chio-In Ieong
This paper introduces a Spiking Diffusion Policy (SDP) learning method for robotic manipulation by integrating Spiking Neurons and Learnable Channel-wise Membrane Thresholds (LCMT) into the diffusion policy model, thereby enhancing computational efficiency and achieving high performance on the evaluated tasks. Specifically, the proposed SDP model employs the U-Net architecture as the backbone for diffusion learning within the Spiking Neural Network (SNN). It strategically places residual connections between the spike convolution operations and the Leaky Integrate-and-Fire (LIF) nodes, thereby preventing disruptions to the spiking states. Additionally, we introduce a temporal encoding block and a temporal decoding block to convert between static data and dynamic data with timestep $T_S$, enabling data to be transmitted within the SNN in spike form. Furthermore, we propose LCMT to enable the adaptive acquisition of membrane potential thresholds, thereby matching the varying membrane potentials and firing rates across channels and avoiding the cumbersome process of manually setting and tuning these hyperparameters. Evaluating the SDP model on seven distinct tasks with SNN timestep $T_S=4$, we achieve results comparable to those of the ANN counterparts, along with faster convergence than the baseline SNN method.
SDP: Spiking Diffusion Policy for Robotic Manipulation with Learnable Channel-Wise Membrane Thresholds. arXiv:2409.11195, 2024-09-17.
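A minimal LIF update with a per-channel threshold shows the quantity that LCMT makes learnable. The soft reset and decay constant below are common SNN conventions, not necessarily the paper's exact formulation.

```python
def lif_step(v, x, threshold, beta=0.9):
    """One Leaky Integrate-and-Fire update per channel: leak, integrate,
    fire, and soft-reset. `threshold` is per-channel, which is the
    parameter LCMT learns instead of hand-tuning a single global value."""
    spikes, v_next = [], []
    for vi, xi, th in zip(v, x, threshold):
        u = beta * vi + xi            # leaky integration of the input
        s = 1.0 if u >= th else 0.0   # fire when the channel's threshold is crossed
        spikes.append(s)
        v_next.append(u - th * s)     # soft reset: subtract the threshold on a spike
    return spikes, v_next
```

With a learnable per-channel `threshold`, channels with high input magnitudes can raise their firing bar while quiet channels lower it, balancing firing rates across the network.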
Yihong Xu, Victor Letzelter, Mickaël Chen, Éloi Zablocki, Matthieu Cord
In autonomous driving, motion prediction aims at forecasting the future trajectories of nearby agents, helping the ego vehicle to anticipate behaviors and drive safely. A key challenge is generating a diverse set of future predictions, commonly addressed using data-driven models with Multiple Choice Learning (MCL) architectures and Winner-Takes-All (WTA) training objectives. However, these methods face initialization sensitivity and training instabilities. Additionally, to compensate for limited performance, some approaches rely on training with a large set of hypotheses, requiring a post-selection step during inference to significantly reduce the number of predictions. To tackle these issues, we take inspiration from annealed MCL, a recently introduced technique that improves the convergence properties of MCL methods through an annealed Winner-Takes-All loss (aWTA). In this paper, we demonstrate how the aWTA loss can be integrated with state-of-the-art motion forecasting models to enhance their performance using only a minimal set of hypotheses, eliminating the need for the cumbersome post-selection step. Our approach can be easily incorporated into any trajectory prediction model normally trained using WTA and yields significant improvements. To facilitate the application of our approach to future motion forecasting models, the code will be made publicly available upon acceptance: https://github.com/valeoai/MF_aWTA.
Annealed Winner-Takes-All for Motion Forecasting. arXiv:2409.11172, 2024-09-17.
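The aWTA idea (relax winner-takes-all into a temperature-weighted softmin over per-hypothesis errors, annealing the temperature toward zero during training) can be sketched in a few lines. This is an illustrative scalar version; in practice the weighting is applied inside the model's training loss over trajectory hypotheses.

```python
import math

def awta_loss(errors, temperature):
    """Annealed Winner-Takes-All: weight each hypothesis's error by a
    softmin over the errors. As temperature -> 0 this recovers plain WTA
    (only the best hypothesis receives gradient); large temperatures
    spread the signal over all hypotheses."""
    m = min(errors)  # subtract the min for numerical stability
    w = [math.exp(-(e - m) / temperature) for e in errors]
    z = sum(w)
    return sum(wi / z * e for wi, e in zip(w, errors))
```

Annealing starts training with a high temperature (all hypotheses updated, avoiding bad initialization locking in a single winner) and cools toward the WTA objective.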
The workshop is affiliated with the 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2024), August 26-30, 2024, Pasadena, CA, USA. It is designed as a half-day event, running from 9:00 to 12:30 (Pacific Time). It accommodates both in-person and virtual attendees (via Zoom), ensuring flexible participation. The agenda includes a diverse range of sessions: two keynote speeches offering insightful perspectives, two dedicated paper presentation sessions, an interactive panel discussion to foster dialogue among experts and facilitate deeper dives into specific topics, and a 15-minute coffee break. Workshop website: https://sites.google.com/view/interaiworkshops/home.
The 1st InterAI Workshop: Interactive AI for Human-centered Robotics. Yuchong Zhang, Elmira Yadollahi, Yong Ma, Di Fu, Iolanda Leite, Danica Kragic. arXiv:2409.11150, 2024-09-17.
Botao He, Guofei Chen, Cornelia Fermuller, Yiannis Aloimonos, Ji Zhang
This paper presents a novel method for real-time 3D navigation in large-scale, complex environments using a hierarchical 3D visibility graph (V-graph). The proposed algorithm addresses the computational challenges of V-graph construction and shortest-path search on the graph simultaneously. By introducing hierarchical 3D V-graph construction with a heuristic visibility update, the 3D V-graph is constructed in O(K * n^2 log n) time, which guarantees real-time performance. The proposed iterative divide-and-conquer path search can achieve near-optimal path solutions within the constraints of real-time operation. Extensive experiments in simulated and real-world environments validate that our algorithm reduces travel time by 42%, achieves up to 24.8% higher trajectory efficiency, and runs orders of magnitude faster than most benchmarks in complex environments. The code and the developed simulator have been open-sourced to facilitate future research.
Air-FAR: Fast and Adaptable Routing for Aerial Navigation in Large-scale Complex Unknown Environments. arXiv:2409.11188, 2024-09-17.
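Once a visibility graph exists, the path-search layer is a standard shortest-path problem. A plain Dijkstra over an explicit adjacency map (not the paper's hierarchical divide-and-conquer search) looks like the following; it assumes the goal is reachable from the start.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra over a visibility graph given as {node: [(neighbor, cost), ...]}.
    Returns (path, total_cost); assumes `goal` is reachable from `start`."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the path by walking predecessors back to the start.
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1], dist[goal]
```

The paper's contribution is upstream of this step: building and incrementally updating the V-graph fast enough that a search like this can run in real time on large 3D scenes.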
Kaustubh Joshi, Tianchen Liu, Alan Williams, Matthew Gray, Xiaomin Lin, Nikhil Chopra
Water quality mapping for critical parameters such as temperature, salinity, and turbidity is crucial for assessing an aquaculture farm's health and yield capacity. Traditional approaches involve using boats or human divers, which are time-constrained and lack depth variability. This work presents an innovative approach to 3D water quality mapping in shallow water environments using a BlueROV2 equipped with GPS and a water quality sensor. This system allows for accurate location correction by resurfacing when errors occur. This study is being conducted at an oyster farm in the Chesapeake Bay, USA, providing a more comprehensive and precise water quality analysis in aquaculture settings.
3D Water Quality Mapping using Invariant Extended Kalman Filtering for Underwater Robot Localization. arXiv:2409.11578, 2024-09-17.
Multi-robot collaboration for target tracking presents significant challenges in hazardous environments, including robot failures, dynamic priority changes, and other unpredictable factors. These challenges are compounded in adversarial settings when the environment is unknown. In this paper, we propose a resilient and adaptive framework for multi-robot, multi-target tracking in environments with unknown sensing and communication danger zones. The damage posed by these zones is temporary, allowing robots to track targets while accepting the risk of entering dangerous areas. We formulate the problem as an optimization with soft chance constraints, enabling real-time adjustments to robot behavior based on varying types of dangers and failures. An adaptive replanning strategy is introduced, featuring different triggers to improve group performance. This approach allows for dynamic prioritization of target tracking and risk aversion or resilience, depending on evolving resources and real-time conditions.
Resilient and Adaptive Replanning for Multi-Robot Target Tracking with Sensing and Communication Danger Zones. Peihan Li, Yuwei Wu, Jiazhen Liu, Gaurav S. Sukhatme, Vijay Kumar, Lifeng Zhou. arXiv:2409.11230, 2024-09-17.
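A soft chance constraint can be illustrated as a hinge penalty on the amount by which the estimated danger probability exceeds an allowed level. The weights and the threshold `delta` below are illustrative assumptions, not the paper's tuned values.

```python
def soft_chance_penalty(p_danger, delta, weight=10.0):
    """Soft chance constraint: penalize only the amount by which the
    estimated danger probability exceeds the allowed level delta.
    Unlike a hard constraint, the plan stays feasible when risk is high."""
    return weight * max(0.0, p_danger - delta)

def tracking_objective(coverage, p_danger, delta=0.1):
    """Toy objective: reward target coverage, softly discourage risk."""
    return coverage - soft_chance_penalty(p_danger, delta)
```

The soft form is what lets robots deliberately accept temporary danger-zone exposure when the tracking reward justifies it, rather than failing outright as a hard constraint would.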
Paul Werner Lödige, Maximilian Xiling Li, Rudolf Lioutikov
Movement Primitives (MPs) are a well-established method for representing and generating modular robot trajectories. This work presents FA-ProDMP, a new approach that introduces force awareness to Probabilistic Dynamic Movement Primitives (ProDMP). FA-ProDMP adapts the trajectory at runtime to account for measured and desired forces. It offers smooth trajectories and captures position and force correlations over multiple trajectories, e.g., a set of human demonstrations. FA-ProDMP supports multiple axes of force and is thus agnostic to Cartesian or joint-space control. This makes FA-ProDMP a valuable tool for learning contact-rich manipulation tasks from demonstration, such as polishing, cutting, or industrial assembly. In order to reliably evaluate FA-ProDMP, this work additionally introduces a modular, 3D-printed task suite called POEMPEL, inspired by the popular Lego Technic pins. POEMPEL mimics industrial peg-in-hole assembly tasks with force requirements. It offers several adjustable parameters, such as position, orientation, and plug stiffness level, thus varying the direction and amount of required force. Our experiments show that FA-ProDMP outperforms other MP formulations on the POEMPEL setup and an electrical power plug insertion task, due to its replanning capabilities based on the measured forces.
Use the Force, Bot! -- Force-Aware ProDMP with Event-Based Replanning. Paul Werner Lödige, Maximilian Xiling Li, Rudolf Lioutikov. arXiv:2409.11144, 2024-09-17.
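The runtime force adaptation can be caricatured as a goal-attractor step whose target is nudged by the force-tracking error. The gains and coupling constant are invented for illustration; this is not FA-ProDMP's actual update rule.

```python
def force_adapted_step(y, g, f_meas, f_des, dt=0.01, k=25.0, kf=0.002):
    """One step of a simple attractor toward goal g, with the target
    shifted in proportion to the force tracking error (f_des - f_meas).
    A toy stand-in for runtime force adaptation in an MP framework."""
    g_adj = g + kf * (f_des - f_meas)  # force error nudges the target
    return y + dt * k * (g_adj - y)    # move a fraction of the way there
```

When the desired force exceeds the measured force (e.g., a plug not yet seated), the adjusted goal pushes the trajectory further along the insertion axis; when forces match, the step reduces to plain goal convergence.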