Charles A. Meehan, Paul Rademacher, Mark Roberts, Laura M. Hiatt
Robot manipulation in real-world settings often requires adapting the robot's behavior to the current situation, such as by changing the sequence in which policies execute to achieve the desired task. Problematically, however, we show that a novel sequence of five deep RL options composed to perform a pick-and-place task is unlikely to complete successfully, even if their initiation and termination conditions align. We propose a framework to determine a priori whether a sequence will succeed, and examine three approaches that adapt options to sequence successfully if it will not. Crucially, our adaptation methods consider the actual subset of points that an option is trained from or that it ends in: (1) train the second option to start where the first ends; (2) train the first option to reach the centroid of where the second starts; and (3) train the first option to reach the median of where the second starts. Our results show that our framework and adaptation methods have promise in adapting options to work in novel sequences.
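As a concrete illustration of adaptations (2) and (3), the sketch below retargets the first option's training goal to the centroid or the coordinate-wise median of sampled states in which the second option has been observed to start. The function name and the sample data are hypothetical; the paper's actual retraining procedure is not detailed in the abstract.

```python
import numpy as np

def retarget_goal(second_option_start_states: np.ndarray, mode: str = "centroid") -> np.ndarray:
    """Compute a retraining goal for the first option from sampled start
    states of the second option (one state per row).

    mode="centroid" averages the samples; mode="median" takes the
    coordinate-wise median, which is more robust to outlying samples.
    """
    if mode == "centroid":
        return second_option_start_states.mean(axis=0)
    if mode == "median":
        return np.median(second_option_start_states, axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical usage: states where the 'place' option was observed to start.
starts = np.random.default_rng(0).normal(loc=[0.4, 0.1, 0.2], scale=0.02, size=(200, 3))
goal_centroid = retarget_goal(starts, "centroid")   # adaptation (2)
goal_median = retarget_goal(starts, "median")       # adaptation (3)
```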
{"title":"Composing Option Sequences by Adaptation: Initial Results","authors":"Charles A. Meehan, Paul Rademacher, Mark Roberts, Laura M. Hiatt","doi":"arxiv-2409.08195","DOIUrl":"https://doi.org/arxiv-2409.08195","url":null,"abstract":"Robot manipulation in real-world settings often requires adapting the robot's\u0000behavior to the current situation, such as by changing the sequences in which\u0000policies execute to achieve the desired task. Problematically, however, we show\u0000that composing a novel sequence of five deep RL options to perform a\u0000pick-and-place task is unlikely to successfully complete, even if their\u0000initiation and termination conditions align. We propose a framework to\u0000determine whether sequences will succeed a priori, and examine three approaches\u0000that adapt options to sequence successfully if they will not. Crucially, our\u0000adaptation methods consider the actual subset of points that the option is\u0000trained from or where it ends: (1) trains the second option to start where the\u0000first ends; (2) trains the first option to reach the centroid of where the\u0000second starts; and (3) trains the first option to reach the median of where the\u0000second starts. Our results show that our framework and adaptation methods have\u0000promise in adapting options to work in novel sequences.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
James Berneburg, Xuan Wang, Xuesu Xiao, Daigo Shishika
This paper presents a game-theoretic formulation of a graph traversal problem, with applications to robots moving through hazardous environments in the presence of an adversary, as in military and security applications. The blue team of robots moves in an environment modeled by a time-varying graph, attempting to reach a goal with minimum cost, while the red team controls how the graph changes in order to maximize that cost. The problem is formulated as a stochastic game so that Nash equilibrium strategies can be computed numerically. Bounds are provided for the game value, with a guarantee that it solves the original problem. Numerical simulations demonstrate the results and the effectiveness of this method, particularly showing the benefit of mixing actions for both players, as well as beneficial coordinated behavior, where blue robots split up and/or synchronize to traverse risky edges.
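Nash equilibria of zero-sum stochastic games like this one are typically computed numerically by Shapley-style value iteration, which solves a zero-sum matrix game at every state on every sweep. Below is a minimal sketch of that per-state building block, posed as a linear program; the edge-risk payoff matrix in the usage line is purely hypothetical, and the paper's exact solver is not specified in the abstract.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(payoff: np.ndarray):
    """Value and optimal mixed strategy for the row player, who maximizes the
    expected payoff of a zero-sum matrix game (rows = row-player actions,
    columns = column-player actions).  Solved as a linear program."""
    m, n = payoff.shape
    # Variables z = [x_1 .. x_m, v]; maximize v  <=>  minimize -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every column j:  v - sum_i x_i * payoff[i, j] <= 0.
    A_ub = np.hstack([-payoff.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # The mixed strategy must sum to one.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]   # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:m]                 # game value, row mixed strategy

# Hypothetical edge-risk costs for blue (rows) vs. red (columns), negated so
# that the row player maximizes.
value, blue_mix = solve_matrix_game(-np.array([[3.0, 1.0], [0.0, 2.0]]))
```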
{"title":"Multi-Robot Coordination Induced in Hazardous Environments through an Adversarial Graph-Traversal Game","authors":"James Berneburg, Xuan Wang, Xuesu Xiao, Daigo Shishika","doi":"arxiv-2409.08222","DOIUrl":"https://doi.org/arxiv-2409.08222","url":null,"abstract":"This paper presents a game theoretic formulation of a graph traversal\u0000problem, with applications to robots moving through hazardous environments in\u0000the presence of an adversary, as in military and security applications. The\u0000blue team of robots moves in an environment modeled by a time-varying graph,\u0000attempting to reach some goal with minimum cost, while the red team controls\u0000how the graph changes to maximize the cost. The problem is formulated as a\u0000stochastic game, so that Nash equilibrium strategies can be computed\u0000numerically. Bounds are provided for the game value, with a guarantee that it\u0000solves the original problem. Numerical simulations demonstrate the results and\u0000the effectiveness of this method, particularly showing the benefit of mixing\u0000actions for both players, as well as beneficial coordinated behavior, where\u0000blue robots split up and/or synchronize to traverse risky edges.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuolong Chen, Xingxing Li, Shengyu Li, Yuxuan Zhou
Visual-inertial systems have been widely studied and applied in the last two decades, mainly due to their low cost and power consumption, small footprint, and high availability. This trend has in turn produced a large number of visual-inertial calibration methods, as accurate spatiotemporal parameters between sensors are a prerequisite for visual-inertial fusion. In our previous work, iKalibr, a continuous-time visual-inertial calibration method was proposed as part of a one-shot, multi-sensor, resilient spatiotemporal calibration. While requiring no artificial target brings considerable convenience, computationally expensive pose estimation is required during initialization and batch optimization, limiting its applicability. Fortunately, this can be vastly improved for RGBD cameras, whose additional depth information allows mapping-free ego-velocity estimation to replace mapping-based pose estimation. In this paper, we present a continuous-time, ego-velocity-estimation-based RGBD-inertial spatiotemporal calibration method, termed iKalibr-RGBD, which is also targetless but computationally efficient. The general pipeline of iKalibr-RGBD is inherited from iKalibr, composed of a rigorous initialization procedure and several continuous-time batch optimizations. The implementation of iKalibr-RGBD is open-sourced at https://github.com/Unsigned-Long/iKalibr to benefit the research community.
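The paper's calibration pipeline is continuous-time, but the core idea of replacing pose estimation with ego-velocity estimation can be illustrated with a discrete, frame-to-frame sketch. Assuming a static scene, a point p tracked in the sensor frame satisfies dp/dt = -w x p - v, which is linear in the body twist (w, v); the helper names below are illustrative only.

```python
import numpy as np

def skew(p):
    """3x3 skew-symmetric matrix such that skew(p) @ w == np.cross(p, w)."""
    return np.array([[0.0, -p[2], p[1]],
                     [p[2], 0.0, -p[0]],
                     [-p[1], p[0], 0.0]])

def ego_velocity(points_t, points_t1, dt):
    """Least-squares body twist (angular velocity w, linear velocity v) of the
    sensor from 3D points tracked between two depth frames, assuming a static
    scene.  For a static point p in the sensor frame, dp/dt = -w x p - v,
    which equals skew(p) @ w - v and is therefore linear in (w, v)."""
    A, b = [], []
    for p0, p1 in zip(points_t, points_t1):
        A.append(np.hstack([skew(p0), -np.eye(3)]))   # row maps [w; v] -> p0 x w - v
        b.append((p1 - p0) / dt)                      # observed point velocity
    A, b = np.vstack(A), np.concatenate(b)
    twist, *_ = np.linalg.lstsq(A, b, rcond=None)
    return twist[:3], twist[3:]                       # w, v
```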
{"title":"iKalibr-RGBD: Partially-Specialized Target-Free Visual-Inertial Spatiotemporal Calibration For RGBDs via Continuous-Time Velocity Estimation","authors":"Shuolong Chen, Xingxing Li, Shengyu Li, Yuxuan Zhou","doi":"arxiv-2409.07116","DOIUrl":"https://doi.org/arxiv-2409.07116","url":null,"abstract":"Visual-inertial systems have been widely studied and applied in the last two\u0000decades, mainly due to their low cost and power consumption, small footprint,\u0000and high availability. Such a trend simultaneously leads to a large amount of\u0000visual-inertial calibration methods being presented, as accurate spatiotemporal\u0000parameters between sensors are a prerequisite for visual-inertial fusion. In\u0000our previous work, i.e., iKalibr, a continuous-time-based visual-inertial\u0000calibration method was proposed as a part of one-shot multi-sensor resilient\u0000spatiotemporal calibration. While requiring no artificial target brings\u0000considerable convenience, computationally expensive pose estimation is demanded\u0000in initialization and batch optimization, limiting its availability.\u0000Fortunately, this could be vastly improved for the RGBDs with additional depth\u0000information, by employing mapping-free ego-velocity estimation instead of\u0000mapping-based pose estimation. In this paper, we present the continuous-time\u0000ego-velocity estimation-based RGBD-inertial spatiotemporal calibration, termed\u0000as iKalibr-RGBD, which is also targetless but computationally efficient. The\u0000general pipeline of iKalibr-RGBD is inherited from iKalibr, composed of a\u0000rigorous initialization procedure and several continuous-time batch\u0000optimizations. The implementation of iKalibr-RGBD is open-sourced at\u0000(https://github.com/Unsigned-Long/iKalibr) to benefit the research community.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jonas Stolle, Philip Arm, Mayank Mittal, Marco Hutter
Pedipulation leverages the feet of legged robots for mobile manipulation, eliminating the need for dedicated robotic arms. While previous works have showcased blind and task-specific pedipulation skills, they fail to account for static and dynamic obstacles in the environment. To address this limitation, we introduce a reinforcement learning-based approach to train a whole-body obstacle-aware policy that tracks foot position commands while simultaneously avoiding obstacles. Despite training the policy in only five different static scenarios in simulation, we show that it generalizes to unknown environments with different numbers and types of obstacles. We analyze the performance of our method through a set of simulation experiments and successfully deploy the learned policy on the ANYmal quadruped, demonstrating its capability to follow foot commands while navigating around static and dynamic obstacles.
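The abstract does not spell out the training reward, but an obstacle-aware foot-tracking policy of this kind is typically trained with a reward that combines a tracking term and a proximity penalty. The sketch below is one such illustrative shaping; the weights, distances, and term choices are placeholders rather than the paper's actual formulation.

```python
import numpy as np

def pedipulation_reward(foot_pos, foot_cmd, min_obstacle_dist,
                        sigma=0.1, safe_dist=0.25, w_track=1.0, w_avoid=0.5):
    """Illustrative per-step reward: a Gaussian kernel on the foot-tracking
    error plus a penalty that grows as the closest point of the robot
    approaches an obstacle.  All constants are placeholder assumptions."""
    track = np.exp(-np.sum((np.asarray(foot_pos) - np.asarray(foot_cmd)) ** 2) / sigma ** 2)
    avoid = -max(0.0, safe_dist - min_obstacle_dist) / safe_dist
    return w_track * track + w_avoid * avoid
```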
{"title":"Perceptive Pedipulation with Local Obstacle Avoidance","authors":"Jonas Stolle, Philip Arm, Mayank Mittal, Marco Hutter","doi":"arxiv-2409.07195","DOIUrl":"https://doi.org/arxiv-2409.07195","url":null,"abstract":"Pedipulation leverages the feet of legged robots for mobile manipulation,\u0000eliminating the need for dedicated robotic arms. While previous works have\u0000showcased blind and task-specific pedipulation skills, they fail to account for\u0000static and dynamic obstacles in the environment. To address this limitation, we\u0000introduce a reinforcement learning-based approach to train a whole-body\u0000obstacle-aware policy that tracks foot position commands while simultaneously\u0000avoiding obstacles. Despite training the policy in only five different static\u0000scenarios in simulation, we show that it generalizes to unknown environments\u0000with different numbers and types of obstacles. We analyze the performance of\u0000our method through a set of simulation experiments and successfully deploy the\u0000learned policy on the ANYmal quadruped, demonstrating its capability to follow\u0000foot commands while navigating around static and dynamic obstacles.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, Jonathan Kelly
Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image. Among these, sparse feature matching stands out as an efficient, versatile, and generally lightweight approach with numerous applications. However, feature-based methods often struggle with significant viewpoint and appearance changes, leading to matching failures and inaccurate pose estimates. To overcome this limitation, we propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features. By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking. Given an initial pose estimate, we first synthesize descriptors from the voxels using volumetric rendering and then perform feature matching to estimate the camera pose. This methodology enables the generation of descriptors for unseen views, enhancing robustness to view changes. We extensively evaluate our method on the 7-Scenes and Cambridge Landmarks datasets. Our results show that our method significantly outperforms existing state-of-the-art feature representation techniques in indoor environments, achieving up to a 39% improvement in median translation error. Additionally, our approach yields comparable results to other methods for outdoor scenarios while maintaining lower memory and computational costs.
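The descriptor synthesis step can be pictured as standard volumetric rendering applied to feature vectors instead of colors: each ray sample contributes its voxel's descriptor weighted by the usual alpha-compositing weights. The sketch below shows that compositing; FaVoR's exact rendering model may differ.

```python
import numpy as np

def render_descriptor(densities, descriptors):
    """Alpha-composite per-sample descriptors along a ray using the standard
    volumetric rendering weights w_i = alpha_i * prod_{j<i}(1 - alpha_j).

    densities:   (N,) opacities alpha_i in [0, 1] at the ray samples
    descriptors: (N, D) descriptor stored in the voxel hit by each sample
    Returns the (D,) rendered descriptor for this view, which can then be
    matched against query-image features to estimate the camera pose."""
    alphas = np.clip(densities, 0.0, 1.0)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * transmittance
    return (weights[:, None] * descriptors).sum(axis=0)
```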
{"title":"FaVoR: Features via Voxel Rendering for Camera Relocalization","authors":"Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, Jonathan Kelly","doi":"arxiv-2409.07571","DOIUrl":"https://doi.org/arxiv-2409.07571","url":null,"abstract":"Camera relocalization methods range from dense image alignment to direct\u0000camera pose regression from a query image. Among these, sparse feature matching\u0000stands out as an efficient, versatile, and generally lightweight approach with\u0000numerous applications. However, feature-based methods often struggle with\u0000significant viewpoint and appearance changes, leading to matching failures and\u0000inaccurate pose estimates. To overcome this limitation, we propose a novel\u0000approach that leverages a globally sparse yet locally dense 3D representation\u0000of 2D features. By tracking and triangulating landmarks over a sequence of\u0000frames, we construct a sparse voxel map optimized to render image patch\u0000descriptors observed during tracking. Given an initial pose estimate, we first\u0000synthesize descriptors from the voxels using volumetric rendering and then\u0000perform feature matching to estimate the camera pose. This methodology enables\u0000the generation of descriptors for unseen views, enhancing robustness to view\u0000changes. We extensively evaluate our method on the 7-Scenes and Cambridge\u0000Landmarks datasets. Our results show that our method significantly outperforms\u0000existing state-of-the-art feature representation techniques in indoor\u0000environments, achieving up to a 39% improvement in median translation error.\u0000Additionally, our approach yields comparable results to other methods for\u0000outdoor scenarios while maintaining lower memory and computational costs.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yu Chen, Mahshid Mansouri, Chenzhang Xiao, Ze Wang, Elizabeth T. Hsiao-Wecksler, William R. Norris
This study introduces a shared-control approach for collision avoidance in PURE, a self-balancing riding ballbot marked by its dynamic stability, omnidirectional movement, and hands-free interface. Integrated with a sensor array and a novel Passive Artificial Potential Field (PAPF) method, PURE provides intuitive navigation with deceleration assistance and haptic/audio feedback, effectively mitigating collision risks. This approach addresses the limitations of traditional APF methods, such as control oscillations and unnecessary speed reduction in challenging scenarios. A human-robot interaction experiment with 20 manual wheelchair users and able-bodied individuals was conducted to evaluate indoor navigation and obstacle avoidance with the proposed shared-control algorithm. Results indicated that shared control significantly reduced collisions and cognitive load without affecting travel speed, offering intuitive and safe operation. These findings highlight the shared-control system's suitability for enhancing collision avoidance in self-balancing mobility devices, a relatively unexplored area in assistive mobility research.
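The abstract does not give the PAPF equations, but one plausible reading of a "passive" artificial potential field is that repulsive forces are only ever allowed to decelerate the rider's commanded motion, never to inject motion of their own. The sketch below follows that reading; the gains, distances, and filtering rule are assumptions.

```python
import numpy as np

def papf_shared_control(v_user, obstacles, d0=1.5, gain=1.0):
    """Illustrative 'passive' potential-field filter: repulsion from nearby
    obstacles is only used to scale the rider's commanded planar velocity
    down (deceleration assistance), never to push the robot sideways.

    v_user:    commanded planar velocity (2,)
    obstacles: iterable of obstacle positions relative to the robot (2,)
    """
    v = np.asarray(v_user, dtype=float)
    repulse = np.zeros(2)
    for obs in obstacles:
        d = np.linalg.norm(obs)
        if 1e-6 < d < d0:
            # Classic inverse-distance repulsion pointing away from the obstacle.
            repulse += gain * (1.0 / d - 1.0 / d0) * (-obs / d)
    speed = np.linalg.norm(v)
    if speed < 1e-6:
        return v
    # Keep only the repulsion component that opposes the commanded motion.
    braking = max(0.0, -repulse @ (v / speed))
    return max(0.0, 1.0 - braking) * v
```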
{"title":"Enabling Shared-Control for A Riding Ballbot System","authors":"Yu Chen, Mahshid Mansouri, Chenzhang Xiao, Ze Wang, Elizabeth T. Hsiao-Wecksler, William R. Norris","doi":"arxiv-2409.07013","DOIUrl":"https://doi.org/arxiv-2409.07013","url":null,"abstract":"This study introduces a shared-control approach for collision avoidance in a\u0000self-balancing riding ballbot, called PURE, marked by its dynamic stability,\u0000omnidirectional movement, and hands-free interface. Integrated with a sensor\u0000array and a novel Passive Artificial Potential Field (PAPF) method, PURE\u0000provides intuitive navigation with deceleration assistance and haptic/audio\u0000feedback, effectively mitigating collision risks. This approach addresses the\u0000limitations of traditional APF methods, such as control oscillations and\u0000unnecessary speed reduction in challenging scenarios. A human-robot interaction\u0000experiment, with 20 manual wheelchair users and able-bodied individuals, was\u0000conducted to evaluate the performance of indoor navigation and obstacle\u0000avoidance with the proposed shared-control algorithm. Results indicated that\u0000shared-control significantly reduced collisions and cognitive load without\u0000affecting travel speed, offering intuitive and safe operation. These findings\u0000highlight the shared-control system's suitability for enhancing collision\u0000avoidance in self-balancing mobility devices, a relatively unexplored area in\u0000assistive mobility research.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul Chauchat (AMU SCI, AMU, LIS, DIAPRO), Silvère Bonnabel (CAOR), Axel Barrau
We consider the problem of observer design for a nonholonomic car (more generally, a wheeled robot) equipped with wheel-speed sensors but with unknown wheel radius, and whose position is measured via a GNSS antenna placed at an unknown position in the car. In a tutorial and unified exposition, we recall the recent theory of two-frame systems within the field of invariant Kalman filtering. We then show how to adapt it geometrically to address the considered problem, although the problem seems at first sight to be out of the theory's scope. This yields an invariant extended Kalman filter having autonomous error equations and state-independent Jacobians, which is shown to work remarkably well in simulations. The proposed novel construction thus extends the application scope of invariant filtering.
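The full continuous-time invariant EKF is beyond a short snippet, but the two models at the heart of the problem are simple: wheel-odometry propagation whose forward speed is scaled by the unknown wheel radius, and a GNSS position measurement offset by the unknown lever arm. The sketch below shows planar, discrete-time versions of both; the state layout is hypothetical and no invariant-filtering machinery is included.

```python
import numpy as np

def propagate(state, wheel_speed_rad_s, yaw_rate, dt):
    """Unicycle-style propagation with the unknown wheel radius kept as a
    state to be estimated (a simplified discrete sketch, not the paper's
    continuous-time invariant EKF)."""
    x, y, theta, radius, lx, ly = state
    v = radius * wheel_speed_rad_s             # forward speed from wheel odometry
    x += v * np.cos(theta) * dt
    y += v * np.sin(theta) * dt
    theta += yaw_rate * dt
    return np.array([x, y, theta, radius, lx, ly])

def gnss_measurement(state):
    """Predicted GNSS position: vehicle position plus the unknown antenna
    lever arm (lx, ly) rotated into the world frame."""
    x, y, theta, _, lx, ly = state
    c, s = np.cos(theta), np.sin(theta)
    return np.array([x + c * lx - s * ly,
                     y + s * lx + c * ly])
```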
{"title":"Invariant filtering for wheeled vehicle localization with unknown wheel radius and unknown GNSS lever arm","authors":"Paul ChauchatAMU SCI, AMU, LIS, DIAPRO, Silvère BonnabelCAOR, Axel Barrau","doi":"arxiv-2409.07050","DOIUrl":"https://doi.org/arxiv-2409.07050","url":null,"abstract":"We consider the problem of observer design for a nonholonomic car (more\u0000generally a wheeled robot) equipped with wheel speeds with unknown wheel\u0000radius, and whose position is measured via a GNSS antenna placed at an unknown\u0000position in the car. In a tutorial and unified exposition, we recall the recent\u0000theory of two-frame systems within the field of invariant Kalman filtering. We\u0000then show how to adapt it geometrically to address the considered problem,\u0000although it seems at first sight out of its scope. This yields an invariant\u0000extended Kalman filter having autonomous error equations, and state-independent\u0000Jacobians, which is shown to work remarkably well in simulations. The proposed\u0000novel construction thus extends the application scope of invariant filtering.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Houston Claure, Kate Candon, Inyoung Shin, Marynel Vázquez
People deeply care about how fairly they are treated by robots. The established paradigm for probing fairness in Human-Robot Interaction (HRI) involves measuring the perceived fairness of a robot at the conclusion of an interaction. However, such an approach is limited because interactions vary over time, potentially causing fairness perceptions to change as well. To validate this idea, we conducted a 2x2 user study with a mixed design (N=40) in which we investigated two factors: the timing of unfair robot actions (early or late in an interaction) and the beneficiary of those actions (either another robot or the participant). Our results show that fairness judgments are not static: they can shift based on the timing of unfair robot actions. Further, we explored using perceptions of three key factors (reduced welfare, conduct, and moral transgression) proposed by a Fairness Theory from Organizational Justice to predict momentary perceptions of fairness in our study. Interestingly, we found that the reduced welfare and moral transgression factors alone were better predictors than all factors together. Our findings reinforce the idea that unfair robot behavior can shape perceptions of group dynamics and trust towards a robot, and they pave the way toward future research on moment-to-moment fairness perceptions.
{"title":"Dynamic Fairness Perceptions in Human-Robot Interaction","authors":"Houston Claure, Kate Candon, Inyoung Shin, Marynel Vázquez","doi":"arxiv-2409.07560","DOIUrl":"https://doi.org/arxiv-2409.07560","url":null,"abstract":"People deeply care about how fairly they are treated by robots. The\u0000established paradigm for probing fairness in Human-Robot Interaction (HRI)\u0000involves measuring the perception of the fairness of a robot at the conclusion\u0000of an interaction. However, such an approach is limited as interactions vary\u0000over time, potentially causing changes in fairness perceptions as well. To\u0000validate this idea, we conducted a 2x2 user study with a mixed design (N=40)\u0000where we investigated two factors: the timing of unfair robot actions (early or\u0000late in an interaction) and the beneficiary of those actions (either another\u0000robot or the participant). Our results show that fairness judgments are not\u0000static. They can shift based on the timing of unfair robot actions. Further, we\u0000explored using perceptions of three key factors (reduced welfare, conduct, and\u0000moral transgression) proposed by a Fairness Theory from Organizational Justice\u0000to predict momentary perceptions of fairness in our study. Interestingly, we\u0000found that the reduced welfare and moral transgression factors were better\u0000predictors than all factors together. Our findings reinforce the idea that\u0000unfair robot behavior can shape perceptions of group dynamics and trust towards\u0000a robot and pave the path to future research directions on moment-to-moment\u0000fairness perceptions","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuzhao Huang, Akira Seino, Fuyuki Tokuda, Akinari Kobayashi, Dayuan Chen, Yasuhisa Hirata, Norman C. Tien, Kazuhiro Kosuge
Seams are information-rich components of garments. The presence of different types of seams and their combinations helps to select grasping points for garment handling. In this paper, we propose a new Seam-Informed Strategy (SIS) for determining actions to handle a garment, such as grasping and unfolding a T-shirt. Candidates for a pair of grasping points for a dual-arm manipulator system are extracted using the proposed Seam Feature Extraction Method (SFEM). A pair of grasping points for the robot system is then selected by the proposed Decision Matrix Iteration Method (DMIM). The decision matrix is first computed from multiple human demonstrations and then updated with the robot's execution results to improve its grasping and unfolding performance. Note that the proposed scheme is trained on real data without relying on simulation. Experimental results demonstrate the effectiveness of the proposed strategy. The project video is available at https://github.com/lancexz/sis.
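The abstract does not specify the DMIM update rule, so the sketch below is only a simple stand-in: a per-category success statistic that is seeded by demonstrations and refined by robot execution outcomes, then used to rank candidate grasping-point pairs. Class and method names are hypothetical.

```python
import numpy as np

class DecisionMatrix:
    """Illustrative stand-in for a decision matrix scored per category of
    candidate grasping-point pairs: initialized from human demonstrations and
    refined with robot execution outcomes."""

    def __init__(self, n_categories):
        self.successes = np.zeros(n_categories)
        self.attempts = np.zeros(n_categories)

    def add_outcome(self, category, success):
        """Record one demonstration or robot execution result."""
        self.attempts[category] += 1
        self.successes[category] += float(success)

    def score(self, category):
        # Laplace-smoothed success rate so unseen categories are not ruled out.
        return (self.successes[category] + 1.0) / (self.attempts[category] + 2.0)

    def best(self, candidate_categories):
        """Pick the candidate category with the highest estimated success rate."""
        return max(candidate_categories, key=self.score)
```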
{"title":"SIS: Seam-Informed Strategy for T-shirt Unfolding","authors":"Xuzhao Huang, Akira Seino, Fuyuki Tokuda, Akinari Kobayashi, Dayuan Chen, Yasuhisa Hirata, Norman C. Tien, Kazuhiro Kosuge","doi":"arxiv-2409.06990","DOIUrl":"https://doi.org/arxiv-2409.06990","url":null,"abstract":"Seams are information-rich components of garments. The presence of different\u0000types of seams and their combinations helps to select grasping points for\u0000garment handling. In this paper, we propose a new Seam-Informed Strategy (SIS)\u0000for finding actions for handling a garment, such as grasping and unfolding a\u0000T-shirt. Candidates for a pair of grasping points for a dual-arm manipulator\u0000system are extracted using the proposed Seam Feature Extraction Method (SFEM).\u0000A pair of grasping points for the robot system is selected by the proposed\u0000Decision Matrix Iteration Method (DMIM). The decision matrix is first computed\u0000by multiple human demonstrations and updated by the robot execution results to\u0000improve the grasping and unfolding performance of the robot. Note that the\u0000proposed scheme is trained on real data without relying on simulation.\u0000Experimental results demonstrate the effectiveness of the proposed strategy.\u0000The project video is available at https://github.com/lancexz/sis.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ruihan Xu, Anthony Opipari, Joshua Mah, Stanley Lewis, Haoran Zhang, Hanzhe Guo, Odest Chadwicke Jenkins
This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs), an approach for SO(2)-equivariant 3D object reconstruction from single-view image observations. GSNs take a single observation as input and generate a Gaussian splat representation describing the observed object's geometry and texture. By using a shared feature extractor before decoding Gaussian colors, covariances, positions, and opacities, GSNs achieve extremely high throughput (>150 FPS). Experiments demonstrate that GSNs can be trained efficiently using a multi-view rendering loss and are competitive in quality with expensive diffusion-based reconstruction algorithms. The GSN model is validated in multiple benchmark experiments. Moreover, we demonstrate the potential for GSNs to be used within a robotic manipulation pipeline for object-centric grasping.
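As a rough picture of the described architecture, the sketch below uses one shared feature extractor followed by separate decoders for Gaussian positions, covariance parameters (parameterized here as scales plus rotation quaternions), colors, and opacities. Layer sizes, the number of Gaussians, and the parameterization are hypothetical, not the paper's actual design.

```python
import torch
import torch.nn as nn

class TinyGSN(nn.Module):
    """Minimal sketch: a shared image encoder with separate heads for each
    group of Gaussian-splat parameters."""

    def __init__(self, n_gaussians=1024, feat_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(                      # shared feature extractor
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(64 * 16, feat_dim), nn.ReLU(),
        )
        self.n = n_gaussians
        self.positions = nn.Linear(feat_dim, n_gaussians * 3)
        self.scales = nn.Linear(feat_dim, n_gaussians * 3)
        self.rotations = nn.Linear(feat_dim, n_gaussians * 4)   # quaternions
        self.colors = nn.Linear(feat_dim, n_gaussians * 3)
        self.opacities = nn.Linear(feat_dim, n_gaussians)

    def forward(self, image):                               # image: (B, 3, H, W)
        f = self.encoder(image)
        B = f.shape[0]
        return {
            "positions": self.positions(f).view(B, self.n, 3),
            "scales": torch.exp(self.scales(f)).view(B, self.n, 3),
            "rotations": torch.nn.functional.normalize(
                self.rotations(f).view(B, self.n, 4), dim=-1),
            "colors": torch.sigmoid(self.colors(f)).view(B, self.n, 3),
            "opacities": torch.sigmoid(self.opacities(f)).view(B, self.n, 1),
        }
```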
{"title":"Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks","authors":"Ruihan Xu, Anthony Opipari, Joshua Mah, Stanley Lewis, Haoran Zhang, Hanzhe Guo, Odest Chadwicke Jenkins","doi":"arxiv-2409.07245","DOIUrl":"https://doi.org/arxiv-2409.07245","url":null,"abstract":"This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs) as\u0000an approach for SO(2)-Equivariant 3D object reconstruction from single-view\u0000image observations. GSNs take a single observation as input to generate a Gaussian splat\u0000representation describing the observed object's geometry and texture. By using\u0000a shared feature extractor before decoding Gaussian colors, covariances,\u0000positions, and opacities, GSNs achieve extremely high throughput (>150FPS).\u0000Experiments demonstrate that GSNs can be trained efficiently using a multi-view\u0000rendering loss and are competitive, in quality, with expensive diffusion-based\u0000reconstruction algorithms. The GSN model is validated on multiple benchmark\u0000experiments. Moreover, we demonstrate the potential for GSNs to be used within\u0000a robotic manipulation pipeline for object-centric grasping.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142222162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}