
Latest publications from the Conference on Robot Learning

Learning Visualization Policies of Augmented Reality for Human-Robot Collaboration
Pub Date: 2022-11-13 DOI: 10.48550/arXiv.2211.07028
Kishan Chandan, Jack Albertson, Shiqi Zhang
In human-robot collaboration domains, augmented reality (AR) technologies have enabled people to visualize the state of robots. Current AR-based visualization policies are designed manually, which requires considerable human effort and domain knowledge. When too little information is visualized, human users find the AR interface not useful; when too much information is visualized, they find it difficult to process. In this paper, we develop a framework, called VARIL, that enables AR agents to learn visualization policies (what to visualize, when, and how) from demonstrations. We created a Unity-based platform for simulating warehouse environments where human-robot teammates collaborate on delivery tasks. We collected a dataset that includes demonstrations of visualizing robots' current and planned behaviors. Results from experiments with real human participants show that, compared with competitive baselines from the literature, our learned visualization strategies significantly increase the efficiency of human-robot teams while reducing the distraction level of human users. VARIL has been demonstrated in a mock warehouse built in our lab.
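Learning a visualization policy from demonstrations is, at its core, supervised imitation: map the observed team state to the annotator's visualization choice. Below is a minimal behavior-cloning sketch under assumed interfaces; the feature dimension, the three visualization actions, and the random stand-in data are all illustrative, not VARIL's actual design.

import torch
import torch.nn as nn

TEAM_STATE_DIM = 16   # assumed feature vector of robot + human state
VIS_ACTIONS = 3       # assumed choices, e.g. hide / current action / full plan

policy = nn.Sequential(
    nn.Linear(TEAM_STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, VIS_ACTIONS),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in demonstrations: (team state, demonstrator's visualization choice).
states = torch.randn(256, TEAM_STATE_DIM)
labels = torch.randint(0, VIS_ACTIONS, (256,))

for _ in range(100):              # imitate the demonstrated choices
    loss = loss_fn(policy(states), labels)
    opt.zero_grad(); loss.backward(); opt.step()

At deployment, the argmax over the logits would pick which overlay to render; the "when and how" components would add timing and rendering heads on the same backbone.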
Citations: 1
Learning Riemannian Stable Dynamical Systems via Diffeomorphisms
Pub Date: 2022-11-06 DOI: 10.48550/arXiv.2211.03169
Jiechao Zhang, H. Mohammadi, L. Rozo
Dexterous and autonomous robots should be capable of skillfully executing elaborate dynamical motions. Learning techniques may be leveraged to build models of such dynamic skills. To accomplish this, the learning model needs to encode a stable vector field that resembles the desired motion dynamics. This is challenging because the robot state does not evolve on a Euclidean space, so the stability guarantees and vector-field encoding need to account for the geometry arising from, for example, the orientation representation. To tackle this problem, we propose learning Riemannian stable dynamical systems (RSDS) from demonstrations, allowing us to account for different geometric constraints resulting from the dynamical-system state representation. Our approach provides Lyapunov-stability guarantees on Riemannian manifolds that are enforced on the desired motion dynamics via diffeomorphisms built on neural manifold ODEs. We show that our Riemannian approach makes it possible to learn stable dynamical systems displaying complicated vector fields on both illustrative examples and real-world manipulation tasks, where Euclidean approximations fail.
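For intuition, here is the standard Euclidean version of the diffeomorphism construction the abstract alludes to: pull a trivially stable base system back through a learned diffeomorphism phi, and the norm of phi serves as a Lyapunov function. This is only a sketch of the general idea; the paper's contribution is carrying it out on Riemannian manifolds.

% Euclidean sketch: \phi is a learned diffeomorphism with Jacobian J_\phi.
\[
  \dot{x} = f(x) = J_{\phi}(x)^{-1}\bigl(-\phi(x)\bigr),
  \qquad
  V(x) = \tfrac{1}{2}\,\lVert \phi(x) \rVert^{2},
\]
\[
  \dot{V}(x) = \phi(x)^{\top} J_{\phi}(x)\,\dot{x}
             = -\lVert \phi(x) \rVert^{2} \le 0,
\]

with equality only where \phi(x) = 0, i.e. at the target, so the target is globally asymptotically stable no matter how complicated the learned vector field f looks.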
Citations: 2
Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics
Pub Date: 2022-11-04 DOI: 10.48550/arXiv.2211.02231
Krishan Rana, Ming Xu, Brendan Tidd, Michael Milford, N. Sunderhauf
Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning structurally similar tasks to those used to construct the skill space. We first propose accelerating exploration in the skill space using state-conditioned generative models to directly bias the high-level agent towards sampling only skills relevant to a given state, based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation, enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration, outperforming prior works. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
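The two proposed components compose naturally at action-selection time: a state-conditioned prior proposes a skill latent, a frozen decoder turns it into a low-level action, and a small residual policy corrects that action. The sketch below shows this composition with hypothetical network shapes and an assumed residual scale; it illustrates the idea, not the authors' implementation.

import torch
import torch.nn as nn

STATE_DIM, SKILL_DIM, ACTION_DIM = 32, 8, 7   # assumed sizes

# Stand-ins for the pretrained pieces: a state-conditioned skill prior p(z|s)
# and a skill decoder (s, z) -> action, both learned from demonstrations.
skill_prior = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                            nn.Linear(64, 2 * SKILL_DIM))      # mean, log-std
decoder = nn.Sequential(nn.Linear(STATE_DIM + SKILL_DIM, 64), nn.ReLU(),
                        nn.Linear(64, ACTION_DIM))
# Low-level residual policy: a small correction on top of the decoded action.
residual = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                         nn.Linear(64, ACTION_DIM))

def act(state):
    mu, log_std = skill_prior(state).chunk(2, dim=-1)
    z = mu + log_std.exp() * torch.randn_like(mu)   # sample near the prior,
                                                    # i.e. skills relevant here
    a_skill = decoder(torch.cat([state, z], dim=-1))
    delta = residual(torch.cat([state, a_skill], dim=-1))
    return a_skill + 0.1 * delta    # assumed bound on fine-grained adaptation

action = act(torch.randn(1, STATE_DIM))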
Citations: 6
Leveraging Fully Observable Policies for Learning under Partial Observability
Pub Date: 2022-11-03 DOI: 10.48550/arXiv.2211.01991
Hai V. Nguyen, Andrea Baisero, Dian Wang, Chris Amato, Robert W. Platt
Reinforcement learning in partially observable domains is challenging due to the lack of observable state information. Thankfully, learning offline in a simulator with such state information is often possible. In particular, we propose a method for partially observable reinforcement learning that uses a fully observable policy (which we call a state expert) during offline training to improve online performance. Based on Soft Actor-Critic (SAC), our agent balances performing actions similar to the state expert and getting high returns under partial observability. Our approach can leverage the fully observable policy for exploration and for parts of the domain that are fully observable, while still being able to learn under partial observability. On six robotics domains, our method outperforms pure imitation, pure reinforcement learning, the sequential or parallel combination of both types, and a recent state-of-the-art method in the same setting. A successful policy transfer to a physical robot in a pixel-based manipulation task shows our approach's practicality in learning interesting policies under partial observability.
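One hedged way to read "balances performing actions similar to the state expert and getting high returns" is an actor objective with an auxiliary imitation term. The fragment below sketches only that objective; the full SAC machinery (entropy bonus, twin critics, target networks) is omitted, and all dimensions and the trade-off weight are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, STATE_DIM, ACT_DIM = 24, 32, 6   # assumed dimensions

actor = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
critic = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
state_expert = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))

obs = torch.randn(128, OBS_DIM)      # the agent sees partial observations...
state = torch.randn(128, STATE_DIM)  # ...while the simulator exposes full state

a = torch.tanh(actor(obs))
with torch.no_grad():
    a_expert = torch.tanh(state_expert(state))   # privileged expert action

q = critic(torch.cat([obs, a], dim=-1))
lam = 0.5                                        # assumed imitation weight
actor_loss = -q.mean() + lam * F.mse_loss(a, a_expert)   # return vs. imitation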
Citations: 6
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity
Pub Date: 2022-11-02 DOI: 10.48550/arXiv.2211.01500
Wen-Min Zhou, David Held
A simple gripper can solve more complex manipulation tasks if it can utilize the external environment, such as pushing the object against the table or a vertical wall, known as "Extrinsic Dexterity." Previous work in extrinsic dexterity usually has careful assumptions about contacts which impose restrictions on robot design, robot motions, and the variations of the physical parameters. In this work, we develop a system based on reinforcement learning (RL) to address these limitations. We study the task of "Occluded Grasping", which aims to grasp the object in configurations that are initially occluded; the robot needs to move the object into a configuration from which these grasps can be achieved. We present a system with model-free RL that successfully achieves this task using a simple gripper with extrinsic dexterity. The policy learns emergent behaviors of pushing the object against the wall to rotate and then grasp it, without additional reward terms on extrinsic dexterity. We discuss important components of the system including the design of the RL problem, multi-grasp training and selection, and policy generalization with automatic curriculum. Most importantly, the policy trained in simulation is zero-shot transferred to a physical robot. It demonstrates dynamic and contact-rich motions with a simple gripper that generalizes across objects with various size, density, surface friction, and shape with a 78% success rate. Videos can be found at https://sites.google.com/view/grasp-ungraspable/.
Citations: 21
Real-time Mapping of Physical Scene Properties with an Autonomous Robot Experimenter
Pub Date: 2022-10-31 DOI: 10.48550/arXiv.2210.17325
I. Haughton, Edgar Sucar, A. Mouton, Edward Johns, A. Davison
Neural fields can be trained from scratch to represent the shape and appearance of 3D scenes efficiently. It has also been shown that they can densely map correlated properties such as semantics, via sparse interactions from a human labeller. In this work, we show that a robot can densely annotate a scene with arbitrary discrete or continuous physical properties via its own fully autonomous experimental interactions, as it simultaneously scans and maps it with an RGB-D camera. A variety of scene interactions are possible, including poking with force sensing to determine rigidity, measuring local material type with single-pixel spectroscopy, or predicting force distributions by pushing. Sparse experimental interactions are guided by entropy to enable high efficiency, with tabletop scene properties densely mapped from scratch in a few minutes from a few tens of interactions.
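Entropy guidance can be made concrete with a dropout or ensemble property predictor: probe where the averaged prediction is most uncertain. The numpy sketch below uses random stand-in probabilities in place of a trained neural field; the shapes and the Dirichlet stand-in are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)
N, K, C = 200, 10, 4   # candidate points, ensemble members, property classes

# Stand-in for K sampled predictions of a discrete property at each point.
probs = rng.dirichlet(np.ones(C), size=(N, K))        # shape (N, K, C)

mean_p = probs.mean(axis=1)                           # averaged belief per point
entropy = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)

next_point = int(entropy.argmax())   # interact where the map is most uncertain

After each poke or spectroscopy reading, the measurement would be fed back into the map and the entropies recomputed, which is what drives the few-tens-of-interactions efficiency the abstract reports.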
Citations: 1
Adapting Neural Models with Sequential Monte Carlo Dropout
Pub Date: 2022-10-27 DOI: 10.48550/arXiv.2210.15779
Pamela Carreno-Medrano, Dana Kulić, Michael Burke
The ability to adapt to changing environments and settings is essential for robots acting in dynamic and unstructured environments or working alongside humans with varied abilities or preferences. This work introduces an extremely simple and effective approach to adapting neural models in response to changing settings. We first train a standard network using dropout, which is analogous to learning an ensemble of predictive models or distribution over predictions. At run-time, we use a particle filter to maintain a distribution over dropout masks to adapt the neural model to changing settings in an online manner. Experimental results show improved performance in control problems requiring both online and look-ahead prediction, and showcase the interpretability of the inferred masks in a human behaviour modelling task for drone teleoperation.
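The core mechanism is compact enough to sketch end-to-end: treat each dropout mask as a particle indexing a sub-network, reweight particles by how well their sub-network predicts incoming observations, and resample when the weights degenerate. Everything below (the tiny network, the noise scale, the resampling threshold) is an assumed stand-in, not the authors' code.

import numpy as np

rng = np.random.default_rng(1)
N_PARTICLES, HIDDEN, IN_DIM = 100, 32, 4

W1 = rng.normal(size=(HIDDEN, IN_DIM))   # stand-in trained weights
W2 = rng.normal(size=(1, HIDDEN))
masks = rng.binomial(1, 0.5, size=(N_PARTICLES, HIDDEN)).astype(float)
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)
sigma = 0.5                              # assumed observation noise scale

def predict(x, mask):
    h = np.maximum(W1 @ x, 0.0) * mask   # the mask selects one sub-network
    return (W2 @ h)[0]

def smc_update(x, y):
    global masks, weights
    preds = np.array([predict(x, m) for m in masks])
    weights = weights * np.exp(-0.5 * ((y - preds) / sigma) ** 2)  # likelihood
    weights /= weights.sum()
    if 1.0 / (weights ** 2).sum() < N_PARTICLES / 2:   # effective sample size
        idx = rng.choice(N_PARTICLES, N_PARTICLES, p=weights)
        masks = masks[idx]
        weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)

smc_update(rng.normal(size=IN_DIM), y=0.3)   # one online adaptation step

The weighted mixture of surviving masks is the adapted model, and predictions are weighted averages over particles, which is also what makes the inferred masks inspectable.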
Citations: 0
Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation
Pub Date: 2022-10-27 DOI: 10.48550/arXiv.2210.15751
Xingyu Lin, Carl Qi, Yunchu Zhang, Zhiao Huang, Katerina Fragkiadaki, Yunzhu Li, Chuang Gan, David Held
Effective planning of long-horizon deformable object manipulation requires suitable abstractions at both the spatial and temporal levels. Previous methods typically either focus on short-horizon tasks or make strong assumptions that full-state information is available, which prevents their use on deformable objects. In this paper, we propose PlAnning with Spatial-Temporal Abstraction (PASTA), which incorporates both spatial abstraction (reasoning about objects and their relations to each other) and temporal abstraction (reasoning over skills instead of low-level actions). Our framework maps high-dimensional 3D observations such as point clouds into a set of latent vectors and plans over skill sequences on top of the latent set representation. We show that our method can effectively perform challenging sequential deformable object manipulation tasks in the real world, which require combining multiple tool-use skills such as cutting with a knife, pushing with a pusher, and spreading the dough with a roller.
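Planning over a latent representation plus skill sequences can be approximated by simple shooting: roll candidate skill sequences through a learned per-skill latent dynamics model and keep the cheapest one. The sketch below compresses the latent set to a single vector and uses random linear stand-ins for the learned skill dynamics, so every quantity here is an assumption for illustration.

import numpy as np

rng = np.random.default_rng(2)
LATENT, N_SKILLS, HORIZON, N_CAND = 8, 3, 4, 64   # assumed sizes

# Stand-ins for learned per-skill latent dynamics (e.g. cut, push, roll).
A = [rng.normal(scale=0.3, size=(LATENT, LATENT)) for _ in range(N_SKILLS)]

def rollout(z0, seq):
    z = z0
    for s in seq:
        z = z + A[s] @ z           # apply the chosen skill's latent dynamics
    return z

z0, z_goal = rng.normal(size=LATENT), rng.normal(size=LATENT)

candidates = rng.integers(0, N_SKILLS, size=(N_CAND, HORIZON))
costs = [np.linalg.norm(rollout(z0, seq) - z_goal) for seq in candidates]
best_plan = candidates[int(np.argmin(costs))]   # skill sequence to execute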
Citations: 16
Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data
Pub Date: 2022-10-25 DOI: 10.48550/arXiv.2210.14721
John So, Amber Xie, Sunggoo Jung, J. Edlund, Rohan Thakker, Ali-akbar Agha-mohammadi, P. Abbeel, Stephen James
Autonomous driving is complex, requiring sophisticated 3D scene understanding, localization, mapping, and control. Rather than explicitly modelling and fusing each of these components, we instead consider an end-to-end approach via reinforcement learning (RL). However, collecting exploration driving data in the real world is impractical and dangerous. While training in simulation and deploying visual sim-to-real techniques has worked well for robot manipulation, deploying beyond controlled workspace viewpoints remains a challenge. In this paper, we address this challenge by presenting Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving, without using any real-world data. This is done by learning to translate randomized simulation images into simulated segmentation and depth maps, subsequently enabling real-world images to be translated as well. This allows us to train an end-to-end RL policy in simulation and deploy it directly in the real world. Our approach, which can be trained in 48 hours on 1 GPU, performs as well as a classical perception and control stack that took thousands of engineering hours over several months to build. We hope this work motivates future end-to-end autonomous driving research.
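The key interface is that the policy never consumes raw pixels: both simulated and real RGB are first translated into a shared segmentation (and depth) space, and the RL policy reads only that. A minimal torch sketch of the wiring, with assumed image size, class count, and action dimension:

import torch
import torch.nn as nn

C = 6   # assumed number of segmentation classes

translator = nn.Sequential(             # RGB -> per-pixel class logits
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, C, 1),
)
policy = nn.Sequential(                 # consumes segmentation, not pixels
    nn.Flatten(), nn.Linear(C * 64 * 64, 64), nn.ReLU(),
    nn.Linear(64, 2),                   # e.g. steering and throttle (assumed)
)

rgb = torch.randn(1, 3, 64, 64)         # randomized-sim image, or a real one
seg = translator(rgb).softmax(dim=1)    # shared segmentation space
action = policy(seg)

Because real images are translated into the same segmentation space, a policy trained entirely in simulation can run unchanged on the robot, which is the zero-real-data transfer the abstract claims.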
Citations: 1
MidasTouch: Monte-Carlo inference over distributions across sliding touch
Pub Date: 2022-10-25 DOI: 10.48550/arXiv.2210.14210
Sudharshan Suresh, Zilin Si, Stuart Anderson, M. Kaess, Mustafa Mukadam
We present MidasTouch, a tactile perception system for online global localization of a vision-based touch sensor sliding on an object surface. This framework takes in posed tactile images over time, and outputs an evolving distribution of sensor pose on the object's surface, without the need for visual priors. Our key insight is to estimate local surface geometry with tactile sensing, learn a compact representation for it, and disambiguate these signals over a long time horizon. The backbone of MidasTouch is a Monte-Carlo particle filter, with a measurement model based on a tactile code network learned from tactile simulation. This network, inspired by LIDAR place recognition, compactly summarizes local surface geometries. These generated codes are efficiently compared against a precomputed tactile codebook per-object, to update the pose distribution. We further release the YCB-Slide dataset of real-world and simulated forceful sliding interactions between a vision-based tactile sensor and standard YCB objects. While single-touch localization can be inherently ambiguous, we can quickly localize our sensor by traversing salient surface geometries. Project page: https://suddhu.github.io/midastouch-tactile/
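The filtering step itself is easy to picture: each particle is a candidate sensor pose on the surface, carrying the precomputed tactile code for that pose; incoming tactile codes reweight particles by similarity, and sliding accumulates evidence until the belief collapses. The numpy sketch below uses random stand-in codes and an assumed softmax-style temperature; it mimics only the measurement update, not the motion model or the learned code network.

import numpy as np

rng = np.random.default_rng(3)
N_P, D = 500, 64   # particles (candidate surface poses), code dimension

codebook = rng.normal(size=(N_P, D))              # stand-in per-pose codes
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)
weights = np.full(N_P, 1.0 / N_P)

def measurement_update(obs_code, temperature=0.1):
    """Reweight pose particles by observed-vs-precomputed code similarity."""
    global weights
    obs = obs_code / np.linalg.norm(obs_code)
    sim = codebook @ obs                          # cosine similarity per pose
    weights = weights * np.exp(sim / temperature)
    weights /= weights.sum()

measurement_update(rng.normal(size=D))
pose_belief = weights    # the evolving distribution over sensor poses

A single touch typically leaves this distribution multimodal; it is the sequence of updates along a slide, interleaved with a motion model, that disambiguates the pose.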
Citations: 19