
Latest publications from Robotics: Science and Systems XIX

Fast Monocular Visual-Inertial Initialization Leveraging Learned Single-View Depth
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.072
Nate Merrill, Patrick Geneva, Saimouli Katragadda, Chuchu Chen, G. Huang
In monocular visual-inertial navigation systems, it is ideal to initialize as quickly and robustly as possible. State-of-the-art initialization methods typically make linear approximations using the image features and inertial information in order to initialize in closed form, and then refine the states with a nonlinear optimization. While the standard methods typically wait for a 2-second data window, recent work has shown that it is possible to initialize faster (0.5 s) by adding constraints from a robust but only up-to-scale monocular depth network in the nonlinear optimization. To further expedite the initialization, in this work we instead leverage the scale-less depth measurements in the linear initialization step that is performed prior to the nonlinear one, which requires only a single depth image for the first frame. We show that the typical independent estimation of each feature state in the closed-form solution can be replaced by estimating just the scale and offset parameters of the learned depth map. Interestingly, our formulation makes it possible to construct small minimal problems in a RANSAC loop, whereas the typical linear system's minimal problem is quite large and includes every feature state. Experiments show that our method improves overall initialization performance on popular public datasets (EuRoC MAV and TUM-VI) over state-of-the-art methods. For the TUM-VI dataset, we show superior initialization performance with only a 0.3-second window of data, the smallest ever reported, and show that our method can initialize more often, more robustly, and more accurately in different challenging scenarios.
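The scale-and-offset idea above can be illustrated with a small numerical sketch: because the learned depth map is correct only up to an affine transform, two parameters suffice, so the RANSAC minimal problem needs only two samples rather than one unknown per feature. The function names, thresholds, and data below are hypothetical, not the paper's implementation.

```python
import numpy as np

def fit_scale_offset(d_net, d_obs):
    # Least-squares fit of d_obs ≈ s * d_net + o (only 2 unknowns).
    A = np.stack([d_net, np.ones_like(d_net)], axis=1)
    s, o = np.linalg.lstsq(A, d_obs, rcond=None)[0]
    return s, o

def ransac_scale_offset(d_net, d_obs, iters=200, thresh=0.1, rng=None):
    # Minimal problem: 2 samples define (s, o) exactly, so the RANSAC
    # loop stays cheap compared to a full per-feature linear system.
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(d_net), dtype=bool)
    for _ in range(iters):
        i, j = rng.choice(len(d_net), size=2, replace=False)
        if np.isclose(d_net[i], d_net[j]):
            continue  # degenerate pair, cannot solve for the scale
        s = (d_obs[i] - d_obs[j]) / (d_net[i] - d_net[j])
        o = d_obs[i] - s * d_net[i]
        inliers = np.abs(s * d_net + o - d_obs) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set for the final estimate.
    return fit_scale_offset(d_net[best_inliers], d_obs[best_inliers])
```

With synthetic correspondences contaminated by outliers, the consensus fit recovers the affine parameters that align the network depth to metric scale.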
Citations: 1
G*: A New Approach to Bounding Curvature Constrained Shortest Paths through Dubins Gates
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.059
S. Manyam, Abhishek Nayak, S. Rathinam
We consider a Curvature-constrained Shortest Path (CSP) problem on a 2D plane for a robot with a minimum-turning-radius constraint in the presence of obstacles. We introduce a new bounding technique called Gate* (G*) that provides optimality guarantees for the CSP. Our approach relies on relaxing the obstacle-avoidance constraints, but allows a path to travel only through restricted sets of configurations, called gates, which are informed by the obstacles. We also allow the path to be discontinuous when it reaches a gate. This approach lets us pose the bounding problem as a least-cost problem in a graph, where the cost of traveling an edge requires solving a new motion-planning problem called the Dubins gate problem. In addition to the theoretical results, our numerical tests show that G* can significantly improve the lower bounds with respect to the baseline approaches, by more than 60% in some instances.
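The least-cost-in-a-graph reduction can be sketched as a plain Dijkstra search over a start node, gate nodes, and a goal. The edge cost here uses straight-line distance as a stand-in for the Dubins gate cost (a valid but loose lower bound, since a Dubins path between two points is at least as long as the segment joining them); the graph, names, and coordinates are illustrative, not the paper's construction.

```python
import heapq
import numpy as np

def lower_bound_cost(p, q):
    # Stand-in edge cost: Euclidean distance lower-bounds the length of
    # any curvature-constrained (Dubins) path between the two points.
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))

def cheapest_gate_path(nodes, edges, start, goal):
    # Dijkstra over the gate graph: nodes maps name -> (x, y),
    # edges maps name -> list of successor names.
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v in edges[u]:
            nd = d + lower_bound_cost(nodes[u], nodes[v])
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")
```

Replacing `lower_bound_cost` with the solution of the Dubins gate problem on each edge is what tightens this bound in the actual method.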
Citations: 0
Investigating the Impact of Experience on a User's Ability to Perform Hierarchical Abstraction
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.004
Nina Moorman, N. Gopalan, Aman Singh, Erin Botti, Mariah L. Schrum, Chuxuan Yang, Lakshmi Seelam, M. Gombolay
The field of Learning from Demonstration enables end-users who are not robotics experts to shape robot behavior. However, using human demonstrations to teach robots to solve long-horizon problems by leveraging the hierarchical structure of the task is still an unsolved problem. Prior work has yet to show that human users can provide sufficient demonstrations in novel domains without being shown explicit teaching strategies for each domain. In this work, we investigate whether non-expert demonstrators can generalize robot teaching strategies to provide necessary and sufficient demonstrations to robots, zero-shot, in novel domains. We find that increasing participants' experience with providing demonstrations improves their demonstrations' degree of sub-task abstraction (p < .001), teaching efficiency (p < .001), and sub-task redundancy (p < .05) in novel domains, allowing generalization in robot teaching. Our findings demonstrate for the first time that non-expert demonstrators can transfer knowledge from a series of training experiences to novel domains without the need for explicit instruction, such that they can provide necessary and sufficient demonstrations when programming robots to complete task- and motion-planning problems.
Citations: 0
Follow my Advice: Assume-Guarantee Approach to Task Planning with Human in the Loop
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.001
Georg Friedrich Schuppe, Ilaria Torre, Iolanda Leite, Jana Tumova
We focus on correct-by-design robot task planning from finite Linear Temporal Logic (LTLf) specifications with a human in the loop. Since provable guarantees are difficult to obtain unconditionally, we take an assume-guarantee perspective. Along with guarantees on the robot's task satisfaction, we compute the weakest sufficient assumptions on the human's behavior. We approach the problem via a stochastic game and leverage algorithmic synthesis of the weakest sufficient assumptions. We turn the assumptions into runtime advice to be communicated to the human. We conducted an online user study and showed that the robot is perceived as safer, more intelligent, and more compliant with our approach than a robot giving more frequent advice corresponding to stronger assumptions. In addition, we show that our approach leads to fewer violations of the specification than not communicating with the participant at all.
Citations: 0
Demonstrating Mobile Manipulation in the Wild: A Metrics-Driven Approach
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.055
M. Bajracharya, James Borders, Richard Cheng, D. Helmick, Lukas Kaul, Daniel Kruse, John Leichty, Jeremy Ma, Carolyn Matl, Frank Michel, Chavdar Papazov, Josh Petersen, K. Shankar, Mark Tjersland
We present our general-purpose mobile manipulation system consisting of a custom robot platform and key algorithms spanning perception and planning. To extensively test the system in the wild and benchmark its performance, we choose a grocery shopping scenario in an actual, unmodified grocery store. We derive key performance metrics from detailed robot log data collected during six week-long field tests spread across 18 months. These objective metrics, gained from complex yet repeatable tests, drive the direction of our research efforts and let us continuously improve our system's performance. We find that thorough end-to-end, system-level testing of a complex mobile manipulation system can serve as a reality check for state-of-the-art methods in robotics. This effectively grounds robotics research efforts in real-world needs and challenges, which we deem highly useful for the advancement of the field. To this end, we share our key insights and takeaways to inspire and accelerate similar system-level research projects.
Citations: 1
Solving Stabilize-Avoid via Epigraph Form Optimal Control using Deep Reinforcement Learning
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.085
Oswin So, Chuchu Fan
Citations: 1
Autonomous Justification for Enabling Explainable Decision Support in Human-Robot Teaming
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.002
Matthew B. Luebbers, Aaquib Tabrez, K. Ruvane, Bradley Hayes
Justification is an important facet of policy explanation, a process for describing the behavior of an autonomous system. In human-robot collaboration, an autonomous agent can attempt to justify distinctly important decisions by offering explanations as to why those decisions are right or reasonable, leveraging a snapshot of its internal reasoning to do so. Without sufficient insight into a robot's decision-making process, it becomes challenging for users to trust or comply with those important decisions, especially when they are viewed as confusing or contrary to the user's expectations (e.g., when decisions change as new information is introduced to the agent's decision-making process). In this work we characterize the benefits of justification within the context of decision support during human-robot teaming (i.e., agents giving recommendations to human teammates). We introduce a formal framework using value-of-information theory to strategically time justifications during periods of misaligned expectations for greater effect. We also characterize four different types of counterfactual justification derived from established explainable AI literature and evaluate them against each other in a human-subjects study involving a collaborative, partially observable search task. Based on our findings, we present takeaways on the effective use of different types of justifications in human-robot teaming scenarios, to improve user compliance and decision-making by strategically influencing human teammates' thinking patterns. Finally, we present an augmented reality system incorporating these findings into a real-world decision-support system for human-robot teaming.
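The timing criterion can be sketched with a toy value-of-information computation: compare the expected utility of acting on the prior belief alone against acting with the state revealed, and justify only when the gap exceeds the cost of interrupting the teammate. This two-state, two-action example is illustrative, not the paper's formulation.

```python
import numpy as np

def value_of_information(prior, utilities):
    # prior[s]: belief over world states.
    # utilities[a, s]: utility of taking action a when the true state is s.
    ev_without = (utilities @ prior).max()           # best action on the prior alone
    ev_with = (prior * utilities.max(axis=0)).sum()  # best action once the state is known
    return ev_with - ev_without                      # always >= 0

def should_justify(prior, utilities, interruption_cost):
    # Illustrative policy: communicate only when the expected benefit of
    # aligning the teammate's belief outweighs the cost of interrupting.
    return value_of_information(prior, utilities) > interruption_cost
```

When the prior already pins down the state, the value of information collapses to zero and the agent stays silent; under maximal uncertainty it peaks, which is exactly the "misaligned expectations" regime the abstract targets.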
Citations: 0
Autonomous Navigation, Mapping and Exploration with Gaussian Processes
Pub Date : 2023-07-10 DOI: 10.15607/RSS.2023.XIX.104
Mahmoud Ali, Hassan Jardali, N. Roy, Lantao Liu
Navigating and exploring an unknown environment is a challenging task for autonomous robots, especially in complex and unstructured environments. We propose a new framework that can simultaneously accomplish multiple objectives essential to robot autonomy, including identifying free space for navigation, building a metric-topological representation for mapping, and ensuring good spatial coverage for unknown-space exploration. Different from existing work that models these critical objectives separately, we show that navigation, mapping, and exploration can be derived from the same foundation, modeled with a sparse variant of a Gaussian process. Specifically, in our framework the robot navigates by following frontiers computed from a local Gaussian process perception model, and along the way builds a map in metric-topological form where nodes are adaptively selected from important perception frontiers. The topology expands towards unexplored areas by assessing a low-cost global uncertainty map, also computed from a sparse Gaussian process. Through evaluations in various cluttered and unstructured environments, we validate that the proposed framework can explore unknown environments faster and with a shorter distance travelled than state-of-the-art frontier exploration approaches. Through field demonstration, we have begun to lay the groundwork for field robots to explore challenging environments, such as forests, that humans have yet to set foot in.
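The uncertainty-driven frontier selection can be sketched with a plain (dense) GP: the next frontier is the candidate location with the highest predictive variance, i.e. the place the model knows least about. The sketch below uses an RBF kernel with unit prior variance; the paper's sparse variant and its perception model are not reproduced, and all names are illustrative.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel between two sets of 2D points.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_variance(X_train, X_query, ell=1.0, noise=1e-3):
    # Standard GP predictive variance (diagonal only):
    # var(x) = k(x, x) - k_*(x) K^{-1} k_*(x)^T, with k(x, x) = 1 here.
    K = rbf(X_train, X_train, ell) + noise * np.eye(len(X_train))
    Ks = rbf(X_query, X_train, ell)
    v = np.linalg.solve(K, Ks.T)
    return 1.0 - (Ks * v.T).sum(axis=1)

def pick_frontier(X_train, X_query, **kw):
    # Frontier selection: go where predictive uncertainty is largest.
    return X_query[np.argmax(gp_variance(X_train, X_query, **kw))]
```

Observed points pull the variance toward the noise floor nearby, so far-away candidates keep variance near the prior and win the argmax, which is the mechanism that makes the topology expand toward unexplored areas.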
Citations: 0
Demonstrating Large-Scale Package Manipulation via Learned Metrics of Pick Success
Pub Date : 2023-05-17 DOI: 10.15607/RSS.2023.XIX.023
Shuai-Peng Li, Azarakhsh Keipour, Kevin G. Jamieson, Nicolas Hudson, Charles Swan, Kostas E. Bekris
Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing resiliency to workforce fluctuations. The past few years have seen increased interest in automating such repeated tasks, but mostly in controlled settings. Tasks such as picking objects from unstructured, cluttered piles have only recently become robust enough for large-scale deployment with minimal human intervention. This paper demonstrates large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which utilizes a pick success predictor trained on real production data. Specifically, the system was trained on over 394K picks. It is used for singulating up to 5 million packages per day and has manipulated over 200 million packages during this paper's evaluation period. The developed learned pick quality measure ranks various pick alternatives in real time and prioritizes the most promising ones for execution. The pick success predictor aims to estimate from prior experience the success probability of a desired pick by the deployed industrial robotic arms in cluttered scenes containing deformable and rigid objects with partially known properties. It is a shallow machine learning model, which allows us to evaluate which features are most important for the prediction. An online pick ranker leverages the learned success predictor to prioritize the most promising picks for the robotic arm, which are then assessed for collision avoidance. This learned ranking process is demonstrated to overcome the limitations of, and outperform, manually engineered and heuristic alternatives. To the best of the authors' knowledge, this paper presents the first large-scale deployment of learned pick quality estimation methods in a real production system.
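A shallow predict-then-rank pipeline of the kind described can be sketched as a logistic scorer over pick features followed by a sort on predicted success probability. The weights and features here are invented for illustration; this is not Robin's production model, and one advantage of such a shallow form is that the learned weights directly expose feature importance.

```python
import numpy as np

def pick_success_prob(features, w, b):
    # Shallow (logistic) pick-success model: features is (n_picks, n_features).
    # The weight vector w is directly inspectable for feature importance.
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))

def rank_picks(candidates, w, b):
    # Online ranker: return candidate indices ordered by predicted
    # success probability, best first; downstream stages (e.g. collision
    # checking) would then walk this list in order.
    return np.argsort(-pick_success_prob(candidates, w, b))
```

Because the logistic function is monotone, the ranking depends only on the linear scores, so a heuristic tie-breaker or collision filter can be layered on top without retraining.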
Demonstrating Large-Scale Package Manipulation via Learned Metrics of Pick Success
Pub Date : 2023-05-17 DOI: 10.15607/RSS.2023.XIX.023
Shuai-Peng Li, Azarakhsh Keipour, Kevin G. Jamieson, Nicolas Hudson, Charles Swan, Kostas E. Bekris
Citations: 3
Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement
Pub Date : 2023-04-27 DOI: 10.15607/RSS.2023.XIX.030
N. Gkanatsios, Ayush Jain, Zhou Xian, Yunchu Zhang, C. Atkeson, Katerina Fragkiadaki
Language is compositional; an instruction can express multiple relation constraints to hold among objects in a scene that a robot is tasked to rearrange. Our focus in this work is an instructable scene-rearranging framework that generalizes to longer instructions and to spatial concept compositions never seen at training time. We propose to represent language-instructed spatial concepts with energy functions over relative object arrangements. A language parser maps instructions to corresponding energy functions and an open-vocabulary visual-language model grounds their arguments to relevant objects in the scene. We generate goal scene configurations by gradient descent on the sum of energy functions, one per language predicate in the instruction. Local vision-based policies then re-locate objects to the inferred goal locations. We test our model on established instruction-guided manipulation benchmarks, as well as benchmarks of compositional instructions we introduce. We show our model can execute highly compositional instructions zero-shot in simulation and in the real world. It outperforms language-to-action reactive policies and Large Language Model planners by a large margin, especially for long instructions that involve compositions of multiple spatial concepts. Simulation and real-world robot execution videos, as well as our code and datasets are publicly available on our website: https://ebmplanner.github.io.
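The core mechanism the abstract describes (compiling each language predicate to an energy function over object arrangements, then running gradient descent on the sum of energies to obtain a goal configuration) could be sketched as below. The specific energy definitions (a "left-of" and a "near" relation), the numerical gradient, and the step sizes are assumptions for illustration, not the paper's formulation:

```python
# Illustrative sketch: each spatial predicate becomes an energy function
# over 2D object positions; the goal arrangement is found by gradient
# descent on the sum of energies. Energies and hyperparameters assumed.
import numpy as np


def left_of_energy(pos, i, j, margin=1.0):
    # Low energy when object i is at least `margin` to the left of j on x.
    return max(0.0, pos[i][0] - pos[j][0] + margin) ** 2


def near_energy(pos, i, j, dist=0.5):
    # Low energy when objects i and j are about `dist` apart.
    return (np.linalg.norm(pos[i] - pos[j]) - dist) ** 2


def total_energy(pos, constraints):
    # Sum of one energy term per language predicate in the instruction.
    return sum(fn(pos, *args) for fn, *args in constraints)


def numeric_grad(pos, constraints, eps=1e-4):
    # Central finite differences over every coordinate of every object.
    g = np.zeros_like(pos)
    for idx in np.ndindex(pos.shape):
        p, m = pos.copy(), pos.copy()
        p[idx] += eps
        m[idx] -= eps
        g[idx] = (total_energy(p, constraints) - total_energy(m, constraints)) / (2 * eps)
    return g


def descend(pos, constraints, lr=0.1, steps=500):
    # Gradient descent on the summed energies to infer goal positions.
    pos = pos.copy()
    for _ in range(steps):
        pos -= lr * numeric_grad(pos, constraints)
    return pos


# Instruction "put A left of B, and B near C" compiled to two energies:
start = np.array([[1.0, 0.0], [0.0, 0.0], [2.0, 2.0]])  # A, B, C
constraints = [(left_of_energy, 0, 1), (near_energy, 1, 2)]
goal = descend(start, constraints)
```

Because the objective is just a sum, adding another predicate means appending another energy term, which is what makes the composition zero-shot: predicates trained or defined independently combine at inference time without retraining.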
Citations: 13