Autonomous Robots: latest articles


Learning to summarize and answer questions about a virtual robot’s past actions

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-11-16 DOI: 10.1007/s10514-023-10134-4

Chad DeChant, Iretiayo Akinola, Daniel Bauer

When robots perform long action sequences, users will want to easily and reliably find out what they have done. We therefore demonstrate the task of learning to summarize and answer questions about a robot agent’s past actions using natural language alone. A single system with a large language model at its core is trained to both summarize and answer questions about action sequences given ego-centric video frames of a virtual robot and a question prompt. To enable training of question answering, we develop a method to automatically generate English-language questions and answers about objects, actions, and the temporal order in which actions occurred during episodes of robot action in the virtual environment. Training one model to both summarize and answer questions enables zero-shot transfer of representations of objects learned through question answering to improved action summarization.

Citations: 0

Text2Motion: from natural language instructions to feasible plans

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-11-14 DOI: 10.1007/s10514-023-10131-7

Kevin Lin, Christopher Agia, Toki Migimatsu, Marco Pavone, Jeannette Bohg

We propose Text2Motion, a language-based planning framework enabling robots to solve sequential manipulation tasks that require long-horizon reasoning. Given a natural language instruction, our framework constructs both a task- and motion-level plan that is verified to reach inferred symbolic goals. Text2Motion uses feasibility heuristics encoded in Q-functions of a library of skills to guide task planning with Large Language Models. Whereas previous language-based planners only consider the feasibility of individual skills, Text2Motion actively resolves geometric dependencies spanning skill sequences by performing geometric feasibility planning during its search. We evaluate our method on a suite of problems that require long-horizon reasoning, interpretation of abstract goals, and handling of partial affordance perception. Our experiments show that Text2Motion can solve these challenging problems with a success rate of 82%, while prior state-of-the-art language-based planning methods only achieve 13%. Text2Motion thus provides promising generalization characteristics to semantically diverse sequential manipulation tasks with geometric dependencies between skills. Qualitative results are made available at https://sites.google.com/stanford.edu/text2motion.

Volume 47, issue 8, pages 1345-1365.

Citations: 73

Correction: Efficiently exploring for human robot interaction: partially observable Poisson processes

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-11-11 DOI: 10.1007/s10514-023-10152-2

Ferdian Jovan, Milan Tomy, Nick Hawes, Jeremy Wyatt

Citations: 0

Editor’s note - Special issue on Robot Swarms in the Real World: from Design to Deployment

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-11-09 DOI: 10.1007/s10514-023-10151-3

Citations: 0

SpaTiaL: monitoring and planning of robotic tasks using spatio-temporal logic specifications

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-11-03 DOI: 10.1007/s10514-023-10145-1

Christian Pek, Georg Friedrich Schuppe, Francesco Esposito, Jana Tumova, Danica Kragic

Many tasks require robots to manipulate objects while satisfying a complex interplay of spatial and temporal constraints. For instance, a table-setting robot first needs to place a mug and then fill it with coffee, while satisfying spatial relations such as forks needing to be placed left of plates. We propose the spatio-temporal framework SpaTiaL that unifies the specification, monitoring, and planning of object-oriented robotic tasks in a robot-agnostic fashion. SpaTiaL is able to specify diverse spatial relations between objects and temporal task patterns. Our experiments with recorded data, simulations, and real robots demonstrate how SpaTiaL provides real-time monitoring and facilitates online planning. SpaTiaL is open source and easily expandable to new object relations and robotic applications.
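To make the table-setting example concrete, the following is a minimal Python sketch of one spatial relation ("left of") and one temporal pattern ("first A, then B") checked over a recorded trace. It does not use SpaTiaL's actual API: the `Obj` class, `left_of`, `eventually_then`, and the one-dimensional geometry are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Obj:
    """Hypothetical 1-D object: centre position and width along the table edge."""
    name: str
    x: float
    width: float


def left_of(a: Obj, b: Obj) -> float:
    """Signed margin: positive when a's right edge lies left of b's left edge."""
    return (b.x - b.width / 2) - (a.x + a.width / 2)


State = Dict[str, bool]


def eventually_then(trace: List[State],
                    first: Callable[[State], bool],
                    second: Callable[[State], bool]) -> bool:
    """True if `first` holds at some step and `second` holds at that step or later."""
    for i, state in enumerate(trace):
        if first(state):
            # The earliest step where `first` holds gives the longest suffix to search.
            return any(second(s) for s in trace[i:])
    return False


fork = Obj("fork", x=0.0, width=0.02)
plate = Obj("plate", x=0.2, width=0.2)
trace = [
    {"mug_placed": False, "mug_filled": False},
    {"mug_placed": True, "mug_filled": False},
    {"mug_placed": True, "mug_filled": True},
]
fork_ok = left_of(fork, plate) > 0
order_ok = eventually_then(trace, lambda s: s["mug_placed"], lambda s: s["mug_filled"])
```

Returning a signed margin rather than a boolean is a design choice made here so that a monitor could report how close a relation is to being violated; whether the framework itself does this is not stated in the abstract.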

Citations: 0

Multi-robot geometric task-and-motion planning for collaborative manipulation tasks

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-10-30 DOI: 10.1007/s10514-023-10148-y

Hejia Zhang, Shao-Hung Chan, Jie Zhong, Jiaoyang Li, Peter Kolapo, Sven Koenig, Zach Agioutantis, Steven Schafrik, Stefanos Nikolaidis

We address multi-robot geometric task-and-motion planning (MR-GTAMP) problems in synchronous, monotone setups. The goal of the MR-GTAMP problem is to move objects with multiple robots to goal regions in the presence of other movable objects. We focus on collaborative manipulation tasks where the robots have to adopt intelligent collaboration strategies to be successful and effective, i.e., decide which robot should move which objects to which positions, and perform collaborative actions, such as handovers. To endow robots with these collaboration capabilities, we propose to first collect occlusion and reachability information for each robot by calling motion-planning algorithms. We then propose a method that uses the collected information to build a graph structure which captures the precedence of the manipulations of different objects and supports the implementation of a mixed-integer program to guide the search for highly effective collaborative task-and-motion plans. The search process for collaborative task-and-motion plans is based on a Monte-Carlo Tree Search (MCTS) exploration strategy to achieve exploration-exploitation balance. We evaluate our framework in two challenging MR-GTAMP domains and show that it outperforms two state-of-the-art baselines with respect to the planning time, the resulting plan length and the number of objects moved. We also show that our framework can be applied to underground mining operations where a robotic arm needs to coordinate with an autonomous roof bolter. We demonstrate plan execution in two roof-bolting scenarios both in simulation and on robots.
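The exploration-exploitation balance mentioned for the MCTS search is commonly obtained with the UCB1 selection rule. The sketch below shows that standard rule only; the abstract does not say which selection rule the authors use, and the action names, the `(name, total_value, visits)` tuple layout, and the constant `c` are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Child = Tuple[str, float, int]  # (action name, total value, visit count)


def ucb1(mean_value: float, visits: int, parent_visits: int, c: float) -> float:
    """Standard UCB1 score: mean value plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children: List[Child], c: float = math.sqrt(2)) -> str:
    """Return the name of the child with the highest UCB1 score."""
    parent_visits = sum(visits for _, _, visits in children)

    def score(child: Child) -> float:
        _, total, visits = child
        mean_value = total / visits if visits else 0.0
        return ucb1(mean_value, visits, parent_visits, c)

    return max(children, key=score)[0]


# Hypothetical actions for a two-robot manipulation step.
children = [("robot_a_moves_box", 9.0, 10), ("handover_to_robot_b", 0.5, 1)]
explore = select_child(children)         # bonus favours the rarely tried handover
exploit = select_child(children, c=0.0)  # pure exploitation picks the best mean
```

With `c = 0` the rule degenerates to greedy selection, which is why the two calls above can disagree on the same statistics.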

Volume 47, issue 8, pages 1537-1558. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10514-023-10148-y.pdf

Citations: 0

Unsupervised dissimilarity-based fault detection method for autonomous mobile robots

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-10-28 DOI: 10.1007/s10514-023-10144-2

Mahmut Kasap, Metin Yılmaz, Eyüp Çinar, Ahmet Yazıcı

Autonomous robots are one of the critical components in modern manufacturing systems. For this reason, the uninterrupted operation of robots in manufacturing is important for the sustainability of autonomy. Detecting possible fault symptoms that may cause failures within a work environment will help to eliminate interrupted operations. When supervised learning methods are considered, obtaining and storing labeled, historical training data in a manufacturing environment with faults is a challenging task. In addition, sensors in mobile devices such as robots are exposed to different noisy external conditions in production environments, which affects data labels and fault mapping. Furthermore, relying on data from a single sensor for fault detection often causes false alarms in equipment monitoring. Our study takes these requirements into consideration and proposes a new unsupervised machine-learning algorithm to detect possible operational faults encountered by autonomous mobile robots. The method fuses multi-sensor information at the decision level by ensemble voting to enhance decision reliability. The proposed technique relies on dissimilarity-based sensor data segmentation with an adaptive threshold control. It has been tested experimentally on an autonomous mobile robot. The experimental results show that the proposed method is effective for detecting operational anomalies. Furthermore, the proposed voting mechanism is also capable of eliminating the false positives that arise when only a single source of information is utilized.
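The combination of per-sensor dissimilarity scoring, an adaptive threshold, and decision-level voting can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the dissimilarity measure (absolute difference of consecutive window means), the threshold rule (mean plus `k` standard deviations of past non-anomalous scores), the warm-up length, and all function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def window_dissimilarity(prev: Sequence[float], curr: Sequence[float]) -> float:
    """Assumed dissimilarity: absolute difference between window means."""
    return abs(mean(curr) - mean(prev))


def sensor_flags(signal: Sequence[float], window: int = 4,
                 k: float = 3.0, warmup: int = 3) -> List[bool]:
    """Per-window anomaly flags for a single sensor."""
    windows = [signal[i:i + window]
               for i in range(0, len(signal) - window + 1, window)]
    history: List[float] = []
    flags = [False]  # the first window has no predecessor to compare with
    for prev, curr in zip(windows, windows[1:]):
        d = window_dissimilarity(prev, curr)
        flagged = False
        if len(history) >= warmup:
            # Adaptive threshold derived from past non-anomalous scores.
            flagged = d > mean(history) + k * pstdev(history)
        flags.append(flagged)
        if not flagged:
            history.append(d)
    return flags


def vote(flags_per_sensor: List[List[bool]]) -> List[bool]:
    """Decision-level fusion: flag a window only if a strict majority agrees."""
    n = len(flags_per_sensor)
    return [2 * sum(column) > n for column in zip(*flags_per_sensor)]


steady = [1.0] * 20
jump = [1.0] * 16 + [5.0] * 4
two_agree = vote([sensor_flags(jump), sensor_flags(jump), sensor_flags(steady)])
one_alone = vote([sensor_flags(jump), sensor_flags(steady), sensor_flags(steady)])
```

In the second call a single sensor reports a jump that the other two do not confirm, so the vote suppresses it; this mirrors the abstract's point about false alarms from a single information source.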

Volume 47, issue 8, pages 1503-1518.

Citations: 0

Chasing millimeters: design, navigation and state estimation for precise in-flight marking on ceilings

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-10-26 DOI: 10.1007/s10514-023-10141-5

Christian Lanegger, Michael Pantic, Rik Bähnemann, Roland Siegwart, Lionel Ott

Precise markings for drilling and assembly are crucial, laborious construction tasks. Aerial robots with suitable end-effectors are capable of markings at the millimeter scale. However, so far, they have only been demonstrated under laboratory conditions where rigid state estimation and navigation assumptions do not impede robustness and accuracy. This paper presents a complete aerial layouting system capable of precise markings on-site under realistic conditions. We use a compliant actuated end-effector on an omnidirectional flying base. Combining a two-stage factor-graph state estimator with a Riemannian Motion Policy-based navigation stack, we avoid the need for a globally consistent state estimate and increase robustness. The policy-based navigation is structured into individual behaviors in different state spaces. Through a comprehensive study, we show that the system creates highly precise markings at a relative precision of 1.5 mm and a global accuracy of 5–6 mm and discuss the results in the context of future construction robotics.

Volume 47, issue 8, pages 1405-1418. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10514-023-10141-5.pdf

Citations: 0

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-10-26 DOI: 10.1007/s10514-023-10147-z

Marco Rosano, Antonino Furnari, Luigi Gulino, Corrado Santoro, Giovanni Maria Farinella

Robot visual navigation is a relevant research topic. Current deep navigation models conveniently learn the navigation policies in simulation, given the large amount of experience they need to collect. Unfortunately, the resulting models show a limited generalization ability when deployed in the real world. In this work we explore solutions to facilitate the development of visual navigation policies trained in simulation that can be successfully transferred to the real world. We first propose an efficient evaluation tool to reproduce realistic navigation episodes in simulation. We then investigate a variety of deep fusion architectures to combine a set of mid-level representations, with the aim of finding the best merge strategy that maximizes real-world performance. Our experiments, performed both in simulation and on a robotic platform, show the effectiveness of the considered mid-level representation-based models and confirm the reliability of the evaluation tool. The 3D models of the environment and the code of the validation tool are publicly available at the following link: https://iplab.dmi.unict.it/EmbodiedVN/.

Volume 47, issue 8, pages 1483-1502. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10514-023-10147-z.pdf

Citations: 2

Large language models for chemistry robotics

IF 3.5 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Autonomous Robots

Pub Date : 2023-10-25 DOI: 10.1007/s10514-023-10136-2

Naruki Yoshikawa, Marta Skreta, Kourosh Darvish, Sebastian Arellano-Rubach, Zhi Ji, Lasse Bjørn Kristensen, Andrew Zou Li, Yuchi Zhao, Haoping Xu, Artur Kuramshin, Alán Aspuru-Guzik, Florian Shkurti, Animesh Garg

This paper proposes an approach to automate chemistry experiments using robots by translating natural language instructions into robot-executable plans, using large language models together with task and motion planning. Adding natural language interfaces to autonomous chemistry experiment systems lowers the barrier to using complicated robotics systems and increases utility for non-expert users, but translating natural language experiment descriptions from users into low-level robotics languages is nontrivial. Furthermore, while recent advances have used large language models to generate task plans, reliably executing those plans in the real world by an embodied agent remains challenging. To enable autonomous chemistry experiments and alleviate the workload of chemists, robots must interpret natural language commands, perceive the workspace, autonomously plan multi-step actions and motions, consider safety precautions, and interact with various laboratory equipment. Our approach, CLAIRify, combines automatic iterative prompting with program verification to ensure syntactically valid programs in a data-scarce domain-specific language that incorporates environmental constraints. The generated plan is executed through solving a constrained task and motion planning problem using PDDLStream solvers to prevent spillages of liquids as well as collisions in chemistry labs. We demonstrate the effectiveness of our approach in planning chemistry experiments, with plans successfully executed on a real robot using a repertoire of robot skills and lab tools. Specifically, we showcase the utility of our framework in pouring skills for various materials and two fundamental chemical experiments for materials synthesis: solubility and recrystallization. Further details about CLAIRify can be found at https://ac-rad.github.io/clairify/.
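The idea of pairing iterative prompting with program verification can be pictured as a generate, verify, and retry loop in which verifier errors are fed back to the generator. The sketch below is a generic illustration only, not CLAIRify's implementation: `verify`, `generate_with_verification`, the toy line-based language, and the stand-in `fake_generator` (used here in place of a real language-model call) are all hypothetical.

```python
from typing import Callable, List, Set

Generator = Callable[[str, List[str]], str]  # (instruction, feedback) -> program


def verify(program: str, allowed_ops: Set[str]) -> List[str]:
    """Toy syntax check: every non-empty line must start with an allowed operation."""
    errors = []
    for number, line in enumerate(program.splitlines(), start=1):
        tokens = line.split()
        if tokens and tokens[0] not in allowed_ops:
            errors.append(f"line {number}: unknown operation '{tokens[0]}'")
    return errors


def generate_with_verification(generate: Generator, instruction: str,
                               allowed_ops: Set[str], max_rounds: int = 3) -> str:
    """Re-prompt with verifier errors until the program passes or rounds run out."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        program = generate(instruction, feedback)
        feedback = verify(program, allowed_ops)
        if not feedback:
            return program
    raise ValueError(f"no valid program after {max_rounds} rounds: {feedback}")


def fake_generator(instruction: str, feedback: List[str]) -> str:
    """Stand-in for a language model: fixes its output once it sees feedback."""
    if feedback:
        return "add water\nstir beaker"
    return "pour water\nstir beaker"


plan = generate_with_verification(fake_generator, "dissolve salt in water",
                                  allowed_ops={"add", "stir"})
```

Bounding the loop with `max_rounds` is a design choice made here so that a generator which never converges fails loudly instead of looping forever.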

Volume 47, issue 8, pages 1057-1086. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10514-023-10136-2.pdf

Citations: 0
