Unlocking Robotic Autonomy: A Survey on the Applications of Foundation Models

International Journal of Control Automation and Systems (IF 2.5; Zone 3, Computer Science; JCR Q2, Automation & Control Systems). Pub Date: 2024-08-02. DOI: 10.1007/s12555-024-0438-7

Dae-Sung Jang, Doo-Hyun Cho, Woo-Cheol Lee, Seung-Keol Ryu, Byeongmin Jeong, Minji Hong, Minjo Jung, Minchae Kim, Minjoon Lee, SeungJae Lee, Han-Lim Choi

Citations: 0

Abstract

The advancement of foundation models, such as large language models (LLMs), vision-language models (VLMs), diffusion models, and robotics foundation models (RFMs), has become a new paradigm in robotics by offering innovative approaches to the long-standing challenge of building robot autonomy. These models enable the development of robotic agents that can independently understand and reason about semantic contexts, plan actions, physically interact with surroundings, and adapt to new environments and untrained tasks. This paper presents a comprehensive and systematic survey of recent advancements in applying foundation models to robot perception, planning, and control. It introduces the key concepts and terminology associated with foundation models, providing a clear understanding for researchers in robotics and control engineering. The relevant studies are categorized based on how foundation models are utilized in various elements of robotic autonomy, focusing on 1) perception and situational awareness: object detection and classification, semantic understanding, mapping, and navigation; 2) decision making and task planning: mission understanding, task decomposition and coordination, planning with symbolic and learning-based approaches, plan validation and correction, and LLM-robot interaction; 3) motion planning and control: motion planning, control command and reward generation, and trajectory generation and optimization with diffusion models. Furthermore, the survey covers essential environmental setups, including real-world and simulation datasets and platforms used in training and validating these models. It concludes with a discussion on current challenges such as robustness, explainability, data scarcity, and real-time performance, and highlights promising future directions, including retrieval augmented generation, on-device foundation models, and explainability. 
This survey aims to systematically summarize the latest research trends in applying foundation models to robotics, bridging the gap between the state-of-the-art in artificial intelligence and robotics. By sharing knowledge and resources, this survey is expected to foster the introduction of a new research paradigm for building generalized and autonomous robots.

Source journal

International Journal of Control Automation and Systems (Engineering & Technology: Automation and Control Systems)

CiteScore: 5.80
Self-citation rate: 21.90%
Articles published: 343
Review time: 8.7 months

About the journal: International Journal of Control, Automation and Systems is a joint publication of the Institute of Control, Robotics and Systems (ICROS) and the Korean Institute of Electrical Engineers (KIEE). The journal covers three closely related research areas: control, automation, and systems. The technical areas include control theory, control applications, robotics and automation, and intelligent and information systems. The journal addresses research focused on control, automation, and systems in electrical, mechanical, aerospace, chemical, and industrial engineering, with the aim of creating strong synergy across these interdisciplinary research areas.