{"title":"释放机器人的自主性:基础模型应用调查","authors":"Dae-Sung Jang, Doo-Hyun Cho, Woo-Cheol Lee, Seung-Keol Ryu, Byeongmin Jeong, Minji Hong, Minjo Jung, Minchae Kim, Minjoon Lee, SeungJae Lee, Han-Lim Choi","doi":"10.1007/s12555-024-0438-7","DOIUrl":null,"url":null,"abstract":"<p>The advancement of foundation models, such as large language models (LLMs), vision-language models (VLMs), diffusion models, and robotics foundation models (RFMs), has become a new paradigm in robotics by offering innovative approaches to the long-standing challenge of building robot autonomy. These models enable the development of robotic agents that can independently understand and reason about semantic contexts, plan actions, physically interact with surroundings, and adapt to new environments and untrained tasks. This paper presents a comprehensive and systematic survey of recent advancements in applying foundation models to robot perception, planning, and control. It introduces the key concepts and terminology associated with foundation models, providing a clear understanding for researchers in robotics and control engineering. The relevant studies are categorized based on how foundation models are utilized in various elements of robotic autonomy, focusing on 1) perception and situational awareness: object detection and classification, semantic understanding, mapping, and navigation; 2) decision making and task planning: mission understanding, task decomposition and coordination, planning with symbolic and learning-based approaches, plan validation and correction, and LLM-robot interaction; 3) motion planning and control: motion planning, control command and reward generation, and trajectory generation and optimization with diffusion models. Furthermore, the survey covers essential environmental setups, including real-world and simulation datasets and platforms used in training and validating these models. It concludes with a discussion on current challenges such as robustness, explainability, data scarcity, and real-time performance, and highlights promising future directions, including retrieval augmented generation, on-device foundation models, and explainability. This survey aims to systematically summarize the latest research trends in applying foundation models to robotics, bridging the gap between the state-of-the-art in artificial intelligence and robotics. By sharing knowledge and resources, this survey is expected to foster the introduction of a new research paradigm for building generalized and autonomous robots.</p>","PeriodicalId":54965,"journal":{"name":"International Journal of Control Automation and Systems","volume":"174 1","pages":""},"PeriodicalIF":2.5000,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unlocking Robotic Autonomy: A Survey on the Applications of Foundation Models\",\"authors\":\"Dae-Sung Jang, Doo-Hyun Cho, Woo-Cheol Lee, Seung-Keol Ryu, Byeongmin Jeong, Minji Hong, Minjo Jung, Minchae Kim, Minjoon Lee, SeungJae Lee, Han-Lim Choi\",\"doi\":\"10.1007/s12555-024-0438-7\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The advancement of foundation models, such as large language models (LLMs), vision-language models (VLMs), diffusion models, and robotics foundation models (RFMs), has become a new paradigm in robotics by offering innovative approaches to the long-standing challenge of building robot autonomy. These models enable the development of robotic agents that can independently understand and reason about semantic contexts, plan actions, physically interact with surroundings, and adapt to new environments and untrained tasks. This paper presents a comprehensive and systematic survey of recent advancements in applying foundation models to robot perception, planning, and control. It introduces the key concepts and terminology associated with foundation models, providing a clear understanding for researchers in robotics and control engineering. The relevant studies are categorized based on how foundation models are utilized in various elements of robotic autonomy, focusing on 1) perception and situational awareness: object detection and classification, semantic understanding, mapping, and navigation; 2) decision making and task planning: mission understanding, task decomposition and coordination, planning with symbolic and learning-based approaches, plan validation and correction, and LLM-robot interaction; 3) motion planning and control: motion planning, control command and reward generation, and trajectory generation and optimization with diffusion models. Furthermore, the survey covers essential environmental setups, including real-world and simulation datasets and platforms used in training and validating these models. It concludes with a discussion on current challenges such as robustness, explainability, data scarcity, and real-time performance, and highlights promising future directions, including retrieval augmented generation, on-device foundation models, and explainability. This survey aims to systematically summarize the latest research trends in applying foundation models to robotics, bridging the gap between the state-of-the-art in artificial intelligence and robotics. By sharing knowledge and resources, this survey is expected to foster the introduction of a new research paradigm for building generalized and autonomous robots.</p>\",\"PeriodicalId\":54965,\"journal\":{\"name\":\"International Journal of Control Automation and Systems\",\"volume\":\"174 1\",\"pages\":\"\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2024-08-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Control Automation and Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s12555-024-0438-7\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Control Automation and Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12555-024-0438-7","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Unlocking Robotic Autonomy: A Survey on the Applications of Foundation Models
The advancement of foundation models, such as large language models (LLMs), vision-language models (VLMs), diffusion models, and robotics foundation models (RFMs), has become a new paradigm in robotics by offering innovative approaches to the long-standing challenge of building robot autonomy. These models enable the development of robotic agents that can independently understand and reason about semantic contexts, plan actions, physically interact with surroundings, and adapt to new environments and untrained tasks. This paper presents a comprehensive and systematic survey of recent advancements in applying foundation models to robot perception, planning, and control. It introduces the key concepts and terminology associated with foundation models, providing a clear understanding for researchers in robotics and control engineering. The relevant studies are categorized based on how foundation models are utilized in various elements of robotic autonomy, focusing on 1) perception and situational awareness: object detection and classification, semantic understanding, mapping, and navigation; 2) decision making and task planning: mission understanding, task decomposition and coordination, planning with symbolic and learning-based approaches, plan validation and correction, and LLM-robot interaction; 3) motion planning and control: motion planning, control command and reward generation, and trajectory generation and optimization with diffusion models. Furthermore, the survey covers essential environmental setups, including real-world and simulation datasets and platforms used in training and validating these models. It concludes with a discussion on current challenges such as robustness, explainability, data scarcity, and real-time performance, and highlights promising future directions, including retrieval augmented generation, on-device foundation models, and explainability. This survey aims to systematically summarize the latest research trends in applying foundation models to robotics, bridging the gap between the state-of-the-art in artificial intelligence and robotics. By sharing knowledge and resources, this survey is expected to foster the introduction of a new research paradigm for building generalized and autonomous robots.
期刊介绍:
International Journal of Control, Automation and Systems is a joint publication of the Institute of Control, Robotics and Systems (ICROS) and the Korean Institute of Electrical Engineers (KIEE).
The journal covers three closly-related research areas including control, automation, and systems.
The technical areas include
Control Theory
Control Applications
Robotics and Automation
Intelligent and Information Systems
The Journal addresses research areas focused on control, automation, and systems in electrical, mechanical, aerospace, chemical, and industrial engineering in order to create a strong synergy effect throughout the interdisciplinary research areas.