Arvind Car, Sai Sravan Yarlagadda, Alison Bartsch, Abraham George, Amir Barati Farimani
arXiv - CS - Robotics, 2024-09-17, arxiv-2409.11580
PLATO: Planning with LLMs and Affordances for Tool Manipulation
As robotic systems become increasingly integrated into complex real-world
environments, there is a growing need for approaches that enable robots to
understand and act upon natural language instructions without relying on
extensive pre-programmed knowledge of their surroundings. This paper presents
PLATO, a system that addresses this challenge by using
specialized large language model agents to process natural language inputs,
understand the environment, predict tool affordances, and generate executable
actions for robotic systems. Unlike traditional systems that depend on
hard-coded environmental information, PLATO employs a modular architecture of
specialized agents to operate without any initial knowledge of the environment.
These agents identify objects and their locations within the scene, generate a
comprehensive high-level plan, translate this plan into a series of low-level
actions, and verify the completion of each step. The system is tested in
particular on challenging tool-use tasks, which involve handling diverse objects
and require long-horizon planning. PLATO's design allows it to adapt to dynamic
and unstructured settings, significantly enhancing its flexibility and
robustness. By evaluating the system across a range of complex scenarios, we
demonstrate that it can handle a diverse set of tasks and offer a novel
solution for integrating LLMs with robotic platforms, advancing the
state of the art in autonomous robotic task execution. For videos and prompt
details, please see our project website:
https://sites.google.com/andrew.cmu.edu/plato
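
The agent sequence described in the abstract (identify objects, plan, translate to low-level actions, verify each step) can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: every name here (SceneObject, run_pipeline, the stub_* functions, and the action strings) is hypothetical, and each agent is a plain function standing in for what would be an LLM or perception call. Stopping on the first unverified step is also an assumption; the abstract does not say how failed steps are handled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SceneObject:
    """Hypothetical record for one object found in the scene."""
    name: str
    position: Tuple[float, float, float]


def run_pipeline(
    instruction: str,
    perceive: Callable[[str], List[SceneObject]],
    plan: Callable[[str, List[SceneObject]], List[str]],
    to_actions: Callable[[str, List[SceneObject]], List[str]],
    verify: Callable[[str, List[SceneObject]], bool],
) -> List[str]:
    """Chain the four agent roles and return the low-level actions issued."""
    scene = perceive(instruction)        # identify objects and locations
    steps = plan(instruction, scene)     # high-level plan
    issued: List[str] = []
    for step in steps:
        issued.extend(to_actions(step, scene))  # low-level actions
        # Assumption: abort on the first step that fails verification.
        if not verify(step, scene):
            raise RuntimeError(f"step not verified: {step}")
    return issued


# Stub agents (assumptions): fixed outputs in place of model calls.
def stub_perceive(instruction: str) -> List[SceneObject]:
    return [
        SceneObject("spatula", (0.4, 0.1, 0.0)),
        SceneObject("pan", (0.6, -0.2, 0.0)),
    ]


def stub_plan(instruction: str, scene: List[SceneObject]) -> List[str]:
    return ["grasp spatula", "move spatula to pan"]


def stub_to_actions(step: str, scene: List[SceneObject]) -> List[str]:
    by_name = {obj.name: obj for obj in scene}
    words = step.split()
    verb, target = words[0], words[-1]
    x, y, z = by_name[target].position
    actions = [f"move_to({x}, {y}, {z})"]
    if verb == "grasp":
        actions.append("close_gripper()")
    return actions


def stub_verify(step: str, scene: List[SceneObject]) -> bool:
    return True  # a real verifier would inspect the scene after execution
```

Calling run_pipeline("scoop with the spatula", stub_perceive, stub_plan, stub_to_actions, stub_verify) returns the flat list of action strings produced for both plan steps. Keeping each role behind its own callable mirrors the modular design the abstract describes, so any one agent can be swapped without touching the others.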