LeTO: Learning Constrained Visuomotor Policy With Differentiable Trajectory Optimization
Authors: Zhengtong Xu; Yu She
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8567-8578 (published 2024-10-31)
DOI: 10.1109/TASE.2024.3486542
URL: https://ieeexplore.ieee.org/document/10740461/
Journal metrics: impact factor 6.4; JCR Q1 (Automation & Control Systems)
This paper introduces LeTO, a method for learning constrained visuomotor policies with differentiable trajectory optimization. Our approach integrates a differentiable optimization layer into the neural network. By formulating the optimization layer as a trajectory optimization problem, we enable the model to generate actions end-to-end in a safe and constraint-controlled fashion without extra modules. Our method allows constraint information to be introduced during training, thereby balancing the training objectives of satisfying constraints, smoothing the trajectories, and minimizing errors with respect to demonstrations. This “gray box” method marries optimization-based safety and interpretability with the powerful representational abilities of neural networks. We quantitatively evaluate LeTO in simulation and on a real robot. The results demonstrate that LeTO performs well in both simulated and real-world tasks. In addition, it generates trajectories that are less uncertain, higher quality, and smoother than those of existing imitation learning methods. LeTO thus provides a practical example of how to integrate neural networks with trajectory optimization. We release our code at https://github.com/ZhengtongXu/LeTO.

Note to Practitioners: LeTO is driven by the goal of developing an imitation learning algorithm capable of generating safe and constraint-satisfying robotic behaviors. The idea of imitation learning is to enable the robot to learn from human demonstrations of certain tasks, so that it can subsequently perform the learned tasks autonomously. Thanks to the powerful representational and fitting capabilities of neural networks, imitation learning can let robots perform complex manipulation tasks. However, neural networks often exhibit a certain level of uncertainty and lack theoretical safety guarantees. For robotic systems, it is crucial that robot behaviors meet specific constraints; otherwise, the system may not be sufficiently reliable. Therefore, we introduce LeTO, an approach that integrates trajectory optimization with neural networks to generate actions that not only achieve manipulation tasks but also comply with constraints. This improves the interpretability, safety, and reliability of robot policies acquired through imitation learning, facilitating their deployment in scenarios with high safety requirements.
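The trade-off the abstract describes (staying close to a reference, smoothing the trajectory, and satisfying constraints) can be illustrated with a toy one-dimensional problem. The sketch below is not the authors' implementation, which is available at the repository linked above. The quadratic cost, the box constraint, the projected-gradient solver, and every name in it (`optimize_trajectory`, `roughness`, `smooth_weight`, and so on) are assumptions chosen for illustration. It covers only the forward solve of such a layer; in LeTO the optimization layer is also differentiable so that it can be trained end-to-end, which this sketch does not attempt.

```python
from typing import List


def roughness(traj: List[float]) -> float:
    """Sum of squared differences between consecutive trajectory points."""
    return sum((b - a) ** 2 for a, b in zip(traj, traj[1:]))


def optimize_trajectory(
    reference: List[float],
    lower: float,
    upper: float,
    smooth_weight: float = 1.0,
    step_size: float = 0.1,
    iterations: int = 500,
) -> List[float]:
    """Projected gradient descent (illustrative assumption, not LeTO's solver).

    Minimizes  sum_t (u_t - r_t)^2 + smooth_weight * sum_t (u_{t+1} - u_t)^2
    subject to lower <= u_t <= upper for every t.

    The gradient of this cost is Lipschitz with constant at most
    2 + 8 * smooth_weight, so step_size should not exceed the reciprocal
    of that value (0.1 for the default smooth_weight of 1.0).
    """

    def clip(value: float) -> float:
        # Projection onto the box constraint [lower, upper].
        return min(max(value, lower), upper)

    n = len(reference)
    # Start from the reference projected onto the feasible set.
    u = [clip(r) for r in reference]
    for _ in range(iterations):
        # Gradient of the tracking term.
        grad = [2.0 * (u[t] - reference[t]) for t in range(n)]
        # Gradient of the smoothness term.
        for t in range(n - 1):
            diff = u[t + 1] - u[t]
            grad[t] -= 2.0 * smooth_weight * diff
            grad[t + 1] += 2.0 * smooth_weight * diff
        # Gradient step followed by projection back into the box.
        u = [clip(u[t] - step_size * grad[t]) for t in range(n)]
    return u


# Hypothetical reference, standing in for what a network might predict.
reference = [0.0, 2.0, -1.5, 0.5, 3.0]
smoothed = optimize_trajectory(reference, lower=-1.0, upper=1.0)
print(smoothed)
```

Because every iterate is projected back into the box, the returned trajectory always satisfies the bounds, and raising `smooth_weight` trades tracking accuracy for smoothness.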
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.