GOPS: A general optimal control problem solver for autonomous driving and industrial control applications

Communications in Transportation Research (IF 12.5, JCR Q1, Transportation) | Pub Date: 2023-04-17 | DOI: 10.1016/j.commtr.2023.100096

Wenxuan Wang, Yuhang Zhang, Jiaxin Gao, Yuxuan Jiang, Yujie Yang, Zhilong Zheng, Wenjun Zou, Jie Li, Congsheng Zhang, Wenhan Cao, Genjin Xie, Jingliang Duan, Shengbo Eben Li


Citations: 4

Abstract


GOPS: A general optimal control problem solver for autonomous driving and industrial control applications

Solving optimal control problems is a basic requirement of industrial control tasks. Existing methods such as model predictive control often suffer from a heavy online computational burden. Reinforcement learning (RL) has shown promise in computer and board games but has yet to be widely adopted in industrial applications because accessible, high-accuracy solvers are lacking. Current RL solvers are often developed for academic research and require substantial theoretical knowledge and programming skill. In addition, many of them support only Python-based environments and are limited to model-free algorithms. To address this gap, this paper develops the General Optimal control Problems Solver (GOPS), an easy-to-use RL solver package for building real-time, high-performance controllers in industrial fields. GOPS has a highly modular structure that leaves a flexible framework for secondary development. Given the diversity of industrial control tasks, GOPS also includes a conversion tool that allows Matlab/Simulink to be used for environment construction, controller design, and performance validation. To handle large-scale problems, GOPS can automatically create various serial and parallel trainers by flexibly combining embedded buffers and samplers. It offers a variety of common function approximators for the policy and value functions, including polynomials, multilayer perceptrons, and convolutional neural networks, among others. Constrained and robust algorithms for special industrial control systems with state constraints and model uncertainties are also integrated into GOPS. Several examples, including linear quadratic control, inverted double pendulum, vehicle tracking, humanoid robot, obstacle avoidance, and active suspension control, are tested to verify the performance of GOPS.
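Linear quadratic control is the first benchmark the abstract lists, and it is useful as a reference because its optimal solution is known in closed form. The sketch below is not GOPS code and does not use the GOPS API. It assumes a hypothetical scalar system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, and the function names `solve_scalar_lqr` and `rollout_cost` are illustrative only. It shows the kind of ground truth an RL solver's learned policy could be compared against.

```python
def solve_scalar_lqr(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the discrete-time Riccati recursion for a scalar system.

    System (assumed for illustration): x[k+1] = a*x[k] + b*u[k]
    Stage cost: q*x^2 + r*u^2, summed over an infinite horizon.
    Returns (p, gain), where p*x0^2 is the optimal cost from x0 and
    u = -gain * x is the optimal state-feedback policy.
    """
    p = q  # start from the one-step cost-to-go
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    gain = a * b * p / (r + b * b * p)
    return p, gain


def rollout_cost(a, b, q, r, gain, x0, steps=200):
    """Simulate the closed loop under u = -gain*x and accumulate the cost."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gain * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost


if __name__ == "__main__":
    p, gain = solve_scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
    print("Riccati solution p:", p)
    print("Feedback gain:", gain)
    print("Simulated cost from x0=2:", rollout_cost(1.0, 1.0, 1.0, 1.0, gain, 2.0))
```

The simulated cost should match p*x0^2 up to truncation of the horizon, which gives a simple check that a policy obtained by any other method, learned or otherwise, is close to optimal on this problem.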


Source journal

Communications in Transportation Research

CiteScore

15.20

Self-citation rate

0.00%

Articles published