{"title":"基于Actor-Critic强化学习的车辆系统近似最优滤波器设计","authors":"Yuming Yin, Shengbo Eben Li, Kaiming Tang, Wenhan Cao, Wei Wu, Hongbo Li","doi":"10.1007/s42154-022-00195-z","DOIUrl":null,"url":null,"abstract":"<div><p>Precise state and parameter estimations are essential for identification, analysis and control of vehicle engineering problems, especially under significant model and measurement uncertainties. The widely used filtering/estimation algorithms, such as Kalman series like Kalman filter, extended Kalman filter, unscented Kalman filter, and particle filter, generally aim to approach the true state/parameter distribution via iteratively updating the filter gain at each time step. However, the optimality of these filters would be deteriorated by unrealistic initial condition or significant model error. Alternatively, this paper proposes to approximate the optimal filter gain by considering the effect factors within infinite time horizon, on the basis of estimation-control duality. The proposed approximate optimal filter (AOF) problem is designed and subsequently solved by actor-critic reinforcement learning (RL) method. The AOF design transforms the traditional optimal filtering problem with the minimum expected mean square error into an optimal control problem with the minimum accumulated estimation error, in which the estimation error is used as the surrogate system state and the infinite-horizon filter gain is the control input. The estimation-control duality is proved to hold when certain conditions about initial vehicle state distributions and policy structure are maintained. In order to evaluate of the effectiveness of AOF, a vehicle state estimation problem is then demonstrated and compared with the steady-state Kalman filter. The results showed that the obtained filter policy via RL with different discount factors can converge to theoretical optimal gain with an error within 5%, and the average estimation errors of vehicle slip angle and yaw rate are less than 1.5 × 10<sup>–4</sup>.</p></div>","PeriodicalId":36310,"journal":{"name":"Automotive Innovation","volume":"5 4","pages":"415 - 426"},"PeriodicalIF":4.8000,"publicationDate":"2022-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Approximate Optimal Filter Design for Vehicle System through Actor-Critic Reinforcement Learning\",\"authors\":\"Yuming Yin, Shengbo Eben Li, Kaiming Tang, Wenhan Cao, Wei Wu, Hongbo Li\",\"doi\":\"10.1007/s42154-022-00195-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Precise state and parameter estimations are essential for identification, analysis and control of vehicle engineering problems, especially under significant model and measurement uncertainties. The widely used filtering/estimation algorithms, such as Kalman series like Kalman filter, extended Kalman filter, unscented Kalman filter, and particle filter, generally aim to approach the true state/parameter distribution via iteratively updating the filter gain at each time step. However, the optimality of these filters would be deteriorated by unrealistic initial condition or significant model error. Alternatively, this paper proposes to approximate the optimal filter gain by considering the effect factors within infinite time horizon, on the basis of estimation-control duality. The proposed approximate optimal filter (AOF) problem is designed and subsequently solved by actor-critic reinforcement learning (RL) method. 
The AOF design transforms the traditional optimal filtering problem with the minimum expected mean square error into an optimal control problem with the minimum accumulated estimation error, in which the estimation error is used as the surrogate system state and the infinite-horizon filter gain is the control input. The estimation-control duality is proved to hold when certain conditions about initial vehicle state distributions and policy structure are maintained. In order to evaluate of the effectiveness of AOF, a vehicle state estimation problem is then demonstrated and compared with the steady-state Kalman filter. The results showed that the obtained filter policy via RL with different discount factors can converge to theoretical optimal gain with an error within 5%, and the average estimation errors of vehicle slip angle and yaw rate are less than 1.5 × 10<sup>–4</sup>.</p></div>\",\"PeriodicalId\":36310,\"journal\":{\"name\":\"Automotive Innovation\",\"volume\":\"5 4\",\"pages\":\"415 - 426\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2022-11-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Automotive Innovation\",\"FirstCategoryId\":\"1087\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s42154-022-00195-z\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automotive Innovation","FirstCategoryId":"1087","ListUrlMain":"https://link.springer.com/article/10.1007/s42154-022-00195-z","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Approximate Optimal Filter Design for Vehicle System through Actor-Critic Reinforcement Learning
Precise state and parameter estimation is essential for the identification, analysis, and control of vehicle engineering problems, especially under significant model and measurement uncertainties. Widely used filtering/estimation algorithms, such as the Kalman family (the Kalman filter, extended Kalman filter, and unscented Kalman filter) and the particle filter, generally aim to approach the true state/parameter distribution by iteratively updating the filter gain at each time step. However, the optimality of these filters degrades under unrealistic initial conditions or significant model error. Alternatively, this paper proposes to approximate the optimal filter gain by considering the influencing factors over an infinite time horizon, on the basis of estimation-control duality. The proposed approximate optimal filter (AOF) problem is formulated and subsequently solved by an actor-critic reinforcement learning (RL) method. The AOF design transforms the traditional optimal filtering problem, which minimizes the expected mean square error, into an optimal control problem that minimizes the accumulated estimation error, in which the estimation error serves as the surrogate system state and the infinite-horizon filter gain is the control input. The estimation-control duality is proved to hold when certain conditions on the initial vehicle state distribution and the policy structure are satisfied. To evaluate the effectiveness of the AOF, a vehicle state estimation problem is then demonstrated and compared with the steady-state Kalman filter. The results show that the filter policies obtained via RL with different discount factors converge to the theoretical optimal gain with an error within 5%, and the average estimation errors of the vehicle slip angle and yaw rate are less than 1.5 × 10⁻⁴.
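To make the estimation-control duality concrete, the sketch below sets up a toy two-state linear system (the matrices A, C, Q, R are illustrative assumptions, not the paper's vehicle model), computes the steady-state Kalman gain via the discrete Riccati recursion as the baseline, and then searches for a constant filter gain that minimizes the discounted accumulated squared estimation error, i.e., the AOF cost with the gain as the infinite-horizon control input. For brevity, the paper's actor-critic solver is replaced here by a much simpler zeroth-order policy search; only the problem formulation mirrors the AOF idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-state linear system used only for illustration; the states are
# loosely analogous to vehicle slip angle and yaw rate, but the matrices
# below are assumed values, not taken from the paper.
A = np.array([[0.90, 0.10],
              [-0.20, 0.95]])          # state transition
C = np.array([[0.0, 1.0]])            # only the second state is measured
Q = 1e-4 * np.eye(2)                  # process noise covariance
R = np.array([[1e-3]])                # measurement noise covariance


def steady_state_kalman_gain(A, C, Q, R, iters=500):
    """Iterate the discrete Riccati recursion until the gain settles."""
    n = A.shape[0]
    P = np.eye(n)
    K = np.zeros((n, C.shape[0]))
    for _ in range(iters):
        P_pred = A @ P @ A.T + Q                 # predicted covariance
        S = C @ P_pred @ C.T + R                 # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
        P = (np.eye(n) - K @ C) @ P_pred         # updated covariance
    return K


def rollout_cost(K, gamma=0.99, steps=200, episodes=20):
    """Discounted accumulated squared estimation error for a fixed gain K.

    The estimation error e_k = x_k - x_hat_k plays the role of the
    surrogate system state in the AOF formulation.
    """
    total = 0.0
    for _ in range(episodes):
        x = rng.normal(size=2)        # true state, random initial condition
        x_hat = np.zeros(2)           # estimate starts deliberately mismatched
        for k in range(steps):
            x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
            y = C @ x + rng.multivariate_normal(np.zeros(1), R)
            x_pred = A @ x_hat
            x_hat = x_pred + K @ (y - C @ x_pred)   # constant-gain filter
            e = x - x_hat
            total += gamma**k * float(e @ e)
    return total / episodes


def learn_gain(gamma=0.99, iters=300, lr=0.05, sigma=0.05):
    """Zeroth-order policy search for a constant filter gain.

    A deliberately simplified stand-in for the paper's actor-critic RL
    solver: the "actor" is the constant gain matrix, and the gradient is
    estimated by antithetic finite differences of the rollout cost.
    """
    K = np.zeros((2, 1))
    for _ in range(iters):
        eps = sigma * rng.normal(size=K.shape)
        grad = (rollout_cost(K + eps, gamma)
                - rollout_cost(K - eps, gamma)) / (2.0 * sigma**2) * eps
        K -= lr * grad / (np.linalg.norm(grad) + 1e-8)  # normalized step
    return K


if __name__ == "__main__":
    K_kf = steady_state_kalman_gain(A, C, Q, R)
    K_rl = learn_gain()
    print("steady-state Kalman gain:\n", K_kf)
    print("learned constant gain:\n", K_rl)
```

Under these assumptions, the learned constant gain should land in the vicinity of the steady-state Kalman gain, which is the qualitative behavior the paper quantifies (RL policies converging to the theoretical optimal gain with an error within 5%).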
Journal Introduction:
Automotive Innovation is dedicated to the publication of innovative findings in the automotive field and related disciplines, covering principles, methodologies, theoretical studies, experimental studies, product engineering, and engineering applications. The main topics include, but are not limited to: energy saving, electrification, intelligent and connected vehicles, new energy vehicles, safety, and lightweight technologies. The journal presents the latest trends and advances in automotive technology.