Approximate Optimal Filter Design for Vehicle System through Actor-Critic Reinforcement Learning

Automotive Innovation | Pub Date: 2022-11-04 | DOI: 10.1007/s42154-022-00195-z | IF 4.8 | Region 1 (Engineering & Technology) | JCR Q1 (Engineering, Electrical & Electronic)
Yuming Yin, Shengbo Eben Li, Kaiming Tang, Wenhan Cao, Wei Wu, Hongbo Li
Automotive Innovation, Vol. 5, No. 4, pp. 415-426.
Citations: 1

Abstract

Precise state and parameter estimation is essential for the identification, analysis and control of vehicle engineering problems, especially under significant model and measurement uncertainties. Widely used filtering/estimation algorithms, such as the Kalman family (Kalman filter, extended Kalman filter, unscented Kalman filter) and the particle filter, generally aim to approach the true state/parameter distribution by iteratively updating the filter gain at each time step. However, the optimality of these filters deteriorates under an unrealistic initial condition or significant model error. As an alternative, this paper proposes to approximate the optimal filter gain by considering the influencing factors over an infinite time horizon, on the basis of estimation-control duality. The proposed approximate optimal filter (AOF) problem is formulated and then solved by an actor-critic reinforcement learning (RL) method. The AOF design transforms the traditional optimal filtering problem, which minimizes the expected mean square error, into an optimal control problem that minimizes the accumulated estimation error, in which the estimation error serves as the surrogate system state and the infinite-horizon filter gain is the control input. The estimation-control duality is proved to hold when certain conditions on the initial vehicle state distribution and the policy structure are satisfied. To evaluate the effectiveness of the AOF, a vehicle state estimation problem is demonstrated and compared with the steady-state Kalman filter. The results show that the filter policy obtained via RL with different discount factors converges to the theoretical optimal gain with an error within 5%, and that the average estimation errors of vehicle slip angle and yaw rate are less than 1.5 × 10⁻⁴.
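The steady-state Kalman filter used as the baseline in the abstract can be illustrated with a minimal sketch. It is not the paper's vehicle model or its actor-critic algorithm: the matrices `A`, `C`, `Q`, `R` and both function names are hypothetical, chosen only to give a small stable two-state linear system with one measurement. The sketch computes the steady-state (predictor-form) Kalman gain by Riccati iteration and then evaluates the steady-state estimation-error covariance produced by any fixed gain, which is the kind of quantity a learned infinite-horizon gain would be judged against.

```python
import numpy as np

# Hypothetical 2-state linear system (NOT the paper's vehicle model):
#   x[k+1] = A x[k] + w[k],  y[k] = C x[k] + v[k]
A = np.array([[0.9, 0.1],
              [-0.05, 0.8]])
C = np.array([[0.0, 1.0]])      # assume only the second state is measured
Q = np.diag([1e-4, 1e-4])       # assumed process-noise covariance
R = np.array([[1e-2]])          # assumed measurement-noise covariance


def steady_state_kalman_gain(A, C, Q, R, iters=2000):
    """Iterate the discrete Riccati recursion to its fixed point.

    Returns the predictor-form gain L and the prediction-error covariance P
    for the filter  xhat[k+1] = A xhat[k] + L (y[k] - C xhat[k]).
    """
    P = np.eye(A.shape[0])
    for _ in range(iters):
        S = C @ P @ C.T + R                     # innovation covariance
        L = A @ P @ C.T @ np.linalg.inv(S)      # gain for the current P
        P = A @ P @ A.T + Q - L @ S @ L.T       # Riccati update
    S = C @ P @ C.T + R
    L = A @ P @ C.T @ np.linalg.inv(S)
    return L, P


def steady_state_error_cov(A, C, Q, R, L, iters=2000):
    """Steady-state covariance of e[k+1] = (A - L C) e[k] + w[k] - L v[k].

    Valid only when (A - L C) is stable; found by fixed-point iteration
    of the corresponding Lyapunov equation.
    """
    F = A - L @ C
    W = Q + L @ R @ L.T
    Sigma = np.zeros_like(A)
    for _ in range(iters):
        Sigma = F @ Sigma @ F.T + W
    return Sigma


L_kf, P_kf = steady_state_kalman_gain(A, C, Q, R)
Sigma_kf = steady_state_error_cov(A, C, Q, R, L_kf)

# Two alternative fixed gains for comparison: no correction, and a perturbed gain.
L_zero = np.zeros_like(L_kf)
L_pert = L_kf + np.array([[0.05], [-0.05]])
Sigma_zero = steady_state_error_cov(A, C, Q, R, L_zero)
Sigma_pert = steady_state_error_cov(A, C, Q, R, L_pert)

print("Kalman gain:", L_kf.ravel())
print("trace of error covariance, Kalman gain:   ", np.trace(Sigma_kf))
print("trace of error covariance, zero gain:     ", np.trace(Sigma_zero))
print("trace of error covariance, perturbed gain:", np.trace(Sigma_pert))
```

In this sketch the Kalman gain yields the smallest steady-state mean square error among the fixed gains tried, which is why it is a natural reference for a gain obtained by other means, such as the RL policy described above.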

Source journal
Automotive Innovation (Engineering: Automotive Engineering)
CiteScore: 8.50
Self-citation rate: 4.90%
Articles published: 36
Journal description: Automotive Innovation is dedicated to the publication of innovative findings in the automotive field and related disciplines, covering principles, methodologies, theoretical studies, experimental studies, product engineering and engineering applications. The main topics include but are not limited to: energy-saving, electrification, intelligent and connected vehicles, new energy vehicles, safety and lightweight technologies. The journal presents the latest trends and advances in automotive technology.