{"title":"Output Feedback H∞ Control of Unknown Discrete-time Linear Systems: Off-policy Reinforcement Learning","authors":"P. Tooranjipour, Bahare Kiumarsi-Khomartash","doi":"10.1109/CDC45484.2021.9683057","DOIUrl":null,"url":null,"abstract":"In this paper, a data-driven output feedback approach is developed for solving H∞ control problem of linear discrete-time systems based on off-policy reinforcement learning (RL) algorithm. Past input-output measurements are leveraged to implicitly reconstruct the system's states to alleviate the requirement to measure or estimate the system's states. Then, an off-policy input-output Bellman equation is derived based on this implicit reconstruction to evaluate control policies using only input-output measurements. An improved control policy is then learned utilizing the solution to the Bellman equation without knowing the system's dynamics. In the proposed approach, unlike the on-policy methods, the disturbance does not need to be updated in a predefined manner at each iteration, which makes it more practical. While the state-feedback off-policy RL method is shown to be a bias-free approach for deterministic systems, it is shown that once the system's states have been reconstructed from the input-output measurements, the input-output off-policy method cannot be considered as an immune approach against the probing noises. To cope with this, a discount factor is utilized in the performance function to decay the deleterious effect of probing noises. Finally, to illustrate the sensitivity of the problem to the probing noises and the efficacy of the proposed approach, the flight control system is tested in the simulation.","PeriodicalId":229089,"journal":{"name":"2021 60th IEEE Conference on Decision and Control (CDC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 60th IEEE Conference on Decision and Control (CDC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CDC45484.2021.9683057","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In this paper, a data-driven output feedback approach is developed for solving the H∞ control problem of linear discrete-time systems based on an off-policy reinforcement learning (RL) algorithm. Past input-output measurements are leveraged to implicitly reconstruct the system's states, alleviating the requirement to measure or estimate them directly. An off-policy input-output Bellman equation is then derived based on this implicit reconstruction to evaluate control policies using only input-output measurements. An improved control policy is subsequently learned from the solution of the Bellman equation, without knowledge of the system's dynamics. Unlike on-policy methods, the proposed approach does not require the disturbance to be updated in a predefined manner at each iteration, which makes it more practical. While the state-feedback off-policy RL method is known to be bias-free for deterministic systems, it is shown that once the system's states have been reconstructed from input-output measurements, the input-output off-policy method is no longer immune to probing noises. To cope with this, a discount factor is introduced into the performance function to attenuate the deleterious effect of the probing noises. Finally, to illustrate the sensitivity of the problem to probing noises and the efficacy of the proposed approach, a flight control system is tested in simulation.
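The proposed method itself is data-driven and uses only input-output measurements; the sketch below is only a rough point of reference: a model-based value iteration on the discounted zero-sum game (H∞) formulation that underlies the design. The plant matrices A, B, E, the weights Q and R, the discount factor gamma, and the attenuation level gamma_att are illustrative assumptions, not values from the paper.

```python
# A minimal, model-based sketch of the discounted zero-sum game behind
# discrete-time H-infinity control -- NOT the paper's data-driven
# input-output off-policy algorithm.  A, B, E, Q, R, gamma (discount)
# and gamma_att (attenuation level) are illustrative placeholders.
import numpy as np

# Illustrative plant: x_{k+1} = A x_k + B u_k + E w_k
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
E = np.array([[0.1],
              [0.0]])
Q = np.eye(2)        # state weighting
R = np.eye(1)        # control weighting
gamma = 0.95         # discount factor (decays the effect of probing noise)
gamma_att = 2.0      # prescribed disturbance attenuation level

P = np.zeros_like(Q)
for _ in range(1000):
    # Saddle-point gains of the quadratic game under the current value P
    Huu = R + gamma * B.T @ P @ B
    Hww = gamma * E.T @ P @ E - gamma_att**2 * np.eye(E.shape[1])
    Huw = gamma * B.T @ P @ E
    H = np.block([[Huu, Huw],
                  [Huw.T, Hww]])
    G = gamma * np.vstack([B.T @ P @ A, E.T @ P @ A])
    KL = np.linalg.solve(H, G)     # stacked gains [K; L]
    K = KL[:B.shape[1], :]         # control policy       u_k = -K x_k
    L = KL[B.shape[1]:, :]         # worst-case disturbance w_k = -L x_k
    # Value-iteration (Bellman) update of the game value matrix
    Acl = A - B @ K - E @ L
    P_next = (Q + K.T @ R @ K - gamma_att**2 * L.T @ L
              + gamma * Acl.T @ P @ Acl)
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

print("Game value matrix P:\n", P)
print("Control gain K:\n", K)
print("Worst-case disturbance gain L:\n", L)
```

In the paper's setting, the corresponding value matrix and gains would instead be estimated from measured input-output data by solving the off-policy Bellman equation, with the discount factor limiting the bias introduced by probing noise.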