Distributed optimal synchronization control of linear networked systems under unknown dynamics
Farzaneh Tatari, M. Naghibi-Sistani, K. Vamvoudakis
2017 American Control Conference (ACC), published 2017-05-24
DOI: 10.23919/ACC.2017.7963029 (https://doi.org/10.23919/ACC.2017.7963029)
Citations: 8
Abstract
This work proposes an online optimal distributed learning algorithm that finds the game-theoretic solution for systems on graphs with completely unknown dynamics. The algorithm learns online an approximate solution to the cooperative coupled Hamilton-Jacobi (HJ) equations. Each player employs an actor/critic network structure to learn the optimal cost and the optimal policy, together with intelligent identifiers that remove the need to know the system dynamics. Recorded experiences are used concurrently with current data to guarantee proper state exploration. The closed-loop system is proved to be stable, and the policies are shown to form a Nash equilibrium. Finally, simulation results verify the effectiveness of the proposed approach.
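The idea of using recorded experiences concurrently with current data can be illustrated with a toy parameter-estimation sketch. This is not the paper's algorithm: the linear-in-parameters model, the function names (`predict`, `concurrent_update`, `estimate`), the step size, and all data values are assumptions chosen only for illustration. The sketch shows a gradient update on the prediction error summed over the current sample and a stored memory of past samples. When the current data excites only one direction, the stored samples supply the missing information.

```python
def predict(theta, phi):
    """Linear-in-parameters model output: theta^T phi."""
    return sum(t * p for t, p in zip(theta, phi))


def concurrent_update(theta, current, memory, lr):
    """One gradient step on the squared prediction error, summed over
    the current sample and every recorded sample in memory."""
    grad = [0.0] * len(theta)
    for phi, y in [current] + memory:
        err = predict(theta, phi) - y
        for k, p in enumerate(phi):
            grad[k] += err * p
    return [t - lr * g for t, g in zip(theta, grad)]


def estimate(memory, steps=200, lr=0.1):
    """Run the update with a current regressor that never changes
    direction (an assumed, deliberately non-exciting data stream)."""
    true_theta = [2.0, -1.0]          # hypothetical unknown parameters
    theta = [0.0, 0.0]                # initial estimate
    phi_now = [1.0, 0.0]              # current data only excites theta[0]
    current = (phi_now, predict(true_theta, phi_now))
    for _ in range(steps):
        theta = concurrent_update(theta, current, memory, lr)
    return theta


# Recorded samples (assumed) whose regressors span both directions.
TRUE_THETA = [2.0, -1.0]
RECORDED = [(phi, predict(TRUE_THETA, phi)) for phi in ([1.0, 1.0], [1.0, -1.0])]

theta_current_only = estimate(memory=[])
theta_with_memory = estimate(memory=RECORDED)
print("current data only:   ", theta_current_only)
print("with recorded samples:", theta_with_memory)
```

In this toy setting, the estimate built from current data alone never corrects its second component, while the estimate that also replays the recorded samples approaches both assumed parameters. The paper applies the concurrent-data idea inside its actor/critic and identifier learning, which this sketch does not attempt to reproduce.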