{"title":"近最优非线性控制的长序列相同动作的乐观规划","authors":"Koppány Máthé, L. Buşoniu, L. Miclea","doi":"10.1109/AQTR.2014.6857826","DOIUrl":null,"url":null,"abstract":"Optimistic planning for deterministic systems (OPD) is an algorithm able to find near-optimal control for very general, nonlinear systems. OPD iteratively builds near-optimal sequences of actions by always refining the most promising sequence; this is done by adding all possible one-step actions. However, OPD has large computational costs, which might be undesirable in real life applications. This paper proposes an adaptation of OPD for a specific subclass of control problems where control actions do not change often (e.g. bang-bang, time-optimal control). The new algorithm is called Optimistic Planning with K identical actions (OKP), and it refines sequences by adding, in addition to one-step actions, also repetitions of each action up to K times. Our analysis proves that the a posteriori performance guarantees are similar to those of OPD, improving with the length of the explored sequences, though the asymptotic behaviour of OKP cannot be formally predicted a priori. Simulations illustrate that for properly chosen parameter K, in a control problem from the class considered, OKP outperforms OPD.","PeriodicalId":297141,"journal":{"name":"2014 IEEE International Conference on Automation, Quality and Testing, Robotics","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimistic planning with long sequences of identical actions for near-optimal nonlinear control\",\"authors\":\"Koppány Máthé, L. Buşoniu, L. Miclea\",\"doi\":\"10.1109/AQTR.2014.6857826\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Optimistic planning for deterministic systems (OPD) is an algorithm able to find near-optimal control for very general, nonlinear systems. OPD iteratively builds near-optimal sequences of actions by always refining the most promising sequence; this is done by adding all possible one-step actions. However, OPD has large computational costs, which might be undesirable in real life applications. This paper proposes an adaptation of OPD for a specific subclass of control problems where control actions do not change often (e.g. bang-bang, time-optimal control). The new algorithm is called Optimistic Planning with K identical actions (OKP), and it refines sequences by adding, in addition to one-step actions, also repetitions of each action up to K times. Our analysis proves that the a posteriori performance guarantees are similar to those of OPD, improving with the length of the explored sequences, though the asymptotic behaviour of OKP cannot be formally predicted a priori. 
Simulations illustrate that for properly chosen parameter K, in a control problem from the class considered, OKP outperforms OPD.\",\"PeriodicalId\":297141,\"journal\":{\"name\":\"2014 IEEE International Conference on Automation, Quality and Testing, Robotics\",\"volume\":\"40 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-05-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE International Conference on Automation, Quality and Testing, Robotics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AQTR.2014.6857826\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Conference on Automation, Quality and Testing, Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AQTR.2014.6857826","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Optimistic planning for deterministic systems (OPD) is an algorithm that finds near-optimal control for very general nonlinear systems. OPD iteratively builds near-optimal sequences of actions by always refining the most promising sequence, extending it with all possible one-step actions. However, OPD has a large computational cost, which can be undesirable in real-life applications. This paper proposes an adaptation of OPD for a specific subclass of control problems in which the control action changes infrequently (e.g., bang-bang or time-optimal control). The new algorithm, called Optimistic Planning with K identical actions (OKP), refines sequences by adding not only one-step actions but also repetitions of each action up to K times. Our analysis proves that the a posteriori performance guarantees are similar to those of OPD, improving with the length of the explored sequences, although the asymptotic behaviour of OKP cannot be formally predicted a priori. Simulations illustrate that, for a properly chosen parameter K, OKP outperforms OPD in a control problem from the class considered.
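To make the expansion rule concrete, below is a minimal sketch of the OKP idea described in the abstract, under the standard OPD assumptions of a discounted-reward formulation with rewards in [0, 1] and a known deterministic simulator. The function `okp_plan`, the simulator `step(state, action)`, and the parameter names `gamma`, `K`, and `budget` are illustrative choices, not the paper's actual interface; the sketch also does not prune sequences reachable through several expansion orders, which a full implementation would handle.

```python
# Minimal OKP-style planner sketch (assumed interface, not the paper's code).
import heapq
import itertools


def okp_plan(x0, actions, step, gamma=0.95, K=3, budget=200):
    """Return the first action of the best sequence found within `budget` expansions.

    Each node is (state, discounted reward so far, depth, action sequence).
    Expanding a node adds, for every discrete action, the children obtained by
    repeating that action 1, 2, ..., K times; nodes are expanded in order of
    their optimistic upper bound (b-value), as in OPD.
    """
    counter = itertools.count()                     # tie-breaker for the heap
    root = (x0, 0.0, 0, [])
    # b-value of a node = reward so far + gamma^depth / (1 - gamma)
    heap = [(-(1.0 / (1.0 - gamma)), next(counter), root)]
    best_value, best_sequence = -float("inf"), [actions[0]]

    for _ in range(budget):
        if not heap:
            break
        _, _, (x, value, depth, seq) = heapq.heappop(heap)   # most promising leaf
        for a in actions:
            xr, vr, dr = x, value, depth
            for rep in range(1, K + 1):             # repeat the same action 1..K times
                xr, r = step(xr, a)                 # simulate one more step of action a
                vr += (gamma ** dr) * r
                dr += 1
                child_seq = seq + [a] * rep
                if vr > best_value:                 # track the best lower bound found
                    best_value, best_sequence = vr, child_seq
                b = vr + (gamma ** dr) / (1.0 - gamma)
                heapq.heappush(heap, (-b, next(counter), (xr, vr, dr, child_seq)))

    return best_sequence[0]
```

Setting K = 1 recovers plain one-step expansion (i.e., OPD), while a larger K lets the planner reach long constant-action sequences with far fewer expansions, which is the regime (bang-bang, time-optimal control) targeted by the paper.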