
2021 60th IEEE Conference on Decision and Control (CDC): Latest Papers

Library-Based Norm-Optimal Iterative Learning Control
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9682812
James Reed, Maxwell J. Wu, K. Barton, C. Vermillion, K. Mishra
This paper presents a new iterative learning control (ILC) methodology, termed library-based norm-optimal ILC, which optimally accounts for variations in measurable disturbances and plant parameters from one iteration to the next. In this formulation, previous iteration-varying disturbance and/or plant parameters, along with the corresponding control and error sequences, are intelligently maintained in a dynamically evolving library. The library is then referenced at each iteration, in order to base the new control sequence on the most relevant prior iterations, according to an optimization metric. In contrast with the limited number of library-based ILC methodologies pursued in the literature, the present work (i) selects provably optimal interpolation weights, (ii) presents methods for starting with an empty library and intelligently truncating the library when it becomes too large, and (iii) demonstrates convergence to an optimal performance value. To demonstrate the effectiveness of our new methodology, we simulate our library-based norm-optimal ILC method on a linear time-varying model of a micro-robotic deposition system.
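For orientation, the classical norm-optimal ILC update can be sketched in scalar form. This is a minimal illustration under stated assumptions, not the paper's library-based algorithm: the static plant `y = g*u + d`, the weights `q` and `s`, and the function name are all hypothetical, and the disturbance `d` is held fixed across iterations (handling iteration-varying disturbances is what the paper itself addresses).

```python
def norm_optimal_ilc_scalar(g, d, r, q=1.0, s=0.5, iterations=20):
    """Scalar norm-optimal ILC on an assumed static plant y = g*u + d.

    At each iteration the update minimizes
        q * e_next**2 + s * (u_next - u)**2,
    where e_next = e - g * (u_next - u) is the predicted next error.
    Setting the derivative to zero gives the step q*g*e / (q*g**2 + s).
    """
    u = 0.0
    errors = []
    for _ in range(iterations):
        e = r - (g * u + d)               # tracking error of this iteration
        errors.append(e)
        u += q * g * e / (q * g * g + s)  # norm-optimal control update
    return u, errors
```

In this toy setting each iteration shrinks the error by the factor `s / (q*g**2 + s)`, so a larger `s` gives slower but more cautious learning.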
Citations: 0
Controlling Epidemics via Testing
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683289
Kyriakos Lotidis, A. L. Moustakas, N. Bambos
In this paper, we focus on the effect that testing centers (which detect and quarantine infected individuals) have on mitigating the evolution of an epidemic. We incorporate diffusion-style mobility of infected but undetected individuals, as opposed to detected and quarantined ones. We compute the total and maximum (over time) spatially averaged density of infected individuals (detected or not), which are useful metrics of the epidemic’s impact on a population, as functions of the testing center spatial density. Even under conditions where the epidemic has the natural potential to spread, we find that a ‘phase transition’ occurs as the testing center spatial density increases. For any testing density above a certain threshold, the epidemic is suppressed and dies out, while below that threshold it propagates and evolves naturally, albeit still depending strongly on the testing center density. This analysis further allows the testing center density to be optimized so that the epidemic’s evolution does not inundate or exhaust critical health care resources, such as ICU bed capacity.
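The threshold behavior described above can be illustrated with a much simpler stand-in. The sketch below is a well-mixed (non-spatial) SIR model in which undetected infected individuals are removed by testing at a rate `testing_rate`. It is an assumption made for illustration only: the paper's model includes spatial diffusion and testing center density, and the parameter values and function name here are hypothetical.

```python
def peak_infected(testing_rate, beta=0.5, gamma=0.2, i0=0.001, dt=0.1, steps=5000):
    """Peak infected fraction in an assumed well-mixed SIR model with testing.

    Undetected infected individuals recover at rate gamma and are detected
    (then quarantined) at rate testing_rate; both take them out of circulation.
    The model is integrated with a forward Euler scheme.
    """
    s, i = 1.0 - i0, i0
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i
        removals = (gamma + testing_rate) * i
        s -= dt * new_infections
        i += dt * (new_infections - removals)
        peak = max(peak, i)
    return peak
```

In this toy model the outbreak cannot grow once `testing_rate` exceeds `beta - gamma`, which plays the role of the threshold; below it, the peak still shrinks as testing increases.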
Citations: 0
A Distributionally Robust LQR for Systems with Multiple Uncertain Players
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9682976
Ioannis Tzortzis, C. D. Charalambous, C. Hadjicostis
In this paper, we study the robust linear quadratic regulator (LQR) problem for a class of discrete-time dynamical systems composed of several uncertain players with unknown or ambiguous distribution information. A distinctive feature of the assumed model is that each player is prescribed by a nominal probability distribution and categorized according to an uncertainty level of confidence. Our approach is based on minimax optimization. By following a dynamic programming approach, a closed-form expression of the robust control policy is derived. The effect of ambiguity on the performance of the LQR is studied via a sequential hierarchical game with one leader and several followers. The equilibrium solution is obtained through a maximizing, time-varying probability distribution characterizing each player’s optimal policy. The behavior of the proposed method is demonstrated through an application to a drop-shipping retail fulfillment model.
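As background for the dynamic programming step, the nominal (non-robust) finite-horizon LQR can be solved by a backward Riccati recursion. The scalar sketch below shows only that standard recursion; it does not include the minimax or ambiguity-set machinery of the paper, and the system `x_next = a*x + b*u`, the stage cost `q*x**2 + r*u**2`, and the function name are assumptions made for illustration.

```python
def scalar_lqr_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for an assumed scalar system x_next = a*x + b*u.

    Returns the feedback gains (control law u_k = -gains[k] * x_k) and the
    cost-to-go weight at time 0, for stage cost q*x**2 + r*u**2 and
    terminal cost q*x**2.
    """
    p = q                                  # terminal weight P_N = q
    gains = []
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)    # optimal gain given next-step weight
        p = q + a * a * p - a * b * p * k  # Riccati update for current weight
        gains.append(k)
    gains.reverse()                        # gains[0] is the gain at time 0
    return gains, p
```

For `a = b = q = r = 1` and a long horizon, the weight at time 0 approaches the golden ratio, the positive root of `P**2 = P + 1`, which gives a quick sanity check on the recursion.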
Citations: 5
Controllability of Sobolev-Type Linear Ensemble Systems
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683659
Wei Zhang, Lin Tie, Jr-Shin Li
Systems composed of large ensembles of isolated or interacting dynamic units are prevalent in nature and engineered infrastructures. Linear ensemble systems are inarguably the simplest class of ensemble systems and have attracted intensive attention from control theorists and practitioners in recent years. A comprehensive understanding of the dynamic properties of such systems nevertheless remains elusive and requires considerable knowledge and techniques beyond the reach of modern control theory. In this paper, we explore the classes of linear ensemble systems with system matrices that are not globally diagonalizable. In particular, we focus on analyzing their controllability properties under a Sobolev space setting and develop conditions under which uniform controllability of such ensemble systems is equivalent to that of their diagonalizable counterparts. This development significantly facilitates controllability analysis for linear ensemble systems through examining diagonalized linear systems.
Citations: 1
Linear Quadratic Tracking Control of Hidden Markov Jump Linear Systems Subject to Ambiguity
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683675
Ioannis Tzortzis, C. Hadjicostis, C. D. Charalambous
The linear quadratic tracking control problem is studied for a class of discrete-time uncertain Markov jump linear systems with time-varying conditional distributions. The controller is designed under the assumption that it has no access to the true states of the Markov chain and instead depends on the Markov chain state estimates. To deal with uncertainty, the transition probabilities of Markov state estimates between the different operating modes of the system are considered to belong to an ambiguity set of some nominal transition probabilities. The estimation problem is solved via the one-step forward Viterbi algorithm, while the stochastic control problem is solved via minimax optimization theory. An optimal control policy with some desired robustness properties is designed, and a maximizing time-varying transition probability distribution is obtained. A numerical example is given to illustrate the applicability and effectiveness of the proposed approach.
Citations: 0
Distributed Consensus of Stochastic Multi-agent Systems with Prescribed Performance Constraints
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683249
Pushpak Jagtap, Dimos V. Dimarogonas
This paper focuses on the problem of distributed consensus control of multi-agent systems while considering two main practical concerns: (i) stochastic noise in the agent dynamics and (ii) predefined performance constraints over the evolution of multi-agent systems. In particular, we consider that each agent is driven by a stochastic differential equation with state-dependent noise, which makes the considered problem more challenging compared to non-stochastic agents. The work provides sufficient conditions under which the proposed time-varying distributed control laws ensure consensus in expectation and almost sure consensus of stochastic multi-agent systems while satisfying prescribed performance constraints over the evolution of the systems in the sense of the qth moment. Finally, we demonstrate the effectiveness of the proposed results with a numerical example.
Citations: 0
Spatio-temporal constrained zonotopes for validation of optimal control problems
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683301
Etienne Bertin, B. Hérissé, Julien Alexandre Dit Sandretto, Alexandre Chapoutot
A controlled system subject to dynamics with unknown but bounded parameters is considered. The control is defined as the solution of an optimal control problem, which induces hybrid dynamics. A method to enclose all optimal trajectories of this system is proposed. Using interval- and zonotope-based validated simulation and Pontryagin’s Maximum Principle, a characterization of optimal trajectories, a conservative enclosure is constructed. The usual validated simulation framework is modified so that possible trajectories are enclosed with spatio-temporal zonotopes that simplify simulation through events. Then optimality conditions are propagated backward in time and added as constraints on the previously computed enclosure. The obtained constrained zonotopes form a thin enclosure of all optimal trajectories that is less susceptible to accumulation of error. This algorithm is applied to Goddard’s problem, an aerospace problem with bang-bang control.
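To make the set representation concrete, the sketch below shows two basic operations on a plain zonotope, the set of points `c + sum(xi_i * g_i)` with each `xi_i` in `[-1, 1]`: a linear map and the interval hull. It covers only ordinary zonotopes, not the spatio-temporal or constrained variants used in the paper, and the function names and list-based data layout are assumptions made for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def linear_map(matrix, center, generators):
    """Image of a zonotope under x -> matrix @ x.

    The image is again a zonotope: map the center and every generator.
    """
    return matvec(matrix, center), [matvec(matrix, g) for g in generators]


def interval_hull(center, generators):
    """Tightest axis-aligned box containing the zonotope.

    Along each axis the radius is the sum of absolute generator entries.
    """
    radii = [sum(abs(g[i]) for g in generators) for i in range(len(center))]
    return [(c - r, c + r) for c, r in zip(center, radii)]
```

Because linear maps act on the center and generators directly, propagating a zonotope through linear dynamics is exact, which is one reason zonotopes are popular in validated simulation.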
Citations: 2
Voronoi Progressive Widening: Efficient Online Solvers for Continuous State, Action, and Observation POMDPs
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683490
M. H. Lim, C. Tomlin, Zachary Sunberg
This paper introduces Voronoi Progressive Widening (VPW), a generalization of Voronoi optimistic optimization (VOO) and action progressive widening to partially observable Markov decision processes (POMDPs). Tree search algorithms can use VPW to effectively handle continuous or hybrid action spaces by efficiently balancing local and global action searching. This paper proposes two VPW-based algorithms and analyzes them from theoretical and simulation perspectives. Voronoi Optimistic Weighted Sparse Sampling (VOWSS) is a theoretical tool that justifies VPW-based online solvers, and it is the first algorithm with global convergence guarantees for continuous state, action, and observation POMDPs. Voronoi Optimistic Monte Carlo Planning with Observation Weighting (VOMCPOW) is a versatile and efficient algorithm that consistently outperforms state-of-the-art POMDP algorithms in several simulation experiments.
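The two ingredients named above can be sketched for a one-dimensional action space. The code below is a simplified illustration, not the VOWSS or VOMCPOW algorithms: it shows a common form of the progressive widening test and a Voronoi-style proposal that mixes uniform exploration with sampling near the best action found so far. The function names, the parameters `k`, `alpha`, and `omega`, and the use of rejection sampling are assumptions made here.

```python
import random


def should_widen(num_children, num_visits, k=3.0, alpha=0.5):
    """Progressive widening test: allow a new child while |C| <= k * N**alpha."""
    return num_children <= k * num_visits ** alpha


def sample_voronoi_action(actions, values, rng, omega=0.3, low=0.0, high=1.0):
    """Propose a new 1-D action (existing actions are assumed distinct).

    With probability omega, explore uniformly over [low, high]. Otherwise,
    rejection-sample a point whose nearest existing action is the current
    best one, i.e. a point inside the best action's Voronoi cell.
    """
    if not actions or rng.random() < omega:
        return rng.uniform(low, high)
    best = max(range(len(actions)), key=lambda i: values[i])
    while True:
        candidate = rng.uniform(low, high)
        nearest = min(range(len(actions)),
                      key=lambda i: abs(candidate - actions[i]))
        if nearest == best:
            return candidate
```

A tree search would call `should_widen` at a node to decide whether to add an action, and `sample_voronoi_action` to choose which one; `omega` trades off global exploration against local refinement around the best action.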
Citations: 14
Identification-based Adaptive Control for Systems with Time-varying Parameters
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683217
Kaiwen Chen, A. Astolfi
This paper proposes an identification-based adaptive control scheme for nonlinear systems with time-varying parameters, designed on the basis of the so-called congelation of variables method. First, a scalar example is discussed to demonstrate the design methodology, which relies on re-arranging the identifier subsystems from a cascaded topology to a cyclic topology. A small-gain-like control synthesis exploiting the cyclic topology is then presented to replace the classical control synthesis based on the swapping lemma, which exploits the cascaded topology. Then a state feedback design for a class of lower triangular nonlinear systems is presented, combining the same design methodology with backstepping techniques. Boundedness of all closed-loop signals and convergence of the system state are proved. Finally, simulation results are presented showing that the proposed controller outperforms the classical design.
Citations: 1
Data-Driven Control of Nonlinear Systems: Learning Koopman Operators for Policy Gradient
Pub Date : 2021-12-14 DOI: 10.1109/CDC45484.2021.9683220
Francesco Zanini, A. Chiuso
Data-driven control of nonlinear dynamical systems is a largely open problem. In this paper, building upon the theory of Koopman operators and exploiting ideas from policy gradient methods in reinforcement learning, a novel approach for data-driven optimal control of unknown nonlinear dynamical systems is introduced.
Citations: 1