Performance–Robustness Tradeoffs in Adversarially Robust Control and Estimation

IEEE Transactions on Automatic Control, vol. 70, no. 5, pp. 3133–3148 · IF 7.0 · CAS Region 1 (Computer Science) · JCR Q1 (Automation & Control Systems) · Published: 2024-11-07 · DOI: 10.1109/TAC.2024.3492318
Bruce D. Lee;Thomas T.C.K. Zhang;Hamed Hassani;Nikolai Matni
{"title":"Performance–Robustness Tradeoffs in Adversarially Robust Control and Estimation","authors":"Bruce D. Lee;Thomas T.C.K. Zhang;Hamed Hassani;Nikolai Matni","doi":"10.1109/TAC.2024.3492318","DOIUrl":null,"url":null,"abstract":"Efforts by the reinforcement learning community to close the sim-to-real gap have resulted in policy optimization objectives, which are distinct from, although related to, existing objectives in robust control, such as <inline-formula><tex-math>${\\mathcal {H}}_{\\infty }$</tex-math></inline-formula> methods. The disparity from the familiar control methods makes it challenging to make rigorous claims about these methods, and to predict the implications on performance of training a policy with a particular level of robustness. This in turn makes selecting the level of robustness a heavily heuristic exercise. Toward addressing these issues, we study the synthesis problem for a control objective consisting of both zero-mean stochastic disturbances, and bounded adversarial disturbances entering the state and measurement under linear dynamics and quadratic cost. We show that this problem admits a linear time-invariant controller that has a form closely related to suboptimal <inline-formula><tex-math>${\\mathcal {H}}_{\\infty }$</tex-math></inline-formula> solutions. We also study the tradeoffs induced by optimizing the control objective in the presence of an adversary by examining how such a solution degrades controller performance in the absence of an adversary. To this end, we provide a quantitative performance–robustness tradeoff analysis in two analytically tractable cases: state feedback control and state prediction. In these special cases, we demonstrate that the severity of the tradeoff depends in an interpretable manner upon system-theoretic properties, such as the spectrum of the controllability Gramian, the spectrum of the observability Gramian, and the stability of the system. This may provide practitioners guidance for determining how much robustness to incorporate based on a priori system knowledge, and conversely how to design systems where the tradeoff is less severe. We empirically validate our results by comparing the performance of the controller against standard baselines, and plotting performance–robustness tradeoff curves.","PeriodicalId":13201,"journal":{"name":"IEEE Transactions on Automatic Control","volume":"70 5","pages":"3133-3148"},"PeriodicalIF":7.0000,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Automatic Control","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10746314/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Efforts by the reinforcement learning community to close the sim-to-real gap have resulted in policy optimization objectives, which are distinct from, although related to, existing objectives in robust control, such as ${\mathcal {H}}_{\infty }$ methods. The disparity from the familiar control methods makes it challenging to make rigorous claims about these methods, and to predict the implications on performance of training a policy with a particular level of robustness. This in turn makes selecting the level of robustness a heavily heuristic exercise. Toward addressing these issues, we study the synthesis problem for a control objective consisting of both zero-mean stochastic disturbances, and bounded adversarial disturbances entering the state and measurement under linear dynamics and quadratic cost. We show that this problem admits a linear time-invariant controller that has a form closely related to suboptimal ${\mathcal {H}}_{\infty }$ solutions. We also study the tradeoffs induced by optimizing the control objective in the presence of an adversary by examining how such a solution degrades controller performance in the absence of an adversary. To this end, we provide a quantitative performance–robustness tradeoff analysis in two analytically tractable cases: state feedback control and state prediction. In these special cases, we demonstrate that the severity of the tradeoff depends in an interpretable manner upon system-theoretic properties, such as the spectrum of the controllability Gramian, the spectrum of the observability Gramian, and the stability of the system. This may provide practitioners guidance for determining how much robustness to incorporate based on a priori system knowledge, and conversely how to design systems where the tradeoff is less severe. We empirically validate our results by comparing the performance of the controller against standard baselines, and plotting performance–robustness tradeoff curves.
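
The abstract ties the severity of the performance–robustness tradeoff to system-theoretic properties: the spectrum of the controllability Gramian, the spectrum of the observability Gramian, and the stability of the system. As an illustration only (this is not the paper's code, and the matrices A, B, C below are hypothetical), the following sketch computes these quantities for a discrete-time LTI system using SciPy's discrete Lyapunov solver:

```python
# Minimal sketch: inspect stability and Gramian spectra for a discrete-time
# LTI system x_{k+1} = A x_k + B u_k, y_k = C x_k. Example matrices are
# made up for illustration; they are not from the paper.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.9, 0.2],
              [0.0, 0.7]])   # hypothetical stable dynamics
B = np.array([[1.0],
              [0.5]])        # hypothetical input matrix
C = np.array([[1.0, 0.0]])   # hypothetical measurement matrix

# Stability: spectral radius of A (must be < 1 for the Gramians below to exist).
rho = max(abs(np.linalg.eigvals(A)))

# Controllability Gramian: solves A Wc A^T - Wc + B B^T = 0.
Wc = solve_discrete_lyapunov(A, B @ B.T)
# Observability Gramian: solves A^T Wo A - Wo + C^T C = 0.
Wo = solve_discrete_lyapunov(A.T, C.T @ C)

print("spectral radius of A:", rho)
print("controllability Gramian eigenvalues:", np.sort(np.linalg.eigvalsh(Wc)))
print("observability Gramian eigenvalues:", np.sort(np.linalg.eigvalsh(Wo)))
```

Inspecting these quantities is the kind of a priori system knowledge the abstract suggests can guide how much robustness to incorporate.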
Source Journal

IEEE Transactions on Automatic Control (Engineering & Technology – Engineering: Electrical & Electronic)
CiteScore: 11.30
Self-citation rate: 5.90%
Articles published: 824
Review time: 9 months

Journal description: In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered: 1) Papers: Presentation of significant research, development, or application of control concepts. 2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions. In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.