Learning Lipschitz Feedback Policies From Expert Demonstrations: Closed-Loop Guarantees, Robustness and Generalization

Abed AlRahman Al Makdah, Vishaal Krishnan, Fabio Pasqualetti
{"title":"从专家演示中学习Lipschitz反馈策略:闭环保证、鲁棒性和泛化","authors":"Abed AlRahman Al Makdah;Vishaal Krishnan;Fabio Pasqualetti","doi":"10.1109/OJCSYS.2022.3181584","DOIUrl":null,"url":null,"abstract":"In this work, we propose a framework in which we use a Lipschitz-constrained loss minimization scheme to learn feedback control policies with guarantees on closed-loop stability, adversarial robustness, and generalization. These policies are learned directly from expert demonstrations, contained in a dataset of state-control input pairs, without any prior knowledge of the task and system model. Our analysis exploits the Lipschitz property of the learned policies to obtain closed-loop guarantees on stability, adversarial robustness, and generalization over scenarios unexplored by the expert. In particular, first, we establish robust closed-loop stability under the learned control policy, where we provide guarantees that the closed-loop trajectory under the learned policy stays within a bounded region around the expert trajectory and converges asymptotically to a bounded region around the origin. Second, we derive bounds on the closed-loop regret with respect to the expert policy and on the deterioration of the closed-loop performance under bounded (adversarial) disturbances to the state measurements. These bounds provide certificates for closed-loop performance and adversarial robustness for learned policies. Third, we derive a (probabilistic) bound on generalization error for the learned policies. Numerical results validate our analysis and demonstrate the effectiveness of our robust feedback policy learning framework. Finally, our results support the existence of a potential tradeoff between nominal closed-loop performance and adversarial robustness, and that improvements in nominal closed-loop performance can only be made at the expense of robustness to adversarial perturbations.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"85-99"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09798865.pdf","citationCount":"0","resultStr":"{\"title\":\"Learning Lipschitz Feedback Policies From Expert Demonstrations: Closed-Loop Guarantees, Robustness and Generalization\",\"authors\":\"Abed AlRahman Al Makdah;Vishaal Krishnan;Fabio Pasqualetti\",\"doi\":\"10.1109/OJCSYS.2022.3181584\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this work, we propose a framework in which we use a Lipschitz-constrained loss minimization scheme to learn feedback control policies with guarantees on closed-loop stability, adversarial robustness, and generalization. These policies are learned directly from expert demonstrations, contained in a dataset of state-control input pairs, without any prior knowledge of the task and system model. Our analysis exploits the Lipschitz property of the learned policies to obtain closed-loop guarantees on stability, adversarial robustness, and generalization over scenarios unexplored by the expert. In particular, first, we establish robust closed-loop stability under the learned control policy, where we provide guarantees that the closed-loop trajectory under the learned policy stays within a bounded region around the expert trajectory and converges asymptotically to a bounded region around the origin. 
Second, we derive bounds on the closed-loop regret with respect to the expert policy and on the deterioration of the closed-loop performance under bounded (adversarial) disturbances to the state measurements. These bounds provide certificates for closed-loop performance and adversarial robustness for learned policies. Third, we derive a (probabilistic) bound on generalization error for the learned policies. Numerical results validate our analysis and demonstrate the effectiveness of our robust feedback policy learning framework. Finally, our results support the existence of a potential tradeoff between nominal closed-loop performance and adversarial robustness, and that improvements in nominal closed-loop performance can only be made at the expense of robustness to adversarial perturbations.\",\"PeriodicalId\":73299,\"journal\":{\"name\":\"IEEE open journal of control systems\",\"volume\":\"1 \",\"pages\":\"85-99\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/iel7/9552933/9683993/09798865.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE open journal of control systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/9798865/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE open journal of control systems","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/9798865/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this work, we propose a framework in which we use a Lipschitz-constrained loss minimization scheme to learn feedback control policies with guarantees on closed-loop stability, adversarial robustness, and generalization. These policies are learned directly from expert demonstrations, contained in a dataset of state-control input pairs, without any prior knowledge of the task and system model. Our analysis exploits the Lipschitz property of the learned policies to obtain closed-loop guarantees on stability, adversarial robustness, and generalization over scenarios unexplored by the expert. In particular, first, we establish robust closed-loop stability under the learned control policy, where we provide guarantees that the closed-loop trajectory under the learned policy stays within a bounded region around the expert trajectory and converges asymptotically to a bounded region around the origin. Second, we derive bounds on the closed-loop regret with respect to the expert policy and on the deterioration of the closed-loop performance under bounded (adversarial) disturbances to the state measurements. These bounds provide certificates for closed-loop performance and adversarial robustness for learned policies. Third, we derive a (probabilistic) bound on generalization error for the learned policies. Numerical results validate our analysis and demonstrate the effectiveness of our robust feedback policy learning framework. Finally, our results support the existence of a potential tradeoff between nominal closed-loop performance and adversarial robustness, and that improvements in nominal closed-loop performance can only be made at the expense of robustness to adversarial perturbations.
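The abstract names the scheme but not its implementation, so the following is a minimal sketch of what Lipschitz-constrained loss minimization over a dataset of state-control input pairs can look like. It assumes a linear policy u = K x, whose Lipschitz constant is the spectral norm ||K||_2, and enforces the bound by projected gradient descent; the function name, the toy expert gain, and all hyperparameters are illustrative and not taken from the paper.

```python
# A minimal sketch of Lipschitz-constrained imitation learning; NOT the
# authors' scheme. Assumptions: a linear policy u = K x, whose Lipschitz
# constant equals the spectral norm ||K||_2, with the constraint enforced
# by projected gradient descent (singular-value clipping).
import numpy as np

def learn_lipschitz_policy(X, U, lip_bound, lr=1e-2, iters=2000):
    """Fit K minimizing (1/N) * sum_i ||K x_i - u_i||^2 s.t. ||K||_2 <= lip_bound.

    X: (N, n) expert states;  U: (N, m) expert control inputs.
    """
    N, n = X.shape
    K = np.zeros((U.shape[1], n))
    for _ in range(iters):
        grad = 2.0 * (K @ X.T - U.T) @ X / N   # gradient of the imitation loss
        K = K - lr * grad
        # Project onto {K : ||K||_2 <= lip_bound} by clipping singular values.
        W, s, Vt = np.linalg.svd(K, full_matrices=False)
        K = W @ np.diag(np.minimum(s, lip_bound)) @ Vt
    return K

# Toy demonstration data from a hypothetical expert u = K_exp x.
rng = np.random.default_rng(0)
K_exp = np.array([[-1.0, -0.5]])              # illustrative expert gain
X = rng.normal(size=(200, 2))                 # expert states
U = X @ K_exp.T                               # expert control inputs
K = learn_lipschitz_policy(X, U, lip_bound=1.0)
print("Lipschitz constant of learned policy:",
      np.linalg.svd(K, compute_uv=False)[0])  # <= 1.0 by construction
```

Singular-value clipping is one standard projection onto a spectral-norm ball; the paper's actual constraint mechanism and policy class may well differ.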
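The closed-loop claims (the learned policy's trajectory stays in a bounded region around the expert's and converges to a region around the origin, and performance degrades gracefully under bounded measurement disturbances) can be probed numerically. The sketch below assumes a hypothetical discrete-time linear system x+ = A x + B u purely for simulation (the learning step itself used no model) and approximates an adversary by random measurement perturbations of norm eps.

```python
# An illustrative closed-loop check of the guarantees, not a certificate.
# Assumptions: a hypothetical double-integrator-like system x+ = A x + B u
# (the learning step above needed no such model), the gain produced by the
# previous sketch, and adversarial measurement noise approximated by random
# perturbations of norm exactly eps.
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])                    # illustrative system matrices
B = np.array([[0.0],
              [0.1]])
K = np.array([[-0.894, -0.447]])              # ~ learned gain from the sketch above
rng = np.random.default_rng(1)

def rollout(K, x0, steps=300, eps=0.0):
    """Simulate x+ = A x + B K (x + d) with ||d||_2 <= eps; return trajectory."""
    x, traj = x0.copy(), [x0.copy()]
    for _ in range(steps):
        d = np.zeros(2)
        if eps > 0.0:                         # bounded disturbance on the measurement
            d = rng.normal(size=2)
            d *= eps / np.linalg.norm(d)
        x = A @ x + B @ (K @ (x + d))
        traj.append(x.copy())
    return np.array(traj)

x0 = np.array([1.0, 0.0])
nominal = rollout(K, x0)                      # converges toward the origin
perturbed = rollout(K, x0, eps=0.1)           # stays in a band around nominal
print("||x_T|| nominal:", np.linalg.norm(nominal[-1]))
print("max deviation under perturbation:",
      np.linalg.norm(perturbed - nominal, axis=1).max())
```

Sweeping lip_bound in the first sketch against eps here is one simple way to observe the tradeoff the abstract describes: a tighter Lipschitz bound typically costs nominal imitation accuracy while limiting how far bounded measurement perturbations can move the closed-loop trajectory.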