A Generalizable Physics-informed Learning Framework for Risk Probability Estimation

Conference on Learning for Dynamics & Control | Pub Date: 2023-05-10 | DOI: 10.48550/arXiv.2305.06432

Zhuoyuan Wang, Yorie Nakahira

Citations: 0

Abstract

Accurate estimates of long-term risk probabilities and their gradients are critical for many stochastic safe control methods. However, computing such risk probabilities in real time and in unseen or changing environments is challenging. Monte Carlo (MC) methods cannot accurately evaluate the probabilities and their gradients, because an infinitesimal divisor can amplify the sampling noise. In this paper, we develop an efficient method to evaluate the probabilities of long-term risk and their gradients. The proposed method exploits the fact that the long-term risk probability satisfies certain partial differential equations (PDEs), which characterize the neighboring relations between the probabilities, to integrate MC methods and physics-informed neural networks. We provide theoretical guarantees on the estimation error given certain choices of training configurations. Numerical results show that the proposed method has better sample efficiency, generalizes well to unseen regions, and can adapt to systems with changing parameters. The proposed method can also accurately estimate the gradients of risk probabilities, which enables first- and second-order techniques on risk probabilities to be used for learning and control.
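The two observations in the abstract (risk probabilities obey a PDE, and MC gradients suffer from a small divisor) can be illustrated on a toy problem. The sketch below is not the authors' system or code; it assumes a one-dimensional Brownian motion dX = sigma dW started at x below a barrier b, for which the probability of reaching b within horizon T has the closed form 2(1 - Phi((b - x) / (sigma sqrt(T)))) and satisfies the heat equation dF/dT = (sigma^2 / 2) d^2F/dx^2. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def exact_risk(x, T, b=1.0, sigma=1.0):
    """Closed-form P(max_{t<=T} X_t >= b) for dX = sigma dW with X_0 = x < b."""
    z = (b - x) / (sigma * math.sqrt(T))
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


def exact_grad(x, T, b=1.0, sigma=1.0):
    """Derivative of exact_risk with respect to the initial state x."""
    s = sigma * math.sqrt(T)
    z = (b - x) / s
    return 2.0 * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))


def pde_residual(x, T, b=1.0, sigma=1.0, eps=1e-3):
    """Finite-difference check of dF/dT - (sigma^2 / 2) d2F/dx2 on the closed form."""
    def f(xx, tt):
        return exact_risk(xx, tt, b, sigma)

    dF_dT = (f(x, T + eps) - f(x, T - eps)) / (2.0 * eps)
    d2F_dx2 = (f(x + eps, T) - 2.0 * f(x, T) + f(x - eps, T)) / eps ** 2
    return dF_dT - 0.5 * sigma ** 2 * d2F_dx2


def mc_risk(x, T, n_paths, rng, b=1.0, sigma=1.0, n_steps=100):
    """Plain MC estimate using Euler steps; discrete monitoring biases it low."""
    step = sigma * math.sqrt(T / n_steps)
    hits = 0
    for _ in range(n_paths):
        pos = x
        for _ in range(n_steps):
            pos += step * rng.gauss(0.0, 1.0)
            if pos >= b:  # barrier reached: count the path and stop simulating it
                hits += 1
                break
    return hits / n_paths


def fd_noise_std(p, n_paths, h):
    """Approximate std of a central difference (p(x+h) - p(x-h)) / (2h) built
    from two independent MC estimates, taking both probabilities close to p."""
    return math.sqrt(2.0 * p * (1.0 - p) / n_paths) / (2.0 * h)


if __name__ == "__main__":
    rng = random.Random(0)
    p_true = exact_risk(0.0, 1.0)
    print("exact risk:", p_true)
    print("MC risk:", mc_risk(0.0, 1.0, 2000, rng))
    print("PDE residual of closed form:", pde_residual(0.0, 1.0))
    print("exact gradient:", exact_grad(0.0, 1.0))
    for h in (0.1, 0.01, 0.001):
        print("h =", h, "noise std of FD gradient:", fd_noise_std(p_true, 2000, h))
```

In this toy setting the sampling noise of the finite-difference gradient grows like 1/h as the step h shrinks, while the PDE residual of the true risk probability stays near zero. A physics-informed approach in the spirit of the abstract would fit a function to MC samples while also penalizing such a residual; that training loop is deliberately left out of this sketch.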
