Response probability distribution estimation of expensive computer simulators: A Bayesian active learning perspective using Gaussian process regression

arXiv - STAT - Computation | Pub Date: 2024-08-31 | DOI: arxiv-2409.00407

Chao Dang, Marcos A. Valdebenito, Nataly A. Manque, Jun Xu, Matthias G. R. Faes


Citations: 0

Abstract

Estimating the response probability distributions of computer simulators in the presence of randomness is a crucial task in many fields. However, achieving this task with guaranteed accuracy remains an open computational challenge, especially for expensive-to-evaluate computer simulators. This work presents a Bayesian active learning perspective on the challenge, based on Gaussian process (GP) regression. First, estimation of the response probability distributions is conceptually interpreted as a Bayesian inference problem, as opposed to frequentist inference. This interpretation provides several important benefits: (1) it quantifies and propagates discretization error probabilistically; (2) it incorporates prior knowledge of the computer simulator; and (3) it enables the effective reduction of numerical uncertainty in the solution to a prescribed level. The conceptual Bayesian idea is then realized using GP regression: the posterior statistics of the response probability distributions are derived in semi-analytical form, and a numerical solution scheme is also provided. Based on this practical Bayesian approach, a Bayesian active learning (BAL) method is further proposed for estimating the response probability distributions. The key contribution here is the development of two crucial components for active learning, namely a stopping criterion and a learning function, both of which take advantage of the posterior statistics. Five numerical examples demonstrate empirically that the proposed BAL method can efficiently estimate the response probability distributions with the desired accuracy.
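To make the workflow in the abstract concrete, the following is a minimal sketch of a GP-based active learning loop for a response CDF. It is not the authors' BAL method: the abstract does not specify the learning function or stopping criterion, so both are stand-ins chosen for illustration (largest GP predictive standard deviation, and a threshold on the posterior standard deviation of the CDF). The posterior statistics are approximated by sampling GP paths rather than by the semi-analytical expressions the paper derives. The simulator, the standard normal input, the fixed kernel hyperparameters, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length=0.5, var=1.0):
    """Squared-exponential kernel between 1-D arrays (fixed, assumed hyperparameters)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, jitter=1e-6):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, rbf_kernel(x_test, x_test) - v.T @ v

def simulator(x):
    """Hypothetical cheap stand-in for an expensive simulator."""
    return np.sin(3.0 * x) + 0.5 * x

def cdf_posterior(x_train, y_train, x_mc, y_grid, n_paths, rng):
    """Approximate posterior mean/std of the response CDF by sampling GP paths."""
    mean, cov = gp_posterior(x_train, y_train, x_mc)
    # Eigen-decomposition with clipping tolerates tiny negative eigenvalues.
    w, V = np.linalg.eigh(cov)
    root = V * np.sqrt(np.clip(w, 0.0, None))
    paths = mean + rng.standard_normal((n_paths, len(x_mc))) @ root.T
    # Each sampled path yields one empirical CDF evaluated on y_grid.
    cdfs = (paths[:, :, None] <= y_grid[None, None, :]).mean(axis=1)
    pred_std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return cdfs.mean(axis=0), cdfs.std(axis=0), pred_std

def estimate_cdf(n_init=4, max_evals=25, tol=0.01, n_paths=200, seed=0):
    rng = np.random.default_rng(seed)
    x_mc = rng.standard_normal(300)        # samples of the random input, X ~ N(0, 1)
    y_grid = np.linspace(-3.0, 3.0, 41)    # response levels at which the CDF is estimated
    x_train = np.linspace(-2.0, 2.0, n_init)
    y_train = simulator(x_train)
    while True:
        cdf_mean, cdf_std, pred_std = cdf_posterior(
            x_train, y_train, x_mc, y_grid, n_paths, rng)
        # Stand-in stopping criterion: CDF posterior std below tol everywhere.
        if cdf_std.max() < tol or len(x_train) >= max_evals:
            break
        # Stand-in learning function: input sample with largest predictive std.
        x_new = x_mc[np.argmax(pred_std)]
        x_train = np.append(x_train, x_new)
        y_train = np.append(y_train, simulator(x_new))
    return y_grid, cdf_mean, cdf_std, len(x_train)
```

Calling `estimate_cdf()` returns the response grid, the posterior mean and standard deviation of the CDF on that grid, and the number of simulator evaluations used. The posterior standard deviation plays the role the abstract assigns to numerical uncertainty: it shrinks as evaluations are added, and the loop stops once it falls below the prescribed level or the evaluation budget is exhausted.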
