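Strategies to Enhance the Performance of Gaussian Mixture Model Fitting for Uncertainty Quantification by Conditioning to Production Data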

Day 1 Tue, October 26, 2021 Pub Date : 2021-10-19 DOI:10.2118/204008-ms

G. Gao, J. Vink, F. Saaf, T. Wells



Abstract



When history matching is formulated within the Bayesian framework, we may quantify the uncertainty of model parameters and production forecasts using conditional realizations sampled from the posterior probability density function (PDF). Sampling such a posterior PDF is quite challenging. Some methods, e.g., Markov chain Monte Carlo (MCMC), are very expensive, while others are cheaper but may generate biased samples. In this paper, we propose an unconstrained Gaussian Mixture Model (GMM) fitting method to approximate the posterior PDF and investigate new strategies to further enhance its performance.

To reduce the CPU time spent handling bound constraints, we reformulate the GMM fitting problem so that an unconstrained optimization algorithm can be applied to find the optimal values of the unknown GMM parameters. To obtain a sufficiently accurate GMM approximation with the lowest number of Gaussian components, we generate random initial guesses, remove components with very small or very large mixture weights after each GMM fitting iteration, and prevent their reappearance using a dedicated filter. To prevent overfitting, we add a new Gaussian component only if the quality of the GMM approximation on a (large) set of blind-test data improves sufficiently.

The unconstrained GMM fitting method with the new strategies proposed in this paper is validated using nonlinear toy problems and then applied to a synthetic history matching example. It can construct a GMM approximation of the posterior PDF that is comparable to the one obtained with MCMC, and it is significantly more efficient than the constrained GMM fitting formulation, e.g., reducing the CPU time by a factor of 800 to 7300 for the problems we tested, which makes it quite attractive for large-scale history matching problems.
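The abstract does not say how the bound constraints are eliminated. As a minimal sketch of one common way to make GMM parameters unconstrained, and not necessarily the authors' formulation, the mixture weights can be written as a softmax of free logits and each covariance as a Cholesky factor whose diagonal is exponentiated. The function name `unpack` and the layout of the parameter vector `theta` are hypothetical choices made only for this illustration.

```python
import numpy as np


def unpack(theta, n_comp, dim):
    """Map an unconstrained real vector to valid GMM parameters.

    Assumed (hypothetical) layout of theta:
    [n_comp weight logits | n_comp*dim means | n_comp*dim*(dim+1)/2 Cholesky entries]
    """
    n_tri = dim * (dim + 1) // 2
    n_mean = n_comp * dim
    logits = theta[:n_comp]
    means = theta[n_comp:n_comp + n_mean].reshape(n_comp, dim)
    tri = theta[n_comp + n_mean:].reshape(n_comp, n_tri)

    # Softmax: any real logits give weights that are positive and sum to 1.
    w = np.exp(logits - logits.max())
    w /= w.sum()

    # Lower-triangular factor with exp() on the diagonal, so that
    # L @ L.T is symmetric positive definite for any real input.
    rows, cols = np.tril_indices(dim)
    covs = np.empty((n_comp, dim, dim))
    for k in range(n_comp):
        L = np.zeros((dim, dim))
        L[rows, cols] = tri[k]
        L[np.diag_indices(dim)] = np.exp(np.diag(L))
        covs[k] = L @ L.T
    return w, means, covs
```

With such a mapping, any vector `theta` corresponds to a valid mixture, so a generic unconstrained optimizer can search over `theta` directly without bound handling.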
