Novel approaches for hyper-parameter tuning of physics-informed Gaussian processes: application to parametric PDEs

Engineering with Computers | IF 8.7 | Zone 2 (Engineering & Technology) | Q1 (Mathematics) | Pub Date: 2024-04-08 | DOI: 10.1007/s00366-024-01970-8

Masoud Ezati, Mohsen Esmaeilbeigi, Ahmad Kamandi


Citations: 0

Abstract



Novel approaches for hyper-parameter tuning of physics-informed Gaussian processes: application to parametric PDEs

Physics-informed machine learning (PIML) methods are now among the effective, highly flexible tools for solving inverse problems and operational equations. Among them, the physics-informed learning model built upon Gaussian processes (PIGP) holds a special place because it provides the posterior probability distribution of its predictions within the framework of Bayesian inference. In this method, the training phase that determines the optimal hyper-parameters is equivalent to optimizing a non-convex function called the likelihood function. Because the gradient is available in explicit form, conjugate gradient (CG) optimization algorithms are recommended. In addition, because each evaluation of the likelihood function requires computing the determinant and inverse of the covariance matrix, the CG method should be set up so that the optimization completes in as few evaluations as possible. Previous studies considered only one special form of the CG method, which is naturally not very efficient. This paper studies the efficiency of CG methods for optimizing the likelihood function in PIGP. Numerical simulations show that the initial step length and the search direction in CG methods have a significant effect on the number of likelihood evaluations and consequently on the efficiency of PIGP. Based on the specific characteristics of the objective function in this problem, two modifications to the traditional CG methods are proposed: normalizing the initial step length to avoid getting stuck at ill-conditioned points, and improving the search direction by means of an angle condition to guarantee global convergence. Numerical simulations of seven improved CG methods, with four different angles in the angle condition and three different initial step lengths, show that the proposed modifications significantly reduce the number of iterations and the number of evaluations across the different types of CG methods. This significantly increases the efficiency of the PIGP method; in particular, the improved algorithms perform well when the traditional CG algorithms fail in the optimization process. Finally, so that the studies carried out in this paper can be applied to other parametric equations, the compiled package containing the methods used in this paper is attached.
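The two ideas named in the abstract, a normalized initial step length and an angle condition on the search direction, can be illustrated with a small sketch. This is not the authors' package or any of their seven variants: it assumes an ordinary zero-mean GP regression with an RBF kernel as a stand-in for the physics-informed kernel, a PR+ conjugate gradient update, a backtracking Armijo line search, and illustrative values for `step0` and `cos_min`. All function and parameter names are hypothetical.

```python
import numpy as np

def nlml_and_grad(log_theta, X, y):
    """Negative log marginal likelihood of a zero-mean GP (RBF kernel) and its gradient.

    log_theta holds the logs of (signal std, length scale, noise std); X is 1-D.
    """
    sf, ell, sn = np.exp(log_theta)
    n = len(y)
    d2 = (X[:, None] - X[None, :]) ** 2            # pairwise squared distances
    Kf = sf**2 * np.exp(-0.5 * d2 / ell**2)        # noise-free covariance
    K = Kf + (sn**2 + 1e-10) * np.eye(n)           # small jitter for stability
    L = np.linalg.cholesky(K)                      # the costly step of every evaluation
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))        # K^{-1} y
    f = 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)
    Kinv = np.linalg.solve(L.T, np.linalg.solve(L, np.eye(n)))
    W = Kinv - np.outer(alpha, alpha)              # gradient w.r.t. K is 0.5 * W
    dK = [2.0 * Kf, Kf * d2 / ell**2, 2.0 * sn**2 * np.eye(n)]  # dK / d log(theta_i)
    g = np.array([0.5 * np.sum(W * D) for D in dK])
    return f, g

def cg_minimize(fun, x0, step0=1.0, cos_min=1e-3, tol=1e-5, max_iter=200):
    """PR+ conjugate gradient with an angle-condition restart and a normalized first step.

    step0 and cos_min are illustrative assumptions, not values from the paper.
    Returns the final point, its objective value, and the number of evaluations.
    """
    x = np.asarray(x0, dtype=float)
    f, g = fun(x)
    d = -g
    n_eval = 1
    for _ in range(max_iter):
        gnorm = np.linalg.norm(g)
        if gnorm < tol:
            break
        # Angle condition: the cosine between d and -g must be at least cos_min;
        # otherwise fall back to the steepest-descent direction.
        if -(g @ d) < cos_min * gnorm * np.linalg.norm(d):
            d = -g
        slope = g @ d
        t = step0 / np.linalg.norm(d)              # first trial point lies at distance step0
        while True:                                # backtracking Armijo line search
            try:
                f_new, g_new = fun(x + t * d)
            except np.linalg.LinAlgError:          # covariance not positive definite here
                f_new, g_new = np.inf, None
            n_eval += 1
            if f_new <= f + 1e-4 * t * slope or t < 1e-12:
                break
            t *= 0.5
        if g_new is None or not np.isfinite(f_new) or f_new > f:
            break                                  # line search failed; stop
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # PR+ coefficient
        x, f = x + t * d, f_new
        d = -g_new + beta * d
        g = g_new
    return x, f, n_eval

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.linspace(0.0, 1.0, 20)
    y = np.sin(2 * np.pi * X) + 0.05 * rng.standard_normal(20)
    theta, f_opt, n_eval = cg_minimize(lambda t: nlml_and_grad(t, X, y), np.zeros(3))
    print("hyper-parameters:", np.exp(theta), "NLML:", f_opt, "evaluations:", n_eval)
```

Because every call to `nlml_and_grad` performs a Cholesky factorization, the returned `n_eval` is the quantity the abstract is concerned with. Dividing `step0` by the norm of the direction keeps the first trial point at a fixed distance in log-hyper-parameter space, which is one plausible reading of "normalizing the initial step length"; the paper's own normalization may differ.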


Source journal

Engineering with Computers (Engineering & Technology: Engineering, Mechanical)

CiteScore: 16.50
Self-citation rate: 2.30%
Articles published: 203
Review time: 9 months

Journal description: Engineering with Computers is an international journal dedicated to simulation-based engineering. It features original papers and comprehensive reviews on technologies supporting simulation-based engineering, along with demonstrations of operational simulation-based engineering systems. The journal covers technical areas such as adaptive simulation techniques, engineering databases, CAD geometry integration, mesh generation, parallel simulation methods, simulation frameworks, user interface technologies, and visualization techniques. It also covers a wide range of application areas, from the automotive industry to medical device design.