{"title":"流形上的概率学习","authors":"Christian Soize, R. Ghanem","doi":"10.3934/fods.2020013","DOIUrl":null,"url":null,"abstract":"This paper presents mathematical results in support of the methodology of the probabilistic learning on manifolds (PLoM) recently introduced by the authors, which has been used with success for analyzing complex engineering systems. The PLoM considers a given initial dataset constituted of a small number of points given in an Euclidean space, which are interpreted as independent realizations of a vector-valued random variable for which its non-Gaussian probability measure is unknown but is, \\textit{a priori}, concentrated in an unknown subset of the Euclidean space. The objective is to construct a learned dataset constituted of additional realizations that allow the evaluation of converged statistics. A transport of the probability measure estimated with the initial dataset is done through a linear transformation constructed using a reduced-order diffusion-maps basis. In this paper, it is proven that this transported measure is a marginal distribution of the invariant measure of a reduced-order Ito stochastic differential equation that corresponds to a dissipative Hamiltonian dynamical system. This construction allows for preserving the concentration of the probability measure. This property is shown by analyzing a distance between the random matrix constructed with the PLoM and the matrix representing the initial dataset, as a function of the dimension of the basis. It is further proven that this distance has a minimum for a dimension of the reduced-order diffusion-maps basis that is strictly smaller than the number of points in the initial dataset. Finally, a brief numerical application illustrates the mathematical results.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2020-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"Probabilistic learning on manifolds\",\"authors\":\"Christian Soize, R. Ghanem\",\"doi\":\"10.3934/fods.2020013\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents mathematical results in support of the methodology of the probabilistic learning on manifolds (PLoM) recently introduced by the authors, which has been used with success for analyzing complex engineering systems. The PLoM considers a given initial dataset constituted of a small number of points given in an Euclidean space, which are interpreted as independent realizations of a vector-valued random variable for which its non-Gaussian probability measure is unknown but is, \\\\textit{a priori}, concentrated in an unknown subset of the Euclidean space. The objective is to construct a learned dataset constituted of additional realizations that allow the evaluation of converged statistics. A transport of the probability measure estimated with the initial dataset is done through a linear transformation constructed using a reduced-order diffusion-maps basis. In this paper, it is proven that this transported measure is a marginal distribution of the invariant measure of a reduced-order Ito stochastic differential equation that corresponds to a dissipative Hamiltonian dynamical system. This construction allows for preserving the concentration of the probability measure. This property is shown by analyzing a distance between the random matrix constructed with the PLoM and the matrix representing the initial dataset, as a function of the dimension of the basis. It is further proven that this distance has a minimum for a dimension of the reduced-order diffusion-maps basis that is strictly smaller than the number of points in the initial dataset. Finally, a brief numerical application illustrates the mathematical results.\",\"PeriodicalId\":73054,\"journal\":{\"name\":\"Foundations of data science (Springfield, Mo.)\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2020-02-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Foundations of data science (Springfield, Mo.)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3934/fods.2020013\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Foundations of data science (Springfield, Mo.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3934/fods.2020013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
This paper presents mathematical results in support of the methodology of the probabilistic learning on manifolds (PLoM) recently introduced by the authors, which has been used with success for analyzing complex engineering systems. The PLoM considers a given initial dataset constituted of a small number of points given in an Euclidean space, which are interpreted as independent realizations of a vector-valued random variable for which its non-Gaussian probability measure is unknown but is, \textit{a priori}, concentrated in an unknown subset of the Euclidean space. The objective is to construct a learned dataset constituted of additional realizations that allow the evaluation of converged statistics. A transport of the probability measure estimated with the initial dataset is done through a linear transformation constructed using a reduced-order diffusion-maps basis. In this paper, it is proven that this transported measure is a marginal distribution of the invariant measure of a reduced-order Ito stochastic differential equation that corresponds to a dissipative Hamiltonian dynamical system. This construction allows for preserving the concentration of the probability measure. This property is shown by analyzing a distance between the random matrix constructed with the PLoM and the matrix representing the initial dataset, as a function of the dimension of the basis. It is further proven that this distance has a minimum for a dimension of the reduced-order diffusion-maps basis that is strictly smaller than the number of points in the initial dataset. Finally, a brief numerical application illustrates the mathematical results.