A. Litvinenko, Y. Marzouk, H. Matthies, M. Scavino, Alessio Spantini
DOI: 10.1002/nla.2467 (https://doi.org/10.1002/nla.2467). Published 2022-09-06.
Computing f‐divergences and distances of high‐dimensional probability density functions
In uncertainty quantification and data analysis, one frequently has to deal with high‐dimensional random variables. The interest is mainly in computing characterizations such as the entropy, the Kullback–Leibler divergence, more general $f$‐divergences, or other characteristics based on the probability density. The density is often not available directly, and merely representing it in a numerically feasible fashion is a computational challenge when the dimension is even moderately large. Computing these characteristics in the high‐dimensional case is an even greater numerical challenge. It is therefore proposed to approximate the discretized density in a compressed form, in particular by a low‐rank tensor. This approximation can alternatively be obtained from the corresponding probability characteristic function, or from more general representations of the underlying random variable. The characterizations mentioned require point‐wise functions such as the logarithm. This normally rather trivial task becomes computationally difficult when the density is approximated in a compressed (low‐rank) tensor format, as the point values are not directly accessible. The computations become possible by considering the compressed data as an element of an associative, commutative algebra with an inner product, and using matrix algorithms to accomplish these tasks. Representing the density as a low‐rank element of a high‐order tensor space reduces the computational complexity and storage cost from exponential in the dimension to almost linear.
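The algebraic view described in the abstract can be illustrated with a minimal sketch. It assumes the canonical polyadic (CP) format as the low-rank representation; the abstract does not fix a particular format, and the function names (`cp_full`, `cp_hadamard`, `cp_inner`, `cp_dist_sq`) are hypothetical, chosen for this illustration only. The sketch shows that the point-wise (Hadamard) product and the inner product can be evaluated from the factor matrices alone, so the full array with n^d entries is never formed, and that a squared Euclidean distance between two discretized densities follows from inner products. It does not implement the point-wise logarithm or any rank truncation, which a practical method would need.

```python
import numpy as np

def cp_full(factors):
    """Expand a CP tensor (list of n_k x r factor matrices) to a dense array.
    Only used for checking; it costs prod(n_k) storage."""
    r = factors[0].shape[1]
    full = np.zeros([F.shape[0] for F in factors])
    for j in range(r):
        term = factors[0][:, j]
        for F in factors[1:]:
            term = np.multiply.outer(term, F[:, j])
        full += term
    return full

def cp_hadamard(a, b):
    """Point-wise product of two CP tensors, computed factor by factor.
    The result has rank r_a * r_b (no truncation is applied here)."""
    out = []
    for A, B in zip(a, b):
        # Column (i, j) of the new factor is A[:, i] * B[:, j].
        out.append((A[:, :, None] * B[:, None, :]).reshape(A.shape[0], -1))
    return out

def cp_inner(a, b):
    """Euclidean inner product <a, b> from small r_a x r_b Gram matrices."""
    gram = np.ones((a[0].shape[1], b[0].shape[1]))
    for A, B in zip(a, b):
        gram *= A.T @ B
    return gram.sum()

def cp_dist_sq(a, b):
    """Squared Euclidean distance ||a - b||^2 using inner products only."""
    return cp_inner(a, a) - 2.0 * cp_inner(a, b) + cp_inner(b, b)

# Illustrative sizes (assumptions): d modes, n grid points per mode, rank r.
d, n, r = 4, 6, 2
rng = np.random.default_rng(0)
p = [rng.random((n, r)) for _ in range(d)]
q = [rng.random((n, r)) for _ in range(d)]

print("dense entries:", n ** d, "CP entries:", d * n * r)
print("agreement with dense product:",
      np.allclose(cp_full(cp_hadamard(p, q)), cp_full(p) * cp_full(q)))
print("squared distance:", cp_dist_sq(p, q))
```

The storage count printed by the sketch contrasts n^d entries for the dense array with d·n·r entries for the factors, which is the exponential-versus-linear scaling the abstract refers to. Because the rank multiplies under the Hadamard product, repeated products would need a rank-truncation step to stay cheap; that step is omitted here.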