Thomas Gaskin, Marie-Therese Wolfram, Andrew Duncan, Guven Demirel
Global trade is shaped by a complex mix of factors beyond supply and demand, including tangible variables like transport costs and tariffs, as well as less quantifiable influences such as political and economic relations. Traditionally, economists model trade using gravity models, which rely on explicit covariates but often struggle to capture these subtler drivers of trade. In this work, we employ optimal transport and a deep neural network to learn a time-dependent cost function from data, without imposing a specific functional form. This approach consistently outperforms traditional gravity models in accuracy while providing natural uncertainty quantification. Applying our framework to global food and agricultural trade, we show that the global South suffered disproportionately from the war in Ukraine's impact on wheat markets. We also analyze the effects of free-trade agreements and trade disputes with China, as well as Brexit's impact on British trade with Europe, uncovering hidden patterns that trade volumes alone cannot reveal.
{"title":"Modelling Global Trade with Optimal Transport","authors":"Thomas Gaskin, Marie-Therese Wolfram, Andrew Duncan, Guven Demirel","doi":"arxiv-2409.06554","DOIUrl":"https://doi.org/arxiv-2409.06554","url":null,"abstract":"Global trade is shaped by a complex mix of factors beyond supply and demand,\u0000including tangible variables like transport costs and tariffs, as well as less\u0000quantifiable influences such as political and economic relations.\u0000Traditionally, economists model trade using gravity models, which rely on\u0000explicit covariates but often struggle to capture these subtler drivers of\u0000trade. In this work, we employ optimal transport and a deep neural network to\u0000learn a time-dependent cost function from data, without imposing a specific\u0000functional form. This approach consistently outperforms traditional gravity\u0000models in accuracy while providing natural uncertainty quantification. Applying\u0000our framework to global food and agricultural trade, we show that the global\u0000South suffered disproportionately from the war in Ukraine's impact on wheat\u0000markets. We also analyze the effects of free-trade agreements and trade\u0000disputes with China, as well as Brexit's impact on British trade with Europe,\u0000uncovering hidden patterns that trade volumes alone cannot reveal.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"58 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The quasi-potential is a key concept in stochastic systems, as it accounts for the long-term behavior of their dynamics. It also allows us to estimate mean exit times from the attractors of the system and transition rates between states. This is significant in many applications across areas such as physics, biology, ecology, and economics. The quasi-potential is often computed via a functional minimization problem that can be challenging. This paper combines a sparse learning technique with action minimization methods in order to: (i) identify the orthogonal decomposition of the deterministic vector field (drift) driving the stochastic dynamics; (ii) determine the quasi-potential from this decomposition. This decomposition of the drift vector field into its gradient and orthogonal parts is accomplished with the help of a machine learning-based sparse identification technique. Specifically, the so-called sparse identification of non-linear dynamics (SINDy) [1] is applied to the most likely trajectory in a stochastic system (the instanton) to learn the orthogonal decomposition of the drift. Consequently, the quasi-potential can be evaluated even at points outside the instanton path, allowing our method to provide the complete quasi-potential landscape from this single trajectory. Additionally, the orthogonal drift component obtained within our framework is important as a correction to the exponential decay of transition rates and exit times. We implemented the proposed approach in 2- and 3-D systems, covering various types of potential landscapes and attractors.
{"title":"Quasi-potential and drift decomposition in stochastic systems by sparse identification","authors":"Leonardo Grigorio, Mnerh Alqahtani","doi":"arxiv-2409.06886","DOIUrl":"https://doi.org/arxiv-2409.06886","url":null,"abstract":"The quasi-potential is a key concept in stochastic systems as it accounts for\u0000the long-term behavior of the dynamics of such systems. It also allows us to\u0000estimate mean exit times from the attractors of the system, and transition\u0000rates between states. This is of significance in many applications across\u0000various areas such as physics, biology, ecology, and economy. Computation of\u0000the quasi-potential is often obtained via a functional minimization problem\u0000that can be challenging. This paper combines a sparse learning technique with\u0000action minimization methods in order to: (i) Identify the orthogonal\u0000decomposition of the deterministic vector field (drift) driving the stochastic\u0000dynamics; (ii) Determine the quasi-potential from this decomposition. This\u0000decomposition of the drift vector field into its gradient and orthogonal parts\u0000is accomplished with the help of a machine learning-based sparse identification\u0000technique. Specifically, the so-called sparse identification of non-linear\u0000dynamics (SINDy) [1] is applied to the most likely trajectory in a stochastic\u0000system (instanton) to learn the orthogonal decomposition of the drift.\u0000Consequently, the quasi-potential can be evaluated even at points outside the\u0000instanton path, allowing our method to provide the complete quasi-potential\u0000landscape from this single trajectory. Additionally, the orthogonal drift\u0000component obtained within our framework is important as a correction to the\u0000exponential decay of transition rates and exit times. 
We implemented the\u0000proposed approach in 2- and 3-D systems, covering various types of potential\u0000landscapes and attractors.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
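The orthogonal decomposition at the heart of this method can be checked on a toy example where the quasi-potential is known in closed form. Below is a minimal numerical sketch (not the paper's SINDy-based procedure): a 2-D linear drift with quasi-potential V(x, y) = (x^2 + y^2)/2 plus a rotational component, where the two parts are verified to be pointwise orthogonal.

```python
import numpy as np

# Toy decomposition b = -grad V + l with grad V orthogonal to l.
# V(x, y) = (x^2 + y^2)/2; rotational part l = alpha * (-y, x).
alpha = 0.7

def drift(p):
    x, y = p
    return np.array([-x - alpha * y, -y + alpha * x])

def grad_V(p):
    return np.array(p, dtype=float)      # grad V = (x, y)

def orthogonal_part(p):
    return drift(p) + grad_V(p)          # l = b + grad V

rng = np.random.default_rng(0)
for p in rng.normal(size=(5, 2)):
    l = orthogonal_part(p)
    # the rotational component is pointwise orthogonal to grad V
    assert abs(np.dot(grad_V(p), l)) < 1e-12
```

The paper's contribution is learning this split from data along the instanton; here the split is hard-coded so the orthogonality property itself is visible.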
Seismic inversion is essential for geophysical exploration and geological assessment, but it is inherently subject to significant uncertainty. This uncertainty stems primarily from the limited information provided by observed seismic data, which is largely a result of constraints in data collection geometry. As a result, multiple plausible velocity models can often explain the same set of seismic observations. In deep learning-based seismic inversion, uncertainty arises from various sources, including data noise, neural network design and training, and inherent data limitations. This study introduces a novel approach to uncertainty quantification in seismic inversion by integrating ensemble methods with importance sampling. By combining an ensemble approach with importance sampling, we enhance the accuracy of uncertainty analysis while maintaining computational efficiency. The method involves initializing each model in the ensemble with different weights, introducing diversity into the predictions and thereby improving the robustness and reliability of the inversion outcomes. Additionally, importance sampling weights the contribution of each ensemble sample, allowing us to use a limited number of ensemble samples to obtain more accurate estimates of the posterior distribution. Our approach enables more precise quantification of uncertainty in velocity models derived from seismic data. By utilizing a limited number of ensemble samples, this method achieves an accurate and reliable assessment of uncertainty, ultimately providing greater confidence in seismic inversion results.
{"title":"Uncertainty Quantification in Seismic Inversion Through Integrated Importance Sampling and Ensemble Methods","authors":"Luping Qu, Mauricio Araya-Polo, Laurent Demanet","doi":"arxiv-2409.06840","DOIUrl":"https://doi.org/arxiv-2409.06840","url":null,"abstract":"Seismic inversion is essential for geophysical exploration and geological\u0000assessment, but it is inherently subject to significant uncertainty. This\u0000uncertainty stems primarily from the limited information provided by observed\u0000seismic data, which is largely a result of constraints in data collection\u0000geometry. As a result, multiple plausible velocity models can often explain the\u0000same set of seismic observations. In deep learning-based seismic inversion,\u0000uncertainty arises from various sources, including data noise, neural network\u0000design and training, and inherent data limitations. This study introduces a\u0000novel approach to uncertainty quantification in seismic inversion by\u0000integrating ensemble methods with importance sampling. By leveraging ensemble\u0000approach in combination with importance sampling, we enhance the accuracy of\u0000uncertainty analysis while maintaining computational efficiency. The method\u0000involves initializing each model in the ensemble with different weights,\u0000introducing diversity in predictions and thereby improving the robustness and\u0000reliability of the inversion outcomes. Additionally, the use of importance\u0000sampling weights the contribution of each ensemble sample, allowing us to use a\u0000limited number of ensemble samples to obtain more accurate estimates of the\u0000posterior distribution. Our approach enables more precise quantification of\u0000uncertainty in velocity models derived from seismic data. 
By utilizing a\u0000limited number of ensemble samples, this method achieves an accurate and\u0000reliable assessment of uncertainty, ultimately providing greater confidence in\u0000seismic inversion results.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"203 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
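The reweighting step can be illustrated with self-normalised importance sampling in one dimension. This is a schematic stand-in for the paper's method: the "ensemble" is a cloud of scalar draws from a broad proposal, and the target posterior is a Gaussian chosen purely for illustration.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(42)
proposal_mu, proposal_sigma = 0.0, 2.0     # diverse ensemble initialisation
target_mu, target_sigma = 1.0, 0.5         # illustrative "true" posterior

# ensemble members drawn from the broad proposal q
samples = rng.normal(proposal_mu, proposal_sigma, size=20_000)

# importance weights w = p(sample) / q(sample), self-normalised
w = gauss_pdf(samples, target_mu, target_sigma) / gauss_pdf(samples, proposal_mu, proposal_sigma)
w /= w.sum()

# weighted ensemble statistics approximate the posterior moments
post_mean = np.sum(w * samples)
post_var = np.sum(w * (samples - post_mean) ** 2)
assert abs(post_mean - target_mu) < 0.05
assert abs(post_var - target_sigma ** 2) < 0.05
```

In the actual inversion setting each "sample" would be a full velocity model produced by one ensemble member, with densities replaced by likelihood evaluations against the seismic data.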
Michael Scheuerer, Claudio Heinrich-Mertsching, Titike K. Bahaga, Masilin Gudoshava, Thordis L. Thorarinsdottir
Seasonal climate forecasts are commonly based on model runs from fully coupled forecasting systems that use Earth system models to represent interactions between the atmosphere, ocean, land and other Earth-system components. Recently, machine learning (ML) methods have increasingly been investigated for this task, where large-scale climate variability is linked to local or regional temperature or precipitation in a linear or non-linear fashion. This paper investigates the use of interpretable ML methods to predict seasonal precipitation for East Africa in an operational setting. Dimension reduction is performed by decomposing the precipitation fields via empirical orthogonal functions (EOFs), such that only the respective factor loadings need to be predicted. Indices of large-scale climate variability--including the rate of change in individual indices as well as interactions between different indices--are then used as potential features to obtain tercile forecasts from an interpretable ML algorithm. Several research questions regarding the use of data and the effect of model complexity are studied. The results are compared against the ECMWF seasonal forecasting system (SEAS5) for three seasons--MAM, JJAS and OND--over the period 1993-2020. Compared to climatology for the same period, the ECMWF forecasts have negative skill in MAM and JJAS and significant positive skill in OND. The ML approach is on par with climatology in MAM and JJAS and has significantly positive skill in OND, if not quite at the level of the OND ECMWF forecast.
{"title":"Applications of machine learning to predict seasonal precipitation for East Africa","authors":"Michael Scheuerer, Claudio Heinrich-Mertsching, Titike K. Bahaga, Masilin Gudoshava, Thordis L. Thorarinsdottir","doi":"arxiv-2409.06238","DOIUrl":"https://doi.org/arxiv-2409.06238","url":null,"abstract":"Seasonal climate forecasts are commonly based on model runs from fully\u0000coupled forecasting systems that use Earth system models to represent\u0000interactions between the atmosphere, ocean, land and other Earth-system\u0000components. Recently, machine learning (ML) methods are increasingly being\u0000investigated for this task where large-scale climate variability is linked to\u0000local or regional temperature or precipitation in a linear or non-linear\u0000fashion. This paper investigates the use of interpretable ML methods to predict\u0000seasonal precipitation for East Africa in an operational setting. Dimension\u0000reduction is performed by decomposing the precipitation fields via empirical\u0000orthogonal functions (EOFs), such that only the respective factor loadings need\u0000to the predicted. Indices of large-scale climate variability--including the\u0000rate of change in individual indices as well as interactions between different\u0000indices--are then used as potential features to obtain tercile forecasts from\u0000an interpretable ML algorithm. Several research questions regarding the use of\u0000data and the effect of model complexity are studied. The results are compared\u0000against the ECMWF seasonal forecasting system (SEAS5) for three seasons--MAM,\u0000JJAS and OND--over the period 1993-2020. Compared to climatology for the same\u0000period, the ECMWF forecasts have negative skill in MAM and JJAS and significant\u0000positive skill in OND. 
The ML approach is on par with climatology in MAM and\u0000JJAS and a significantly positive skill in OND, if not quite at the level of\u0000the OND ECMWF forecast.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
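The dimension-reduction step, an EOF decomposition where only the loadings remain to be predicted, is equivalent to a truncated SVD of the anomaly field. A minimal sketch on synthetic data (random numbers standing in for precipitation anomalies; the grid size and mode count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n_time, n_grid = 28, 500                 # e.g. 28 seasons x 500 grid cells
field = rng.normal(size=(n_time, n_grid))

anom = field - field.mean(axis=0)        # anomalies w.r.t. climatology
U, s, Vt = np.linalg.svd(anom, full_matrices=False)

k = 5                                    # keep the leading modes
loadings = U[:, :k] * s[:k]              # time-dependent factor loadings
eofs = Vt[:k]                            # spatial patterns (EOFs)

recon = loadings @ eofs                  # rank-k reconstruction of the field
explained = 1 - np.sum((anom - recon) ** 2) / np.sum(anom ** 2)
assert 0 < explained < 1
assert loadings.shape == (n_time, k)
```

A forecasting model then only has to predict the k loadings per season, rather than the full gridded field; multiplying predicted loadings by the fixed EOFs recovers a spatial forecast.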
Ensemble methods such as random forests (RF) have transformed the landscape of supervised learning, offering highly accurate prediction through the aggregation of multiple weak learners. However, despite their effectiveness, these methods often lack transparency, impeding users' comprehension of how RF models arrive at their predictions. Explainable ensemble trees (E2Tree) is a novel methodology for explaining random forests that provides a graphical representation of the relationship between response variables and predictors. A striking characteristic of E2Tree is that it not only accounts for the effects of predictor variables on the response but also accounts for associations between the predictor variables through the computation and use of dissimilarity measures. The E2Tree methodology was initially proposed for use in classification tasks. In this paper, we extend the methodology to encompass regression contexts. To demonstrate the explanatory power of the proposed algorithm, we illustrate its use on real-world datasets.
{"title":"Extending Explainable Ensemble Trees (E2Tree) to regression contexts","authors":"Massimo Aria, Agostino Gnasso, Carmela Iorio, Marjolein Fokkema","doi":"arxiv-2409.06439","DOIUrl":"https://doi.org/arxiv-2409.06439","url":null,"abstract":"Ensemble methods such as random forests have transformed the landscape of\u0000supervised learning, offering highly accurate prediction through the\u0000aggregation of multiple weak learners. However, despite their effectiveness,\u0000these methods often lack transparency, impeding users' comprehension of how RF\u0000models arrive at their predictions. Explainable ensemble trees (E2Tree) is a\u0000novel methodology for explaining random forests, that provides a graphical\u0000representation of the relationship between response variables and predictors. A\u0000striking characteristic of E2Tree is that it not only accounts for the effects\u0000of predictor variables on the response but also accounts for associations\u0000between the predictor variables through the computation and use of\u0000dissimilarity measures. The E2Tree methodology was initially proposed for use\u0000in classification tasks. In this paper, we extend the methodology to encompass\u0000regression contexts. To demonstrate the explanatory power of the proposed\u0000algorithm, we illustrate its use on real-world datasets.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel A. Messenger, April Tran, Vanja Dukic, David M. Bortz
The weak form is a ubiquitous, well-studied, and widely utilized mathematical tool in modern computational and applied mathematics. In this work we provide a survey of both the history and recent developments for several fields in which the weak form can play a critical role. In particular, we highlight several recent advances in weak form versions of equation learning, parameter estimation, and coarse graining, which offer surprising noise robustness, accuracy, and computational efficiency. We note that this manuscript is a companion piece to our October 2024 SIAM News article of the same name. Here we provide more detailed explanations of mathematical developments as well as a more complete list of references. Lastly, we note that the software with which to reproduce the results in this manuscript is also available on our group's GitHub website: https://github.com/MathBioCU
{"title":"The Weak Form Is Stronger Than You Think","authors":"Daniel A. Messenger, April Tran, Vanja Dukic, David M. Bortz","doi":"arxiv-2409.06751","DOIUrl":"https://doi.org/arxiv-2409.06751","url":null,"abstract":"The weak form is a ubiquitous, well-studied, and widely-utilized mathematical\u0000tool in modern computational and applied mathematics. In this work we provide a\u0000survey of both the history and recent developments for several fields in which\u0000the weak form can play a critical role. In particular, we highlight several\u0000recent advances in weak form versions of equation learning, parameter\u0000estimation, and coarse graining, which offer surprising noise robustness,\u0000accuracy, and computational efficiency. We note that this manuscript is a companion piece to our October 2024 SIAM\u0000News article of the same name. Here we provide more detailed explanations of\u0000mathematical developments as well as a more complete list of references.\u0000Lastly, we note that the software with which to reproduce the results in this\u0000manuscript is also available on our group's GitHub website\u0000https://github.com/MathBioCU .","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel M. Steinberg, Rafael Oliveira, Cheng Soon Ong, Edwin V. Bonilla
We develop variational search distributions (VSD), a method for finding discrete, combinatorial designs of a rare desired class in a batch sequential manner with a fixed experimental budget. We formalize the requirements and desiderata for this problem and formulate a solution via variational inference that fulfills them. In particular, VSD uses off-the-shelf gradient-based optimization routines, and can take advantage of scalable predictive models. We show that VSD can outperform existing baseline methods on a set of real sequence-design problems in various biological systems.
{"title":"Variational Search Distributions","authors":"Daniel M. Steinberg, Rafael Oliveira, Cheng Soon Ong, Edwin V. Bonilla","doi":"arxiv-2409.06142","DOIUrl":"https://doi.org/arxiv-2409.06142","url":null,"abstract":"We develop variational search distributions (VSD), a method for finding\u0000discrete, combinatorial designs of a rare desired class in a batch sequential\u0000manner with a fixed experimental budget. We formalize the requirements and\u0000desiderata for this problem and formulate a solution via variational inference\u0000that fulfill these. In particular, VSD uses off-the-shelf gradient based\u0000optimization routines, and can take advantage of scalable predictive models. We\u0000show that VSD can outperform existing baseline methods on a set of real\u0000sequence-design problems in various biological systems.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network inference, the task of reconstructing interactions in a complex system from experimental observables, is a central yet extremely challenging problem in systems biology. While much progress has been made in the last two decades, network inference remains an open problem. For systems observed at steady state, limited insights are available, since temporal information is unavailable and thus causal information is lost. Two common avenues for gaining causal insights into system behaviour are to leverage temporal dynamics in the form of trajectories, and to apply interventions such as knock-out perturbations. We propose an approach for leveraging both dynamical and perturbational single cell data to jointly learn cellular trajectories and to power network inference. Our approach is motivated by min-entropy estimation for stochastic dynamics and can infer directed and signed networks from time-stamped single cell snapshots.
{"title":"Joint trajectory and network inference via reference fitting","authors":"Stephen Y Zhang","doi":"arxiv-2409.06879","DOIUrl":"https://doi.org/arxiv-2409.06879","url":null,"abstract":"Network inference, the task of reconstructing interactions in a complex\u0000system from experimental observables, is a central yet extremely challenging\u0000problem in systems biology. While much progress has been made in the last two\u0000decades, network inference remains an open problem. For systems observed at\u0000steady state, limited insights are available since temporal information is\u0000unavailable and thus causal information is lost. Two common avenues for gaining\u0000causal insights into system behaviour are to leverage temporal dynamics in the\u0000form of trajectories, and to apply interventions such as knock-out\u0000perturbations. We propose an approach for leveraging both dynamical and\u0000perturbational single cell data to jointly learn cellular trajectories and\u0000power network inference. Our approach is motivated by min-entropy estimation\u0000for stochastic dynamics and can infer directed and signed networks from\u0000time-stamped single cell snapshots.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alex Glyn-Davies, Arnaud Vadeboncoeur, O. Deniz Akyildiz, Ieva Kazlauskaite, Mark Girolami
Variational inference (VI) is a computationally efficient and scalable methodology for approximate Bayesian inference. It strikes a balance between accuracy of uncertainty quantification and practical tractability. It excels at generative modelling and inversion tasks due to its built-in Bayesian regularisation and flexibility, essential qualities for physics-related problems. The central learning objective of VI must often be tailored to new learning tasks, where the nature of the problem dictates the conditional dependence between the variables of interest, as commonly arises in physics problems. In this paper, we provide an accessible and thorough technical introduction to VI for forward and inverse problems, guiding the reader through standard derivations of the VI framework and how it can best be realized through deep learning. We then review and unify recent literature exemplifying the creative flexibility allowed by VI. This paper is designed for a general scientific audience looking to solve physics-based problems with an emphasis on uncertainty quantification.
{"title":"A Primer on Variational Inference for Physics-Informed Deep Generative Modelling","authors":"Alex Glyn-Davies, Arnaud Vadeboncoeur, O. Deniz Akyildiz, Ieva Kazlauskaite, Mark Girolami","doi":"arxiv-2409.06560","DOIUrl":"https://doi.org/arxiv-2409.06560","url":null,"abstract":"Variational inference (VI) is a computationally efficient and scalable\u0000methodology for approximate Bayesian inference. It strikes a balance between\u0000accuracy of uncertainty quantification and practical tractability. It excels at\u0000generative modelling and inversion tasks due to its built-in Bayesian\u0000regularisation and flexibility, essential qualities for physics related\u0000problems. Deriving the central learning objective for VI must often be tailored\u0000to new learning tasks where the nature of the problems dictates the conditional\u0000dependence between variables of interest, such as arising in physics problems.\u0000In this paper, we provide an accessible and thorough technical introduction to\u0000VI for forward and inverse problems, guiding the reader through standard\u0000derivations of the VI framework and how it can best be realized through deep\u0000learning. We then review and unify recent literature exemplifying the creative\u0000flexibility allowed by VI. 
This paper is designed for a general scientific\u0000audience looking to solve physics-based problems with an emphasis on\u0000uncertainty quantification.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
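The identity that anchors most VI derivations, log p(x) = ELBO(q) + KL(q || posterior), can be verified numerically in a fully conjugate Gaussian model where every term is available in closed form. All numbers below are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

# Conjugate model: prior z ~ N(0, 1), likelihood x | z ~ N(z, sigma2),
# variational family q = N(m, s2). Here q is deliberately suboptimal.
sigma2 = 0.5
x = 1.3
m, s2 = 0.4, 0.8

# exact posterior and log evidence (both closed form in this model)
post_var = 1.0 / (1.0 + 1.0 / sigma2)
post_mean = post_var * x / sigma2
log_evidence = -0.5 * np.log(2 * np.pi * (1 + sigma2)) - x**2 / (2 * (1 + sigma2))

# ELBO(q) = E_q[log p(x|z)] + E_q[log p(z)] - E_q[log q(z)]
e_loglik = -0.5 * np.log(2 * np.pi * sigma2) - ((x - m) ** 2 + s2) / (2 * sigma2)
e_logprior = -0.5 * np.log(2 * np.pi) - (m**2 + s2) / 2
entropy_q = 0.5 * (np.log(2 * np.pi * s2) + 1)
elbo = e_loglik + e_logprior + entropy_q

# KL between the two Gaussians q and the exact posterior
kl = 0.5 * (np.log(post_var / s2) + (s2 + (m - post_mean) ** 2) / post_var - 1)

# the decomposition log p(x) = ELBO + KL holds exactly
assert abs(log_evidence - (elbo + kl)) < 1e-10
```

Because KL is non-negative, maximising the ELBO over (m, s2) is equivalent to driving q towards the exact posterior, which is the starting point for every derivation the primer walks through.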
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements in $\mathbb{R}^d$, where $d\ge 1$, and $M$ classes, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system, we interpret the memorization property as a problem of simultaneous or ensemble controllability. This problem is addressed by constructing the network parameters inductively and explicitly, bypassing the need for training or solving any optimization problem. Additionally, we establish that such a network can achieve universal approximation in $L^p(\Omega;\mathbb{R}_+)$, where $\Omega$ is a bounded subset of $\mathbb{R}^d$ and $p\in[1,\infty)$, using a ReLU deep neural network with a width of $d+1$. We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq 1$. Our proofs are constructive, offering explicit values for the biases and weights involved.
{"title":"Deep Neural Networks: Multi-Classification and Universal Approximation","authors":"Martín Hernández, Enrique Zuazua","doi":"arxiv-2409.06555","DOIUrl":"https://doi.org/arxiv-2409.06555","url":null,"abstract":"We demonstrate that a ReLU deep neural network with a width of $2$ and a\u0000depth of $2N+4M-1$ layers can achieve finite sample memorization for any\u0000dataset comprising $N$ elements in $mathbb{R}^d$, where $dge1,$ and $M$\u0000classes, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system,\u0000we interpret the memorization property as a problem of simultaneous or ensemble\u0000controllability. This problem is addressed by constructing the network\u0000parameters inductively and explicitly, bypassing the need for training or\u0000solving any optimization problem. Additionally, we establish that such a network can achieve universal\u0000approximation in $L^p(Omega;mathbb{R}_+)$, where $Omega$ is a bounded subset\u0000of $mathbb{R}^d$ and $pin[1,infty)$, using a ReLU deep neural network with a\u0000width of $d+1$. We also provide depth estimates for approximating $W^{1,p}$\u0000functions and width estimates for approximating $L^p(Omega;mathbb{R}^m)$ for\u0000$mgeq1$. Our proofs are constructive, offering explicit values for the biases\u0000and weights involved.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}