In this paper, we propose an approach to Bayesian optimization that uses Sequential Monte Carlo (SMC) and concepts from the statistical physics of classical systems. Our method leverages modern machine learning libraries such as NumPyro and JAX, allowing us to perform Bayesian optimization on multiple platforms, including CPUs, GPUs, and TPUs, and to run it in parallel. Our approach offers a low barrier to entry for exploring these methods while maintaining high performance. We present a promising direction for developing more efficient and effective techniques for a wide range of optimization problems in diverse fields.
{"title":"A Bayesian Optimization through Sequential Monte Carlo and Statistical Physics-Inspired Techniques","authors":"Anton Lebedev, Thomas Warford, M. Emre Şahin","doi":"arxiv-2409.03094","DOIUrl":"https://doi.org/arxiv-2409.03094","url":null,"abstract":"In this paper, we propose an approach for an application of Bayesian\u0000optimization using Sequential Monte Carlo (SMC) and concepts from the\u0000statistical physics of classical systems. Our method leverages the power of\u0000modern machine learning libraries such as NumPyro and JAX, allowing us to\u0000perform Bayesian optimization on multiple platforms, including CPUs, GPUs,\u0000TPUs, and in parallel. Our approach enables a low entry level for exploration\u0000of the methods while maintaining high performance. We present a promising\u0000direction for developing more efficient and effective techniques for a wide\u0000range of optimization problems in diverse fields.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The particle filter (PF), also known as sequential Monte Carlo (SMC), is designed to approximate high-dimensional probability distributions and their normalizing constants in the discrete-time setting. To reduce the variance of the Monte Carlo approximation, several twisted particle filters (TPFs) have been proposed, where one chooses or learns a twisting function that modifies the Markov transition kernel. In this paper, we study the TPF from a continuous-time perspective. Under suitable settings, we show that the discrete-time model converges to a continuous-time limit, which can be solved through a series of well-studied control-based importance sampling algorithms. This discrete-continuous connection allows the design of new TPF algorithms inspired by established continuous-time algorithms. As a concrete example, guided by existing importance sampling algorithms in the continuous-time setting, we propose a novel algorithm called the "Twisted-Path Particle Filter" (TPPF), where the twist function, parameterized by neural networks, minimizes a specific KL divergence between path measures. Numerical experiments are given to illustrate the capability of the proposed algorithm.
{"title":"Guidance for twisted particle filter: a continuous-time perspective","authors":"Jianfeng Lu, Yuliang Wang","doi":"arxiv-2409.02399","DOIUrl":"https://doi.org/arxiv-2409.02399","url":null,"abstract":"The particle filter (PF), also known as the sequential Monte Carlo (SMC), is\u0000designed to approximate high-dimensional probability distributions and their\u0000normalizing constants in the discrete-time setting. To reduce the variance of\u0000the Monte Carlo approximation, several twisted particle filters (TPF) have been\u0000proposed by researchers, where one chooses or learns a twisting function that\u0000modifies the Markov transition kernel. In this paper, we study the TPF from a\u0000continuous-time perspective. Under suitable settings, we show that the\u0000discrete-time model converges to a continuous-time limit, which can be solved\u0000through a series of well-studied control-based importance sampling algorithms.\u0000This discrete-continuous connection allows the design of new TPF algorithms\u0000inspired by established continuous-time algorithms. As a concrete example,\u0000guided by existing importance sampling algorithms in the continuous-time\u0000setting, we propose a novel algorithm called ``Twisted-Path Particle Filter\"\u0000(TPPF), where the twist function, parameterized by neural networks, minimizes\u0000specific KL-divergence between path measures. Some numerical experiments are\u0000given to illustrate the capability of the proposed algorithm.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hidden Markov Models (HMMs) describe a sequence of observations that depend on a hidden (or latent) state following a Markov chain. These models are widely used in diverse fields including ecology, speech recognition, and genetics. Parameter estimation in HMMs is typically performed using the Baum-Welch algorithm, a special case of the Expectation-Maximisation (EM) algorithm. While this method guarantees convergence to a local maximum, its convergence rate is usually slow. Alternative methods, such as direct maximisation of the likelihood using quasi-Newton methods (such as L-BFGS-B), can offer faster convergence but can be more complicated to implement because of the bounds on the parameter space. We propose a novel hybrid algorithm, QNEM, that combines the Baum-Welch and quasi-Newton algorithms. QNEM aims to leverage the strengths of both by switching from one method to the other based on the convexity of the likelihood function. We conducted a comparative analysis between QNEM, the Baum-Welch algorithm, an EM acceleration algorithm called SQUAREM (Varadhan, 2008, Scand J Statist), and the L-BFGS-B quasi-Newton method by applying these algorithms to four examples built on different models. We estimated the parameters of each model using the different algorithms and evaluated their performance. Our results show that the best-performing algorithm depends on the model considered. QNEM performs well overall, always being faster than or equivalent to L-BFGS-B. The Baum-Welch and SQUAREM algorithms are faster than the quasi-Newton and QNEM algorithms in certain scenarios with multiple optima. In conclusion, QNEM offers a promising alternative to existing algorithms.
{"title":"Parameter estimation of hidden Markov models: comparison of EM and quasi-Newton methods with a new hybrid algorithm","authors":"Sidonie FoulonCESP, NeuroDiderot, Thérèse TruongCESP, Anne-Louise LeuteneggerNeuroDiderot, Hervé PerdryCESP","doi":"arxiv-2409.02477","DOIUrl":"https://doi.org/arxiv-2409.02477","url":null,"abstract":"Hidden Markov Models (HMM) model a sequence of observations that are\u0000dependent on a hidden (or latent) state that follow a Markov chain. These\u0000models are widely used in diverse fields including ecology, speech recognition,\u0000and genetics.Parameter estimation in HMM is typically performed using the\u0000Baum-Welch algorithm, a special case of the Expectation-Maximisation (EM)\u0000algorithm. While this method guarantee the convergence to a local maximum, its\u0000convergence rates is usually slow.Alternative methods, such as the direct\u0000maximisation of the likelihood using quasi-Newton methods (such as L-BFGS-B)\u0000can offer faster convergence but can be more complicated to implement due to\u0000challenges to deal with the presence of bounds on the space of parameters.We\u0000propose a novel hybrid algorithm, QNEM, that combines the Baum-Welch and the\u0000quasi-Newton algorithms. QNEM aims to leverage the strength of both algorithms\u0000by switching from one method to the other based on the convexity of the\u0000likelihood function.We conducted a comparative analysis between QNEM, the\u0000Baum-Welch algorithm, an EM acceleration algorithm called SQUAREM (Varadhan,\u00002008, Scand J Statist), and the L-BFGS-B quasi-Newton method by applying these\u0000algorithms to four examples built on different models. We estimated the\u0000parameters of each model using the different algorithms and evaluated their\u0000performances.Our results show that the best-performing algorithm depends on the\u0000model considered. QNEM performs well overall, always being faster or equivalent\u0000to L-BFGS-B. The Baum-Welch and SQUAREM algorithms are faster than the\u0000quasi-Newton and QNEM algorithms in certain scenarios with multiple optimum. In\u0000conclusion, QNEM offers a promising alternative to existing algorithms.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dataset distillation (DD) is an increasingly important technique that focuses on constructing a synthetic dataset capable of capturing the core information in the training data, so that models trained on the synthetic data achieve comparable performance. While DD has a wide range of applications, the theory supporting it is less well developed. New DD methods are compared on a common set of benchmarks rather than oriented towards any particular learning task. In this work, we present a formal model of DD, arguing that a precise characterization of the underlying optimization problem must specify the inference task associated with the application of interest. Without this task-specific focus, the DD problem is under-specified, and the selection of a DD algorithm for a particular task is merely heuristic. Our formalization reveals novel applications of DD across different modeling environments. We analyze existing DD methods through this broader lens, highlighting their strengths and limitations in terms of accuracy and faithfulness to optimal DD operation. Finally, we present numerical results for two case studies important in contemporary settings. First, we address a critical challenge in medical data analysis: merging the knowledge from different datasets composed of intersecting, but not identical, sets of features in order to construct a larger dataset in what is usually a small-sample setting. Second, we consider out-of-distribution error across boundary conditions for physics-informed neural networks (PINNs), showing the potential for DD to provide more physically faithful data. By providing this general formulation of DD, we aim to establish a new research paradigm through which DD can be understood and from which new DD techniques can arise.
{"title":"Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning","authors":"Vyacheslav Kungurtsev, Yuanfang Peng, Jianyang Gu, Saeed Vahidian, Anthony Quinn, Fadwa Idlahcen, Yiran Chen","doi":"arxiv-2409.01410","DOIUrl":"https://doi.org/arxiv-2409.01410","url":null,"abstract":"Dataset distillation (DD) is an increasingly important technique that focuses\u0000on constructing a synthetic dataset capable of capturing the core information\u0000in training data to achieve comparable performance in models trained on the\u0000latter. While DD has a wide range of applications, the theory supporting it is\u0000less well evolved. New methods of DD are compared on a common set of\u0000benchmarks, rather than oriented towards any particular learning task. In this\u0000work, we present a formal model of DD, arguing that a precise characterization\u0000of the underlying optimization problem must specify the inference task\u0000associated with the application of interest. Without this task-specific focus,\u0000the DD problem is under-specified, and the selection of a DD algorithm for a\u0000particular task is merely heuristic. Our formalization reveals novel\u0000applications of DD across different modeling environments. We analyze existing\u0000DD methods through this broader lens, highlighting their strengths and\u0000limitations in terms of accuracy and faithfulness to optimal DD operation.\u0000Finally, we present numerical results for two case studies important in\u0000contemporary settings. Firstly, we address a critical challenge in medical data\u0000analysis: merging the knowledge from different datasets composed of\u0000intersecting, but not identical, sets of features, in order to construct a\u0000larger dataset in what is usually a small sample setting. Secondly, we consider\u0000out-of-distribution error across boundary conditions for physics-informed\u0000neural networks (PINNs), showing the potential for DD to provide more\u0000physically faithful data. By establishing this general formulation of DD, we\u0000aim to establish a new research paradigm by which DD can be understood and from\u0000which new DD techniques can arise.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Typical simulation approaches for evaluating the performance of statistical methods on populations embedded in social networks may fail to capture important features of real-world networks. It can therefore be unclear whether inference methods for causal effects under interference that have been shown to perform well in such synthetic networks are applicable to social networks arising in the real world. Plasmode simulation studies use a real dataset created from natural processes, but with part of the data-generation mechanism known. However, given the sensitivity of relational data, many network datasets are protected from unauthorized access or disclosure. In such cases, plasmode simulations cannot use released versions of real datasets, which often omit the network links, and can only rely on parameters estimated from them. A statistical framework for creating replicated simulation datasets from private social network data is developed and validated. The approach consists of simulating from a parametric exponential family random graph model fitted to the network data and resampling from the observed exposure and covariate distributions to preserve the associations among these variables.
{"title":"Plasmode simulation for the evaluation of causal inference methods in homophilous social networks","authors":"Vanessa McNealis, Erica E. M. Moodie, Nema Dean","doi":"arxiv-2409.01316","DOIUrl":"https://doi.org/arxiv-2409.01316","url":null,"abstract":"Typical simulation approaches for evaluating the performance of statistical\u0000methods on populations embedded in social networks may fail to capture\u0000important features of real-world networks. It can therefore be unclear whether\u0000inference methods for causal effects due to interference that have been shown\u0000to perform well in such synthetic networks are applicable to social networks\u0000which arise in the real world. Plasmode simulation studies use a real dataset\u0000created from natural processes, but with part of the data-generation mechanism\u0000known. However, given the sensitivity of relational data, many network data are\u0000protected from unauthorized access or disclosure. In such case, plasmode\u0000simulations cannot use released versions of real datasets which often omit the\u0000network links, and instead can only rely on parameters estimated from them. A\u0000statistical framework for creating replicated simulation datasets from private\u0000social network data is developed and validated. The approach consists of\u0000simulating from a parametric exponential family random graph model fitted to\u0000the network data and resampling from the observed exposure and covariate\u0000distributions to preserve the associations among these variables.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article introduces `cpp11eigen`, a new R package that integrates the powerful Eigen C++ library for linear algebra into the R programming environment. The article provides a detailed comparison of the speed and syntax of Armadillo and Eigen. The `cpp11eigen` package simplifies part of the process of using C++ within R by offering additional ease of integration for those who require high-performance linear algebra operations in their R workflows. This work aims to discuss the trade-off between computational efficiency and accessibility.
{"title":"Armadillo and Eigen: A Tale of Two Linear Algebra Libraries","authors":"Mauricio Vargas Sepulveda","doi":"arxiv-2409.00568","DOIUrl":"https://doi.org/arxiv-2409.00568","url":null,"abstract":"This article introduces `cpp11eigen`, a new R package that integrates the\u0000powerful Eigen C++ library for linear algebra into the R programming\u0000environment. This article provides a detailed comparison between Armadillo and\u0000Eigen speed and syntax. The `cpp11eigen` package simplifies a part of the\u0000process of using C++ within R by offering additional ease of integration for\u0000those who require high-performance linear algebra operations in their R\u0000workflows. This work aims to discuss the tradeoff between computational\u0000efficiency and accessibility.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimation of the response probability distributions of computer simulators in the presence of randomness is a crucial task in many fields. However, achieving this task with guaranteed accuracy remains an open computational challenge, especially for expensive-to-evaluate computer simulators. In this work, a Bayesian active learning perspective is presented to address the challenge, based on Gaussian process (GP) regression. First, estimation of the response probability distributions is conceptually interpreted as a Bayesian inference problem, as opposed to frequentist inference. This interpretation provides several important benefits: (1) it quantifies and propagates discretization error probabilistically; (2) it incorporates prior knowledge of the computer simulator; and (3) it enables the effective reduction of numerical uncertainty in the solution to a prescribed level. The conceptual Bayesian idea is then realized using GP regression, where we derive the posterior statistics of the response probability distributions in semi-analytical form and also provide a numerical solution scheme. Based on this practical Bayesian approach, a Bayesian active learning (BAL) method is further proposed for estimating the response probability distributions. In this context, the key contribution lies in the development of two crucial components for active learning, i.e., the stopping criterion and the learning function, by taking advantage of the posterior statistics. It is empirically demonstrated through five numerical examples that the proposed BAL method can efficiently estimate the response probability distributions with the desired accuracy.
{"title":"Response probability distribution estimation of expensive computer simulators: A Bayesian active learning perspective using Gaussian process regression","authors":"Chao Dang, Marcos A. Valdebenito, Nataly A. Manque, Jun Xu, Matthias G. R. Faes","doi":"arxiv-2409.00407","DOIUrl":"https://doi.org/arxiv-2409.00407","url":null,"abstract":"Estimation of the response probability distributions of computer simulators\u0000in the presence of randomness is a crucial task in many fields. However,\u0000achieving this task with guaranteed accuracy remains an open computational\u0000challenge, especially for expensive-to-evaluate computer simulators. In this\u0000work, a Bayesian active learning perspective is presented to address the\u0000challenge, which is based on the use of the Gaussian process (GP) regression.\u0000First, estimation of the response probability distributions is conceptually\u0000interpreted as a Bayesian inference problem, as opposed to frequentist\u0000inference. This interpretation provides several important benefits: (1) it\u0000quantifies and propagates discretization error probabilistically; (2) it\u0000incorporates prior knowledge of the computer simulator, and (3) it enables the\u0000effective reduction of numerical uncertainty in the solution to a prescribed\u0000level. The conceptual Bayesian idea is then realized by using the GP\u0000regression, where we derive the posterior statistics of the response\u0000probability distributions in semi-analytical form and also provide a numerical\u0000solution scheme. Based on the practical Bayesian approach, a Bayesian active\u0000learning (BAL) method is further proposed for estimating the response\u0000probability distributions. In this context, the key contribution lies in the\u0000development of two crucial components for active learning, i.e., stopping\u0000criterion and learning function, by taking advantage of posterior statistics.\u0000It is empirically demonstrated by five numerical examples that the proposed BAL\u0000method can efficiently estimate the response probability distributions with\u0000desired accuracy.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"68 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phase retrieval refers to the problem of recovering a high-dimensional vector $\boldsymbol{x} \in \mathbb{C}^N$ from the magnitude of its linear transform $\boldsymbol{z} = A\boldsymbol{x}$, observed through a noisy channel. To mitigate the ill-posed nature of the inverse problem, it is common practice to observe the magnitudes of linear measurements $\boldsymbol{z}^{(1)} = A^{(1)}\boldsymbol{x}, \dots, \boldsymbol{z}^{(L)} = A^{(L)}\boldsymbol{x}$ obtained with multiple sensing matrices $A^{(1)}, \dots, A^{(L)}$, with ptychographic imaging being a notable example of such strategies. Inspired by existing algorithms for ptychographic reconstruction, we introduce stochasticity into Vector Approximate Message Passing (VAMP), a computationally efficient algorithm applicable to a wide range of Bayesian inverse problems. By testing our approach in the phase retrieval setting, we show the superior convergence speed of the proposed algorithm.
{"title":"Stochastic Vector Approximate Message Passing with applications to phase retrieval","authors":"Hajime Ueda, Shun Katakami, Masato Okada","doi":"arxiv-2408.17102","DOIUrl":"https://doi.org/arxiv-2408.17102","url":null,"abstract":"Phase retrieval refers to the problem of recovering a high-dimensional vector\u0000$boldsymbol{x} in mathbb{C}^N$ from the magnitude of its linear transform\u0000$boldsymbol{z} = A boldsymbol{x}$, observed through a noisy channel. To\u0000improve the ill-posed nature of the inverse problem, it is a common practice to\u0000observe the magnitude of linear measurements $boldsymbol{z}^{(1)} = A^{(1)}\u0000boldsymbol{x},..., boldsymbol{z}^{(L)} = A^{(L)}boldsymbol{x}$ using\u0000multiple sensing matrices $A^{(1)},..., A^{(L)}$, with ptychographic imaging\u0000being a remarkable example of such strategies. Inspired by existing algorithms\u0000for ptychographic reconstruction, we introduce stochasticity to Vector\u0000Approximate Message Passing (VAMP), a computationally efficient algorithm\u0000applicable to a wide range of Bayesian inverse problems. By testing our\u0000approach in the setup of phase retrieval, we show the superior convergence\u0000speed of the proposed algorithm.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We focus on Bayesian inverse problems with Gaussian likelihood, linear forward model, and priors that can be formulated as a Gaussian mixture. Such a mixture is expressed as an integral of Gaussian density functions weighted by a mixing density over the mixing variables. Within this framework, the corresponding posterior distribution also takes the form of a Gaussian mixture, and we derive the closed-form expression for its posterior mixing density. To sample from the posterior Gaussian mixture, we propose a two-step sampling method. First, we sample the mixture variables from the posterior mixing density, and then we sample the variables of interest from Gaussian densities conditioned on the sampled mixing variables. However, the posterior mixing density is relatively difficult to sample from, especially in high dimensions. Therefore, we propose to replace the posterior mixing density by a dimension-reduced approximation, and we provide a bound in the Hellinger distance for the resulting approximate posterior. We apply the proposed approach to a posterior with Laplace prior, where we introduce two dimension-reduced approximations for the posterior mixing density. Our numerical experiments indicate that samples generated via the proposed approximations have very low correlation and are close to the exact posterior.
{"title":"Continuous Gaussian mixture solution for linear Bayesian inversion with application to Laplace priors","authors":"Rafael Flock, Yiqiu Dong, Felipe Uribe, Olivier Zahm","doi":"arxiv-2408.16594","DOIUrl":"https://doi.org/arxiv-2408.16594","url":null,"abstract":"We focus on Bayesian inverse problems with Gaussian likelihood, linear\u0000forward model, and priors that can be formulated as a Gaussian mixture. Such a\u0000mixture is expressed as an integral of Gaussian density functions weighted by a\u0000mixing density over the mixing variables. Within this framework, the\u0000corresponding posterior distribution also takes the form of a Gaussian mixture,\u0000and we derive the closed-form expression for its posterior mixing density. To\u0000sample from the posterior Gaussian mixture, we propose a two-step sampling\u0000method. First, we sample the mixture variables from the posterior mixing\u0000density, and then we sample the variables of interest from Gaussian densities\u0000conditioned on the sampled mixing variables. However, the posterior mixing\u0000density is relatively difficult to sample from, especially in high dimensions.\u0000Therefore, we propose to replace the posterior mixing density by a\u0000dimension-reduced approximation, and we provide a bound in the Hellinger\u0000distance for the resulting approximate posterior. We apply the proposed\u0000approach to a posterior with Laplace prior, where we introduce two\u0000dimension-reduced approximations for the posterior mixing density. Our\u0000numerical experiments indicate that samples generated via the proposed\u0000approximations have very low correlation and are close to the exact posterior.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sequential Monte Carlo methods are a powerful framework for approximating the posterior distribution of a state variable in a sequential manner. They provide an attractive way of analyzing dynamic systems in real time, taking into account the limitations of traditional approaches such as Markov Chain Monte Carlo methods, which are not well suited to data that arrive incrementally. This paper reviews and explores the application of Sequential Monte Carlo in dynamic disease modeling, highlighting its capacity for online inference and real-time adaptation to evolving disease dynamics. The integration of kernel density approximation techniques within the stochastic Susceptible-Exposed-Infectious-Recovered (SEIR) compartment model is examined, demonstrating the algorithm's effectiveness in monitoring time-varying parameters such as the effective reproduction number. Case studies, including simulations with synthetic data and analysis of real-world COVID-19 data from Ireland, demonstrate the practical applicability of this approach for informing timely public health interventions.
{"title":"A review of sequential Monte Carlo methods for real-time disease modeling","authors":"Dhorasso Temfack, Jason Wyse","doi":"arxiv-2408.15739","DOIUrl":"https://doi.org/arxiv-2408.15739","url":null,"abstract":"Sequential Monte Carlo methods are a powerful framework for approximating the\u0000posterior distribution of a state variable in a sequential manner. They provide\u0000an attractive way of analyzing dynamic systems in real-time, taking into\u0000account the limitations of traditional approaches such as Markov Chain Monte\u0000Carlo methods, which are not well suited to data that arrives incrementally.\u0000This paper reviews and explores the application of Sequential Monte Carlo in\u0000dynamic disease modeling, highlighting its capacity for online inference and\u0000real-time adaptation to evolving disease dynamics. The integration of kernel\u0000density approximation techniques within the stochastic\u0000Susceptible-Exposed-Infectious-Recovered (SEIR) compartment model is examined,\u0000demonstrating the algorithm's effectiveness in monitoring time-varying\u0000parameters such as the effective reproduction number. Case studies, including\u0000simulations with synthetic data and analysis of real-world COVID-19 data from\u0000Ireland, demonstrate the practical applicability of this approach for informing\u0000timely public health interventions.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}