In this paper, we propose an approach to Bayesian optimization that uses Sequential Monte Carlo (SMC) and concepts from the statistical physics of classical systems. Our method leverages modern machine learning libraries such as NumPyro and JAX, allowing us to perform Bayesian optimization on multiple platforms, including CPUs, GPUs, and TPUs, and to run it in parallel. Our approach offers a low barrier to entry for exploring these methods while maintaining high performance. We present a promising direction for developing more efficient and effective techniques for a wide range of optimization problems in diverse fields.
{"title":"A Bayesian Optimization through Sequential Monte Carlo and Statistical Physics-Inspired Techniques","authors":"Anton Lebedev, Thomas Warford, M. Emre Şahin","doi":"arxiv-2409.03094","DOIUrl":"https://doi.org/arxiv-2409.03094","url":null,"abstract":"In this paper, we propose an approach for an application of Bayesian\u0000optimization using Sequential Monte Carlo (SMC) and concepts from the\u0000statistical physics of classical systems. Our method leverages the power of\u0000modern machine learning libraries such as NumPyro and JAX, allowing us to\u0000perform Bayesian optimization on multiple platforms, including CPUs, GPUs,\u0000TPUs, and in parallel. Our approach enables a low entry level for exploration\u0000of the methods while maintaining high performance. We present a promising\u0000direction for developing more efficient and effective techniques for a wide\u0000range of optimization problems in diverse fields.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The particle filter (PF), also known as sequential Monte Carlo (SMC), is designed to approximate high-dimensional probability distributions and their normalizing constants in the discrete-time setting. To reduce the variance of the Monte Carlo approximation, several twisted particle filters (TPFs) have been proposed, where one chooses or learns a twisting function that modifies the Markov transition kernel. In this paper, we study the TPF from a continuous-time perspective. Under suitable settings, we show that the discrete-time model converges to a continuous-time limit, which can be solved through a series of well-studied control-based importance sampling algorithms. This discrete-continuous connection allows the design of new TPF algorithms inspired by established continuous-time algorithms. As a concrete example, guided by existing importance sampling algorithms in the continuous-time setting, we propose a novel algorithm called the "Twisted-Path Particle Filter" (TPPF), where the twist function, parameterized by neural networks, minimizes a specific KL divergence between path measures. Numerical experiments are given to illustrate the capability of the proposed algorithm.
{"title":"Guidance for twisted particle filter: a continuous-time perspective","authors":"Jianfeng Lu, Yuliang Wang","doi":"arxiv-2409.02399","DOIUrl":"https://doi.org/arxiv-2409.02399","url":null,"abstract":"The particle filter (PF), also known as the sequential Monte Carlo (SMC), is\u0000designed to approximate high-dimensional probability distributions and their\u0000normalizing constants in the discrete-time setting. To reduce the variance of\u0000the Monte Carlo approximation, several twisted particle filters (TPF) have been\u0000proposed by researchers, where one chooses or learns a twisting function that\u0000modifies the Markov transition kernel. In this paper, we study the TPF from a\u0000continuous-time perspective. Under suitable settings, we show that the\u0000discrete-time model converges to a continuous-time limit, which can be solved\u0000through a series of well-studied control-based importance sampling algorithms.\u0000This discrete-continuous connection allows the design of new TPF algorithms\u0000inspired by established continuous-time algorithms. As a concrete example,\u0000guided by existing importance sampling algorithms in the continuous-time\u0000setting, we propose a novel algorithm called ``Twisted-Path Particle Filter\"\u0000(TPPF), where the twist function, parameterized by neural networks, minimizes\u0000specific KL-divergence between path measures. Some numerical experiments are\u0000given to illustrate the capability of the proposed algorithm.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hidden Markov Models (HMMs) describe a sequence of observations that depend on a hidden (or latent) state following a Markov chain. These models are widely used in diverse fields including ecology, speech recognition, and genetics. Parameter estimation in HMMs is typically performed using the Baum-Welch algorithm, a special case of the Expectation-Maximisation (EM) algorithm. While this method guarantees convergence to a local maximum, its convergence rate is usually slow. Alternative methods, such as direct maximisation of the likelihood using quasi-Newton methods (such as L-BFGS-B), can offer faster convergence but can be more complicated to implement because of the bounds on the parameter space. We propose a novel hybrid algorithm, QNEM, that combines the Baum-Welch and quasi-Newton algorithms. QNEM aims to leverage the strengths of both by switching from one method to the other based on the convexity of the likelihood function. We conducted a comparative analysis between QNEM, the Baum-Welch algorithm, an EM acceleration algorithm called SQUAREM (Varadhan, 2008, Scand J Statist), and the L-BFGS-B quasi-Newton method by applying these algorithms to four examples built on different models. We estimated the parameters of each model using the different algorithms and evaluated their performance. Our results show that the best-performing algorithm depends on the model considered. QNEM performs well overall, always being faster than or equivalent to L-BFGS-B. The Baum-Welch and SQUAREM algorithms are faster than the quasi-Newton and QNEM algorithms in certain scenarios with multiple optima. In conclusion, QNEM offers a promising alternative to existing algorithms.
{"title":"Parameter estimation of hidden Markov models: comparison of EM and quasi-Newton methods with a new hybrid algorithm","authors":"Sidonie FoulonCESP, NeuroDiderot, Thérèse TruongCESP, Anne-Louise LeuteneggerNeuroDiderot, Hervé PerdryCESP","doi":"arxiv-2409.02477","DOIUrl":"https://doi.org/arxiv-2409.02477","url":null,"abstract":"Hidden Markov Models (HMM) model a sequence of observations that are\u0000dependent on a hidden (or latent) state that follow a Markov chain. These\u0000models are widely used in diverse fields including ecology, speech recognition,\u0000and genetics.Parameter estimation in HMM is typically performed using the\u0000Baum-Welch algorithm, a special case of the Expectation-Maximisation (EM)\u0000algorithm. While this method guarantee the convergence to a local maximum, its\u0000convergence rates is usually slow.Alternative methods, such as the direct\u0000maximisation of the likelihood using quasi-Newton methods (such as L-BFGS-B)\u0000can offer faster convergence but can be more complicated to implement due to\u0000challenges to deal with the presence of bounds on the space of parameters.We\u0000propose a novel hybrid algorithm, QNEM, that combines the Baum-Welch and the\u0000quasi-Newton algorithms. QNEM aims to leverage the strength of both algorithms\u0000by switching from one method to the other based on the convexity of the\u0000likelihood function.We conducted a comparative analysis between QNEM, the\u0000Baum-Welch algorithm, an EM acceleration algorithm called SQUAREM (Varadhan,\u00002008, Scand J Statist), and the L-BFGS-B quasi-Newton method by applying these\u0000algorithms to four examples built on different models. We estimated the\u0000parameters of each model using the different algorithms and evaluated their\u0000performances.Our results show that the best-performing algorithm depends on the\u0000model considered. QNEM performs well overall, always being faster or equivalent\u0000to L-BFGS-B. The Baum-Welch and SQUAREM algorithms are faster than the\u0000quasi-Newton and QNEM algorithms in certain scenarios with multiple optimum. In\u0000conclusion, QNEM offers a promising alternative to existing algorithms.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dataset distillation (DD) is an increasingly important technique that focuses on constructing a synthetic dataset capable of capturing the core information in the training data, so that models trained on the synthetic data achieve comparable performance. While DD has a wide range of applications, the theory supporting it is less well developed. New DD methods are compared on a common set of benchmarks rather than oriented towards any particular learning task. In this work, we present a formal model of DD, arguing that a precise characterization of the underlying optimization problem must specify the inference task associated with the application of interest. Without this task-specific focus, the DD problem is under-specified, and the selection of a DD algorithm for a particular task is merely heuristic. Our formalization reveals novel applications of DD across different modeling environments. We analyze existing DD methods through this broader lens, highlighting their strengths and limitations in terms of accuracy and faithfulness to optimal DD operation. Finally, we present numerical results for two case studies important in contemporary settings. First, we address a critical challenge in medical data analysis: merging the knowledge from different datasets composed of intersecting, but not identical, sets of features in order to construct a larger dataset in what is usually a small-sample setting. Second, we consider out-of-distribution error across boundary conditions for physics-informed neural networks (PINNs), showing the potential for DD to provide more physically faithful data. By providing this general formulation of DD, we aim to establish a new research paradigm through which DD can be understood and from which new DD techniques can arise.
{"title":"Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning","authors":"Vyacheslav Kungurtsev, Yuanfang Peng, Jianyang Gu, Saeed Vahidian, Anthony Quinn, Fadwa Idlahcen, Yiran Chen","doi":"arxiv-2409.01410","DOIUrl":"https://doi.org/arxiv-2409.01410","url":null,"abstract":"Dataset distillation (DD) is an increasingly important technique that focuses\u0000on constructing a synthetic dataset capable of capturing the core information\u0000in training data to achieve comparable performance in models trained on the\u0000latter. While DD has a wide range of applications, the theory supporting it is\u0000less well evolved. New methods of DD are compared on a common set of\u0000benchmarks, rather than oriented towards any particular learning task. In this\u0000work, we present a formal model of DD, arguing that a precise characterization\u0000of the underlying optimization problem must specify the inference task\u0000associated with the application of interest. Without this task-specific focus,\u0000the DD problem is under-specified, and the selection of a DD algorithm for a\u0000particular task is merely heuristic. Our formalization reveals novel\u0000applications of DD across different modeling environments. We analyze existing\u0000DD methods through this broader lens, highlighting their strengths and\u0000limitations in terms of accuracy and faithfulness to optimal DD operation.\u0000Finally, we present numerical results for two case studies important in\u0000contemporary settings. Firstly, we address a critical challenge in medical data\u0000analysis: merging the knowledge from different datasets composed of\u0000intersecting, but not identical, sets of features, in order to construct a\u0000larger dataset in what is usually a small sample setting. Secondly, we consider\u0000out-of-distribution error across boundary conditions for physics-informed\u0000neural networks (PINNs), showing the potential for DD to provide more\u0000physically faithful data. By establishing this general formulation of DD, we\u0000aim to establish a new research paradigm by which DD can be understood and from\u0000which new DD techniques can arise.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Typical simulation approaches for evaluating the performance of statistical methods on populations embedded in social networks may fail to capture important features of real-world networks. It can therefore be unclear whether inference methods for causal effects under interference that have been shown to perform well in such synthetic networks are applicable to social networks arising in the real world. Plasmode simulation studies use a real dataset created from natural processes, but with part of the data-generation mechanism known. However, given the sensitivity of relational data, many network datasets are protected from unauthorized access or disclosure. In such cases, plasmode simulations cannot use released versions of real datasets, which often omit the network links, and can only rely on parameters estimated from them. A statistical framework for creating replicated simulation datasets from private social network data is developed and validated. The approach consists of simulating from a parametric exponential family random graph model fitted to the network data and resampling from the observed exposure and covariate distributions to preserve the associations among these variables.
{"title":"Plasmode simulation for the evaluation of causal inference methods in homophilous social networks","authors":"Vanessa McNealis, Erica E. M. Moodie, Nema Dean","doi":"arxiv-2409.01316","DOIUrl":"https://doi.org/arxiv-2409.01316","url":null,"abstract":"Typical simulation approaches for evaluating the performance of statistical\u0000methods on populations embedded in social networks may fail to capture\u0000important features of real-world networks. It can therefore be unclear whether\u0000inference methods for causal effects due to interference that have been shown\u0000to perform well in such synthetic networks are applicable to social networks\u0000which arise in the real world. Plasmode simulation studies use a real dataset\u0000created from natural processes, but with part of the data-generation mechanism\u0000known. However, given the sensitivity of relational data, many network data are\u0000protected from unauthorized access or disclosure. In such case, plasmode\u0000simulations cannot use released versions of real datasets which often omit the\u0000network links, and instead can only rely on parameters estimated from them. A\u0000statistical framework for creating replicated simulation datasets from private\u0000social network data is developed and validated. The approach consists of\u0000simulating from a parametric exponential family random graph model fitted to\u0000the network data and resampling from the observed exposure and covariate\u0000distributions to preserve the associations among these variables.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This article introduces `cpp11eigen`, a new R package that integrates the powerful Eigen C++ library for linear algebra into the R programming environment. The article provides a detailed comparison of the speed and syntax of Armadillo and Eigen. The `cpp11eigen` package simplifies part of the process of using C++ within R by offering additional ease of integration for those who require high-performance linear algebra operations in their R workflows. This work aims to discuss the trade-off between computational efficiency and accessibility.
{"title":"Armadillo and Eigen: A Tale of Two Linear Algebra Libraries","authors":"Mauricio Vargas Sepulveda","doi":"arxiv-2409.00568","DOIUrl":"https://doi.org/arxiv-2409.00568","url":null,"abstract":"This article introduces `cpp11eigen`, a new R package that integrates the\u0000powerful Eigen C++ library for linear algebra into the R programming\u0000environment. This article provides a detailed comparison between Armadillo and\u0000Eigen speed and syntax. The `cpp11eigen` package simplifies a part of the\u0000process of using C++ within R by offering additional ease of integration for\u0000those who require high-performance linear algebra operations in their R\u0000workflows. This work aims to discuss the tradeoff between computational\u0000efficiency and accessibility.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimation of the response probability distributions of computer simulators in the presence of randomness is a crucial task in many fields. However, achieving this task with guaranteed accuracy remains an open computational challenge, especially for expensive-to-evaluate computer simulators. In this work, a Bayesian active learning perspective is presented to address the challenge, based on Gaussian process (GP) regression. First, estimation of the response probability distributions is conceptually interpreted as a Bayesian inference problem, as opposed to frequentist inference. This interpretation provides several important benefits: (1) it quantifies and propagates discretization error probabilistically; (2) it incorporates prior knowledge of the computer simulator; and (3) it enables the effective reduction of numerical uncertainty in the solution to a prescribed level. The conceptual Bayesian idea is then realized using GP regression, where we derive the posterior statistics of the response probability distributions in semi-analytical form and also provide a numerical solution scheme. Based on this practical Bayesian approach, a Bayesian active learning (BAL) method is further proposed for estimating the response probability distributions. In this context, the key contribution lies in the development of two crucial components for active learning, i.e., the stopping criterion and the learning function, by taking advantage of the posterior statistics. It is empirically demonstrated through five numerical examples that the proposed BAL method can efficiently estimate the response probability distributions with the desired accuracy.
{"title":"Response probability distribution estimation of expensive computer simulators: A Bayesian active learning perspective using Gaussian process regression","authors":"Chao Dang, Marcos A. Valdebenito, Nataly A. Manque, Jun Xu, Matthias G. R. Faes","doi":"arxiv-2409.00407","DOIUrl":"https://doi.org/arxiv-2409.00407","url":null,"abstract":"Estimation of the response probability distributions of computer simulators\u0000in the presence of randomness is a crucial task in many fields. However,\u0000achieving this task with guaranteed accuracy remains an open computational\u0000challenge, especially for expensive-to-evaluate computer simulators. In this\u0000work, a Bayesian active learning perspective is presented to address the\u0000challenge, which is based on the use of the Gaussian process (GP) regression.\u0000First, estimation of the response probability distributions is conceptually\u0000interpreted as a Bayesian inference problem, as opposed to frequentist\u0000inference. This interpretation provides several important benefits: (1) it\u0000quantifies and propagates discretization error probabilistically; (2) it\u0000incorporates prior knowledge of the computer simulator, and (3) it enables the\u0000effective reduction of numerical uncertainty in the solution to a prescribed\u0000level. The conceptual Bayesian idea is then realized by using the GP\u0000regression, where we derive the posterior statistics of the response\u0000probability distributions in semi-analytical form and also provide a numerical\u0000solution scheme. Based on the practical Bayesian approach, a Bayesian active\u0000learning (BAL) method is further proposed for estimating the response\u0000probability distributions. In this context, the key contribution lies in the\u0000development of two crucial components for active learning, i.e., stopping\u0000criterion and learning function, by taking advantage of posterior statistics.\u0000It is empirically demonstrated by five numerical examples that the proposed BAL\u0000method can efficiently estimate the response probability distributions with\u0000desired accuracy.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"68 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phase retrieval refers to the problem of recovering a high-dimensional vector $\boldsymbol{x} \in \mathbb{C}^N$ from the magnitude of its linear transform $\boldsymbol{z} = A\boldsymbol{x}$, observed through a noisy channel. To mitigate the ill-posed nature of the inverse problem, it is common practice to observe the magnitudes of linear measurements $\boldsymbol{z}^{(1)} = A^{(1)}\boldsymbol{x}, \dots, \boldsymbol{z}^{(L)} = A^{(L)}\boldsymbol{x}$ obtained with multiple sensing matrices $A^{(1)}, \dots, A^{(L)}$, with ptychographic imaging being a notable example of such strategies. Inspired by existing algorithms for ptychographic reconstruction, we introduce stochasticity into Vector Approximate Message Passing (VAMP), a computationally efficient algorithm applicable to a wide range of Bayesian inverse problems. By testing our approach in the phase retrieval setting, we show the superior convergence speed of the proposed algorithm.
{"title":"Stochastic Vector Approximate Message Passing with applications to phase retrieval","authors":"Hajime Ueda, Shun Katakami, Masato Okada","doi":"arxiv-2408.17102","DOIUrl":"https://doi.org/arxiv-2408.17102","url":null,"abstract":"Phase retrieval refers to the problem of recovering a high-dimensional vector\u0000$boldsymbol{x} in mathbb{C}^N$ from the magnitude of its linear transform\u0000$boldsymbol{z} = A boldsymbol{x}$, observed through a noisy channel. To\u0000improve the ill-posed nature of the inverse problem, it is a common practice to\u0000observe the magnitude of linear measurements $boldsymbol{z}^{(1)} = A^{(1)}\u0000boldsymbol{x},..., boldsymbol{z}^{(L)} = A^{(L)}boldsymbol{x}$ using\u0000multiple sensing matrices $A^{(1)},..., A^{(L)}$, with ptychographic imaging\u0000being a remarkable example of such strategies. Inspired by existing algorithms\u0000for ptychographic reconstruction, we introduce stochasticity to Vector\u0000Approximate Message Passing (VAMP), a computationally efficient algorithm\u0000applicable to a wide range of Bayesian inverse problems. By testing our\u0000approach in the setup of phase retrieval, we show the superior convergence\u0000speed of the proposed algorithm.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We focus on Bayesian inverse problems with Gaussian likelihood, linear forward model, and priors that can be formulated as a Gaussian mixture. Such a mixture is expressed as an integral of Gaussian density functions weighted by a mixing density over the mixing variables. Within this framework, the corresponding posterior distribution also takes the form of a Gaussian mixture, and we derive the closed-form expression for its posterior mixing density. To sample from the posterior Gaussian mixture, we propose a two-step sampling method. First, we sample the mixture variables from the posterior mixing density, and then we sample the variables of interest from Gaussian densities conditioned on the sampled mixing variables. However, the posterior mixing density is relatively difficult to sample from, especially in high dimensions. Therefore, we propose to replace the posterior mixing density by a dimension-reduced approximation, and we provide a bound in the Hellinger distance for the resulting approximate posterior. We apply the proposed approach to a posterior with Laplace prior, where we introduce two dimension-reduced approximations for the posterior mixing density. Our numerical experiments indicate that samples generated via the proposed approximations have very low correlation and are close to the exact posterior.
{"title":"Continuous Gaussian mixture solution for linear Bayesian inversion with application to Laplace priors","authors":"Rafael Flock, Yiqiu Dong, Felipe Uribe, Olivier Zahm","doi":"arxiv-2408.16594","DOIUrl":"https://doi.org/arxiv-2408.16594","url":null,"abstract":"We focus on Bayesian inverse problems with Gaussian likelihood, linear\u0000forward model, and priors that can be formulated as a Gaussian mixture. Such a\u0000mixture is expressed as an integral of Gaussian density functions weighted by a\u0000mixing density over the mixing variables. Within this framework, the\u0000corresponding posterior distribution also takes the form of a Gaussian mixture,\u0000and we derive the closed-form expression for its posterior mixing density. To\u0000sample from the posterior Gaussian mixture, we propose a two-step sampling\u0000method. First, we sample the mixture variables from the posterior mixing\u0000density, and then we sample the variables of interest from Gaussian densities\u0000conditioned on the sampled mixing variables. However, the posterior mixing\u0000density is relatively difficult to sample from, especially in high dimensions.\u0000Therefore, we propose to replace the posterior mixing density by a\u0000dimension-reduced approximation, and we provide a bound in the Hellinger\u0000distance for the resulting approximate posterior. We apply the proposed\u0000approach to a posterior with Laplace prior, where we introduce two\u0000dimension-reduced approximations for the posterior mixing density. Our\u0000numerical experiments indicate that samples generated via the proposed\u0000approximations have very low correlation and are close to the exact posterior.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sequential Monte Carlo methods are a powerful framework for approximating the posterior distribution of a state variable in a sequential manner. They provide an attractive way of analyzing dynamic systems in real time, taking into account the limitations of traditional approaches such as Markov Chain Monte Carlo methods, which are not well suited to data that arrive incrementally. This paper reviews and explores the application of Sequential Monte Carlo in dynamic disease modeling, highlighting its capacity for online inference and real-time adaptation to evolving disease dynamics. The integration of kernel density approximation techniques within the stochastic Susceptible-Exposed-Infectious-Recovered (SEIR) compartment model is examined, demonstrating the algorithm's effectiveness in monitoring time-varying parameters such as the effective reproduction number. Case studies, including simulations with synthetic data and analysis of real-world COVID-19 data from Ireland, demonstrate the practical applicability of this approach for informing timely public health interventions.
{"title":"A review of sequential Monte Carlo methods for real-time disease modeling","authors":"Dhorasso Temfack, Jason Wyse","doi":"arxiv-2408.15739","DOIUrl":"https://doi.org/arxiv-2408.15739","url":null,"abstract":"Sequential Monte Carlo methods are a powerful framework for approximating the\u0000posterior distribution of a state variable in a sequential manner. They provide\u0000an attractive way of analyzing dynamic systems in real-time, taking into\u0000account the limitations of traditional approaches such as Markov Chain Monte\u0000Carlo methods, which are not well suited to data that arrives incrementally.\u0000This paper reviews and explores the application of Sequential Monte Carlo in\u0000dynamic disease modeling, highlighting its capacity for online inference and\u0000real-time adaptation to evolving disease dynamics. The integration of kernel\u0000density approximation techniques within the stochastic\u0000Susceptible-Exposed-Infectious-Recovered (SEIR) compartment model is examined,\u0000demonstrating the algorithm's effectiveness in monitoring time-varying\u0000parameters such as the effective reproduction number. Case studies, including\u0000simulations with synthetic data and analysis of real-world COVID-19 data from\u0000Ireland, demonstrate the practical applicability of this approach for informing\u0000timely public health interventions.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}