Niccolò Anceschi, Augusto Fasano, Beatrice Franzolini, Giovanni Rebaudo
Generalized linear models (GLMs) arguably represent the standard approach to statistical regression beyond the Gaussian likelihood scenario. When Bayesian formulations are employed, the general absence of a tractable posterior distribution has motivated the development of deterministic approximations, which are typically more scalable than sampling techniques. Among them, expectation propagation (EP) has shown remarkable accuracy, usually higher than that of many variational Bayes solutions. However, the higher computational cost of EP has raised concerns about its practical feasibility, especially in high-dimensional settings. We address these concerns by deriving a novel efficient formulation of EP for GLMs whose cost scales linearly in the number of covariates p. This reduces the state-of-the-art O(p^2 n) per-iteration computational cost of the EP routine for GLMs to O(p n min{p, n}), with n being the sample size. We also show that, for binary models and log-linear GLMs, approximate predictive means can be obtained at no additional cost. To preserve efficient moment matching for count data, we propose employing a combination of log-normal Laplace transform approximations, avoiding numerical integration. These novel results open the possibility of employing EP in settings that were previously believed to be practically unfeasible. Improvements over state-of-the-art approaches are illustrated on both simulated and real data. The efficient EP implementation is available at https://github.com/niccoloanceschi/EPglm.
Scalable expectation propagation for generalized linear models. arXiv:2407.02128 (arXiv - STAT - Computation, 2024-07-02).
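The min{p, n} factor comes from a standard low-rank identity: when n < p, products with a p x p Gaussian covariance of the form (I_p + X^T D X)^{-1} only require an n x n solve, via the Woodbury identity. A minimal numpy sketch (a generic Gaussian-posterior analogue with hypothetical site precisions d, not the paper's EP routine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 500                      # n << p: the tall-p regime
X = rng.standard_normal((n, p))
d = rng.uniform(0.5, 2.0, size=n)   # positive "site" precisions
v = rng.standard_normal(p)

# Direct: invert the p x p matrix (I_p + X^T D X), O(p^3)
Sigma = np.linalg.inv(np.eye(p) + X.T @ (d[:, None] * X))
direct = Sigma @ v

# Woodbury: (I + X^T D X)^{-1} = I - X^T (D^{-1} + X X^T)^{-1} X,
# so only an n x n solve is needed, O(p n min(p, n)) overall
M = np.diag(1.0 / d) + X @ X.T      # n x n
woodbury = v - X.T @ np.linalg.solve(M, X @ v)

assert np.allclose(direct, woodbury)
```

The same identity applied per EP site update is the kind of algebra that turns the O(p^2 n) bookkeeping into the claimed linear-in-p cost.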
We propose a general-purpose approximation to the Ferguson-Klass algorithm for generating samples from Lévy processes without Gaussian components. We show that the proposed method is more than 1000 times faster than the standard Ferguson-Klass algorithm without a significant loss of precision. This method can open an avenue for computationally efficient and scalable Bayesian nonparametric models which go beyond conjugacy assumptions, as demonstrated in the examples section.
A General Purpose Approximation to the Ferguson-Klass Algorithm for Sampling from Lévy Processes Without Gaussian Components. Dawid Bernaciak, Jim E. Griffin. arXiv:2407.01483 (2024-07-01).
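For context, the exact Ferguson-Klass construction expresses the jumps of a pure-jump Lévy process as the inverted tail Lévy measure evaluated at the arrival times of a unit-rate Poisson process; the inversion is closed-form only in special cases, which is what makes an approximation attractive. A sketch for an α-stable subordinator with tail measure taken as N(x) = x^{-α} (normalizing constant dropped for simplicity):

```python
import numpy as np

def ferguson_klass_stable(alpha, n_jumps, rng):
    """Jump sizes of an alpha-stable subordinator via Ferguson-Klass:
    invert the tail Levy measure N(x) = x**(-alpha) at the arrival
    times of a unit-rate Poisson process."""
    arrivals = np.cumsum(rng.exponential(1.0, size=n_jumps))
    return arrivals ** (-1.0 / alpha)   # N^{-1}(u) = u**(-1/alpha)

rng = np.random.default_rng(1)
jumps = ferguson_klass_stable(0.5, 1000, rng)
# Jumps come out positive and in decreasing order by construction
assert np.all(jumps > 0) and np.all(np.diff(jumps) <= 0)
```

When N^{-1} has no closed form (e.g. gamma or generalized gamma processes), each jump requires a numerical root-find, which is the cost the proposed approximation removes.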
For linear systems $Ax=b$ we develop iterative algorithms based on a sketch-and-project approach. By using judicious choices for the sketch, such as the history of residuals, we develop weighting strategies that enable short recursive formulas. The proposed algorithms have a low memory footprint and iteration complexity compared to regular sketch-and-project methods. In a set of numerical experiments the new methods compare well to GMRES, SYMMLQ and state-of-the-art randomized solvers.
Structured Sketching for Linear Systems. Johannes J Brust, Michael A Saunders. arXiv:2407.00746 (2024-06-30).
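A baseline sketch-and-project iteration, with fresh Gaussian sketches rather than the paper's structured residual-history choices, looks like the following (parameter names are illustrative):

```python
import numpy as np

def sketch_and_project(A, b, n_iters=600, sketch_size=10, seed=0):
    """Generic sketch-and-project for Ax = b: at each step, project the
    iterate onto the solution set of the sketched system S^T A x = S^T b.
    Fresh Gaussian sketches here, not the paper's structured sketches."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        S = rng.standard_normal((A.shape[0], sketch_size))
        SA = S.T @ A                        # sketched system matrix
        r = S.T @ (A @ x - b)               # sketched residual
        x -= SA.T @ np.linalg.lstsq(SA @ SA.T, r, rcond=None)[0]
    return x

rng = np.random.default_rng(2)
n = 30
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned
x_true = rng.standard_normal(n)
x = sketch_and_project(A, A @ x_true)
assert np.linalg.norm(x - x_true) < 1e-6
```

Each projection is orthogonal, so the error norm is monotonically non-increasing; the memory and recursion-length savings in the paper come from the structured choice of S, which this generic version does not capture.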
Motivated by applications in emergency response and experimental design, we consider smooth stochastic optimization problems over probability measures supported on compact subsets of Euclidean space. With the influence function as the variational object, we construct a deterministic Frank-Wolfe (dFW) recursion for probability spaces, made possible by a lemma that identifies a closed-form solution to the infinite-dimensional Frank-Wolfe sub-problem. Each iterate in dFW is expressed as a convex combination of the incumbent iterate and a Dirac measure concentrating on the minimum of the influence function at the incumbent iterate. To address common application contexts that have access only to Monte Carlo observations of the objective and influence function, we construct a stochastic Frank-Wolfe (sFW) variation that generates a random sequence of probability measures constructed using minima of increasingly accurate estimates of the influence function. We demonstrate that sFW's optimality gap sequence exhibits $O(k^{-1})$ iteration complexity almost surely and in expectation for smooth convex objectives, and $O(k^{-1/2})$ (in Frank-Wolfe gap) for smooth non-convex objectives. Furthermore, we show that an easy-to-implement fixed-step, fixed-sample version of sFW exhibits exponential convergence to $\varepsilon$-optimality. We end with a central limit theorem on the observed objective values at the sequence of generated random measures. To aid intuition, we include several illustrative examples with exact influence function calculations.
Deterministic and Stochastic Frank-Wolfe Recursion on Probability Spaces. Di Yu, Shane G. Henderson, Raghu Pasupathy. arXiv:2407.00307 (2024-06-29).
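The dFW update can be illustrated with finitely supported iterates on a toy objective; here F(mu) = (E_mu[x] - 1)^2 on [-2, 2], whose influence function h(x) = 2(mean - 1)(x - mean) is always minimized at an endpoint (an illustrative example of the convex-combination-with-a-Dirac update, not one of the paper's applications):

```python
import numpy as np

# Frank-Wolfe over probability measures on [-2, 2], minimizing
# F(mu) = (mean(mu) - 1)^2.  Each iterate mixes the incumbent measure
# with a Dirac at the influence-function minimizer, with the classical
# step size gamma_k = 2 / (k + 2).
atoms, weights = [0.0], [1.0]          # start from delta_0
for k in range(2000):
    m = np.dot(weights, atoms)
    x_star = 2.0 if m < 1.0 else -2.0  # argmin of h(x) over [-2, 2]
    gamma = 2.0 / (k + 2.0)
    weights = [(1 - gamma) * w for w in weights] + [gamma]
    atoms = atoms + [x_star]

mean = np.dot(weights, atoms)
assert abs(sum(weights) - 1.0) < 1e-12   # still a probability measure
assert (mean - 1.0) ** 2 < 1e-3          # objective driven near zero
```

The iterate stays a valid probability measure at every step, and its support grows by at most one atom per iteration, mirroring the structure of the dFW recursion.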
The purpose of this technical note is to summarize the relationship between the marginal variance and correlation length of a Gaussian random field with Matérn covariance and the coefficients of the corresponding partial-differential-equation (PDE)-based precision operator.
A note on the relationship between PDE-based precision operators and Matérn covariances. Umberto Villa, Thomas O'Leary-Roseberry. arXiv:2407.00471 (2024-06-29).
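The relationship being summarized follows the standard Whittle-Matérn identities: for the SPDE (kappa^2 - Delta)^{alpha/2}(tau u) = W with alpha = nu + d/2, the practical correlation length is rho = sqrt(8 nu)/kappa and the marginal variance is sigma^2 = Gamma(nu) / (Gamma(nu + d/2) (4 pi)^{d/2} kappa^{2 nu} tau^2). A small round-trip check of these formulas (a sketch of the standard identities, which may differ from the note's exact parameterization):

```python
import math

def spde_coefficients(sigma, rho, nu, d=2):
    """Map Matern marginal std-dev sigma, practical correlation length
    rho (distance at which correlation drops to ~0.1), and smoothness nu
    to SPDE coefficients (kappa, tau) of (kappa^2 - Delta)^{alpha/2}(tau u) = W,
    alpha = nu + d/2."""
    kappa = math.sqrt(8.0 * nu) / rho
    tau2 = math.gamma(nu) / (math.gamma(nu + d / 2)
                             * (4.0 * math.pi) ** (d / 2)
                             * kappa ** (2.0 * nu) * sigma ** 2)
    return kappa, math.sqrt(tau2)

kappa, tau = spde_coefficients(sigma=1.0, rho=0.3, nu=1.0, d=2)
# Round-trip: recover sigma from (kappa, tau) with nu = 1, d = 2
sigma2 = math.gamma(1.0) / (math.gamma(2.0) * 4.0 * math.pi * kappa**2 * tau**2)
assert abs(math.sqrt(sigma2) - 1.0) < 1e-12
```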
When the target of inference is a real-valued function of probability parameters in the k-sample multinomial problem, variance estimation may be challenging. In small samples, methods like the nonparametric bootstrap or the delta method may perform poorly. We propose a novel general method in this setting for computing exact p-values and confidence intervals, meaning that type I error rates are correctly bounded and confidence intervals have at least nominal coverage at all sample sizes. Our method is applicable to any real-valued function of multinomial probabilities, accommodating an arbitrary number of samples with varying category counts. We describe the method and provide an implementation of it in R, with some computational optimization to ensure broad applicability. Simulations demonstrate our method's ability to maintain correct coverage rates in settings where the nonparametric bootstrap fails.
Exact confidence intervals for functions of parameters in the k-sample multinomial problem. Michael C Sachs, Erin E Gabriel, Michael P Fay. arXiv:2406.19141 (2024-06-27).
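The general idea behind exact p-values of this kind can be sketched in the simplest one-sample binomial case (a toy analogue, not the paper's k-sample multinomial machinery): a valid, conservative p-value is the supremum of the tail probability over the null parameter set, so the type I error rate is bounded for every parameter value.

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

def exact_pvalue(x_obs, n, theta0_grid):
    """Conservative exact p-value: supremum of the tail probability over
    a grid of null parameter values (toy 1-sample binomial version of
    the grid-supremum idea)."""
    return max(binom_sf(x_obs, n, t) for t in theta0_grid)

grid = [i / 1000 for i in range(0, 501)]   # H0: theta <= 0.5
p = exact_pvalue(x_obs=9, n=10, theta0_grid=grid)
assert 0 < p <= 1
# Supremum is attained at theta = 0.5: P(X >= 9 | n=10, 0.5) = 11/1024
assert abs(p - 11 / 1024) < 1e-12
```

The computational challenge the paper addresses is doing this supremum efficiently when the null set lives on a product of simplices rather than an interval.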
Mathieu Fourment, Matthew Macaulay, Christiaan J Swanepoel, Xiang Ji, Marc A Suchard, Frederick A Matsen IV
Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large datasets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation, or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed and approximation accuracy. Furthermore, we explore the use of the forward KL divergence as an optimizing criterion for variational inference, which can handle discontinuous and non-differentiable models. Our experiments show that inference using the forward KL divergence tends to be faster per iteration compared to the evidence lower bound (ELBO) criterion, although the ELBO-based inference may converge faster in some cases. Overall, torchtree provides a flexible and efficient framework for phylogenetic model development and inference using PyTorch.
Torchtree: flexible phylogenetic model development and inference using PyTorch. arXiv:2406.18044 (2024-06-26).
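A useful intuition for the forward KL criterion mentioned above: over a Gaussian variational family, minimizing KL(p || q) reduces to matching the mean and variance of the target p, since argmax_q E_p[log q(x)] is the Gaussian maximum-likelihood fit. A Monte Carlo check on a two-component mixture target (a generic illustration, not torchtree code):

```python
import numpy as np

# Target p: the mixture 0.3 * N(-2, 0.5^2) + 0.7 * N(1, 1).
rng = np.random.default_rng(3)
z = rng.random(200_000) < 0.3
x = np.where(z,
             rng.normal(-2.0, 0.5, 200_000),
             rng.normal(1.0, 1.0, 200_000))

# Forward-KL-optimal Gaussian = moment matching
mu_hat, var_hat = x.mean(), x.var()

# Analytic moments of the mixture for comparison
mu = 0.3 * -2.0 + 0.7 * 1.0                                   # = 0.1
var = (0.3 * (0.5**2 + (-2.0 - mu) ** 2)
       + 0.7 * (1.0**2 + (1.0 - mu) ** 2))                    # = 2.665
assert abs(mu_hat - mu) < 0.02 and abs(var_hat - var) < 0.05
```

This moment-matching behavior also explains why forward-KL inference tolerates discontinuous or non-differentiable models: the criterion only needs expectations under p, not gradients of the model density.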
We propose a linear-complexity method for sampling from truncated multivariate normal (TMVN) distributions with high fidelity by applying nearest-neighbor approximations to a product-of-conditionals decomposition of the TMVN density. To make the sequential sampling based on the decomposition feasible, we introduce a novel method that avoids the intractable high-dimensional TMVN distribution by sampling sequentially from $m$-dimensional TMVN distributions, where $m$ is a tuning parameter controlling the fidelity. This allows us to overcome the existing methods' crucial problem of rapidly decreasing acceptance rates for increasing dimension. Throughout our experiments with up to tens of thousands of dimensions, we can produce high-fidelity samples with $m$ in the dozens, achieving superior scalability compared to existing state-of-the-art methods. We study a tetrachloroethylene concentration dataset that has $3{,}971$ observed responses and $20{,}730$ undetected responses, together modeled as a partially censored Gaussian process, where our method enables posterior inference for the censored responses through sampling a $20{,}730$-dimensional TMVN distribution.
Scalable Sampling of Truncated Multivariate Normals Using Sequential Nearest-Neighbor Approximation. Jian Cao, Matthias Katzfuss. arXiv:2406.17307 (2024-06-25).
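The product-of-conditionals idea can be sketched in two dimensions, where each conditional is a univariate truncated normal sampled by inverse CDF (bisection on the normal CDF; the upper truncation is capped at 8 for simplicity). Note that this crude version truncates the first coordinate's marginal while ignoring the truncation of the second, which is exactly the fidelity question the paper's m-dimensional nearest-neighbor conditioning addresses:

```python
import math
import random

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def trunc_normal(mu, sd, lo, hi, u):
    """Inverse-CDF sample of N(mu, sd^2) truncated to [lo, hi], given
    u ~ Uniform(0, 1); the quantile is found by bisection."""
    p_lo, p_hi = phi((lo - mu) / sd), phi((hi - mu) / sd)
    target = p_lo + u * (p_hi - p_lo)
    a, b = lo, hi
    for _ in range(80):
        m = 0.5 * (a + b)
        if phi((m - mu) / sd) < target:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

# Bivariate N(0, [[1, r], [r, 1]]) restricted to the positive orthant
# (capped at 8), drawn via sequential conditionals p(x1) p(x2 | x1).
random.seed(4)
r = 0.7
samples = []
for _ in range(1000):
    x1 = trunc_normal(0.0, 1.0, 0.0, 8.0, random.random())
    # x2 | x1 ~ N(r * x1, 1 - r^2), truncated to [0, 8]
    x2 = trunc_normal(r * x1, math.sqrt(1 - r * r), 0.0, 8.0,
                      random.random())
    samples.append((x1, x2))
assert all(a >= 0 and b >= 0 for a, b in samples)
```

The sequential scheme never rejects a draw, which is how the approach sidesteps the collapsing acceptance rates of rejection-based TMVN samplers in high dimensions.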
Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spano
We show that genealogical trees arising from a broad class of non-neutral models of population evolution converge to the Kingman coalescent under a suitable rescaling of time. As well as non-neutral biological evolution, our results apply to genetic algorithms encompassing the prominent class of sequential Monte Carlo (SMC) methods. The time rescaling we need differs slightly from that used in classical results for convergence to the Kingman coalescent, which has implications for the performance of different resampling schemes in SMC algorithms. In addition, our work substantially simplifies earlier proofs of convergence to the Kingman coalescent, and corrects an error common to several earlier results.
Genealogical processes of non-neutral population models under rapid mutation. arXiv:2406.16465 (2024-06-24).
Parameter inference for linear and non-Gaussian state space models is challenging because the likelihood function contains an intractable integral over the latent state variables. Exact inference using Markov chain Monte Carlo is computationally expensive, particularly for long time series data. Variational Bayes methods are useful when exact inference is infeasible. These methods approximate the posterior density of the parameters by a simple and tractable distribution found through optimisation. In this paper, we propose a novel sequential variational Bayes approach that makes use of the Whittle likelihood for computationally efficient parameter inference in this class of state space models. Our algorithm, which we call Recursive Variational Gaussian Approximation with the Whittle Likelihood (R-VGA-Whittle), updates the variational parameters by processing data in the frequency domain. At each iteration, R-VGA-Whittle requires the gradient and Hessian of the Whittle log-likelihood, which are available in closed form for a wide class of models. Through several examples using a linear Gaussian state space model and a univariate/bivariate non-Gaussian stochastic volatility model, we show that R-VGA-Whittle provides good approximations to posterior distributions of the parameters and is very computationally efficient when compared to asymptotically exact methods such as Hamiltonian Monte Carlo.
Recursive variational Gaussian approximation with the Whittle likelihood for linear non-Gaussian state space models. Bao Anh Vu, David Gunawan, Andrew Zammit-Mangion. arXiv:2406.15998 (2024-06-23).
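The Whittle likelihood replaces the time-domain Gaussian likelihood with a frequency-domain approximation built from the periodogram: l_W = -sum_j [log f(w_j) + I(w_j) / f(w_j)] over the Fourier frequencies, where f is the model's spectral density. A minimal sketch for an AR(1) spectral density (illustrative of the Whittle objective only, not the R-VGA-Whittle algorithm):

```python
import numpy as np

def whittle_loglik(y, spec_density):
    """Whittle log-likelihood: -sum over Fourier frequencies of
    log f(w_j) + I(w_j) / f(w_j), with I the periodogram."""
    n = len(y)
    freqs = 2.0 * np.pi * np.arange(1, (n - 1) // 2 + 1) / n
    dft = np.fft.fft(y)[1:(n - 1) // 2 + 1]
    I = np.abs(dft) ** 2 / (2.0 * np.pi * n)    # periodogram
    f = spec_density(freqs)
    return -np.sum(np.log(f) + I / f)

def ar1_spec(phi, sigma):
    """Spectral density of an AR(1) with coefficient phi and
    innovation std-dev sigma."""
    return lambda w: sigma**2 / (2.0 * np.pi
                                 * (1 - 2 * phi * np.cos(w) + phi**2))

# Simulate an AR(1) with phi = 0.8 and check that the Whittle
# likelihood prefers the true coefficient among a few candidates.
rng = np.random.default_rng(5)
n, phi_true = 2000, 0.8
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.standard_normal()

lls = {phi: whittle_loglik(y, ar1_spec(phi, 1.0)) for phi in (0.2, 0.5, 0.8)}
assert max(lls, key=lls.get) == 0.8
```

Because the objective is a sum over frequencies, its gradient and Hessian in the model parameters are available in closed form whenever the spectral density is, which is what makes the frequency-domain recursion computationally attractive.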