Personalized and uncertainty-aware coronary hemodynamics simulations: From Bayesian estimation to improved multi-fidelity uncertainty quantification
Karthik Menon, Andrea Zanoni, Owais Khan, Gianluca Geraci, Koen Nieman, Daniele E. Schiavazzi, Alison L. Marsden
arXiv:2409.02247
Simulations of coronary hemodynamics have improved non-invasive clinical risk stratification and treatment outcomes for coronary artery disease, compared to relying on anatomical imaging alone. However, simulations typically use empirical approaches to distribute total coronary flow amongst the arteries in the coronary tree. This ignores patient variability, the presence of disease, and other clinical factors. Further, uncertainty in the clinical data often remains unaccounted for in the modeling pipeline. We present an end-to-end uncertainty-aware pipeline to (1) personalize coronary flow simulations by incorporating branch-specific coronary flows as well as cardiac function; and (2) predict clinical and biomechanical quantities of interest with improved precision, while accounting for uncertainty in the clinical data. We assimilate patient-specific measurements of myocardial blood flow from CT myocardial perfusion imaging to estimate branch-specific coronary flows. We use adaptive Markov Chain Monte Carlo sampling to estimate the joint posterior distributions of model parameters with simulated noise in the clinical data. Additionally, we determine the posterior predictive distribution for relevant quantities of interest using a new approach combining multi-fidelity Monte Carlo estimation with non-linear, data-driven dimensionality reduction. Our framework recapitulates clinically measured cardiac function as well as branch-specific coronary flows under measurement uncertainty. We substantially shrink the confidence intervals for estimated quantities of interest compared to single-fidelity and state-of-the-art multi-fidelity Monte Carlo methods. This is especially true for quantities that showed limited correlation between the low- and high-fidelity model predictions. Moreover, the proposed estimators are significantly cheaper to compute for a specified confidence level or variance.
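The precision gains reported here rest on the control-variate structure of multi-fidelity Monte Carlo estimation. As a minimal sketch of that mechanism only, the snippet below pairs a cheap low-fidelity model with an expensive high-fidelity one; `f_hi` and `f_lo` are toy stand-ins (the paper's hemodynamic solvers and its nonlinear dimensionality-reduction step are not reproduced).

```python
import numpy as np

rng = np.random.default_rng(0)

def f_hi(x):
    # Stand-in "high-fidelity" model: an expensive nonlinear QoI.
    return np.sin(x) + 0.1 * x**2

def f_lo(x):
    # Stand-in "low-fidelity" model: cheap, correlated approximation.
    return x + 0.05 * x**2

n_hi, n_lo = 100, 10_000          # HF budget << LF budget
x_lo = rng.normal(size=n_lo)      # samples of the uncertain input
x_hi = x_lo[:n_hi]                # HF evaluations reuse the first inputs

y_hi = f_hi(x_hi)
y_lo_paired = f_lo(x_hi)
y_lo_all = f_lo(x_lo)

# Control-variate weight estimated from the paired sample.
rho = np.corrcoef(y_hi, y_lo_paired)[0, 1]
alpha = rho * y_hi.std() / y_lo_paired.std()

q_mc = y_hi.mean()                # single-fidelity estimate
q_mf = q_mc + alpha * (y_lo_all.mean() - y_lo_paired.mean())
print(f"MC: {q_mc:.4f}  MFMC: {q_mf:.4f}  (corr = {rho:.2f})")
```

The variance reduction scales with the squared correlation between fidelities, which is why the paper's data-driven dimensionality reduction, by raising that correlation, shrinks the confidence intervals further.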
{"title":"Personalized and uncertainty-aware coronary hemodynamics simulations: From Bayesian estimation to improved multi-fidelity uncertainty quantification","authors":"Karthik Menon, Andrea Zanoni, Owais Khan, Gianluca Geraci, Koen Nieman, Daniele E. Schiavazzi, Alison L. Marsden","doi":"arxiv-2409.02247","DOIUrl":"https://doi.org/arxiv-2409.02247","url":null,"abstract":"Simulations of coronary hemodynamics have improved non-invasive clinical risk\u0000stratification and treatment outcomes for coronary artery disease, compared to\u0000relying on anatomical imaging alone. However, simulations typically use\u0000empirical approaches to distribute total coronary flow amongst the arteries in\u0000the coronary tree. This ignores patient variability, the presence of disease,\u0000and other clinical factors. Further, uncertainty in the clinical data often\u0000remains unaccounted for in the modeling pipeline. We present an end-to-end\u0000uncertainty-aware pipeline to (1) personalize coronary flow simulations by\u0000incorporating branch-specific coronary flows as well as cardiac function; and\u0000(2) predict clinical and biomechanical quantities of interest with improved\u0000precision, while accounting for uncertainty in the clinical data. We assimilate\u0000patient-specific measurements of myocardial blood flow from CT myocardial\u0000perfusion imaging to estimate branch-specific coronary flows. We use adaptive\u0000Markov Chain Monte Carlo sampling to estimate the joint posterior distributions\u0000of model parameters with simulated noise in the clinical data. Additionally, we\u0000determine the posterior predictive distribution for relevant quantities of\u0000interest using a new approach combining multi-fidelity Monte Carlo estimation\u0000with non-linear, data-driven dimensionality reduction. Our framework\u0000recapitulates clinically measured cardiac function as well as branch-specific\u0000coronary flows under measurement uncertainty. We substantially shrink the\u0000confidence intervals for estimated quantities of interest compared to\u0000single-fidelity and state-of-the-art multi-fidelity Monte Carlo methods. This\u0000is especially true for quantities that showed limited correlation between the\u0000low- and high-fidelity model predictions. Moreover, the proposed estimators are\u0000significantly cheaper to compute for a specified confidence level or variance.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convergence of Noise-Free Sampling Algorithms with Regularized Wasserstein Proximals
Fuqun Han, Stanley Osher, Wuchen Li
arXiv:2409.01567
In this work, we investigate the convergence properties of the backward regularized Wasserstein proximal (BRWP) method for sampling from a target distribution. The BRWP approach can be viewed as a semi-implicit time discretization of a probability flow ODE driven by the score function of a density satisfying the Fokker-Planck equation of overdamped Langevin dynamics. Specifically, the evolution of the score function is computed using a kernel formula derived from the regularized Wasserstein proximal operator. By applying the Laplace method to obtain the asymptotic expansion of this kernel formula, we establish guaranteed convergence of the BRWP method, in Kullback-Leibler divergence, towards a strongly log-concave target distribution. Our analysis also identifies the optimal and maximum step sizes for convergence. Furthermore, we demonstrate that the deterministic, semi-implicit BRWP scheme outperforms many classical Langevin Monte Carlo methods, such as the Unadjusted Langevin Algorithm (ULA), by offering faster convergence and reduced bias. Numerical experiments further validate the convergence analysis of the BRWP method.
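To illustrate the contrast with ULA, the toy sketch below runs both a stochastic Langevin update and a deterministic (noise-free) probability-flow update toward a 1-D Gaussian target. Note the assumption: the score of the evolving density is computed under a Gaussian ansatz for the particle ensemble, a simplification standing in for BRWP's kernel formula from the regularized Wasserstein proximal operator.

```python
import numpy as np

rng = np.random.default_rng(1)
eps, n_steps, n = 0.05, 200, 2000
target_score = lambda x: -x              # score -grad V(x) of N(0, 1)

# Unadjusted Langevin Algorithm: stochastic, biased at finite step size.
x_ula = rng.normal(loc=5.0, size=n)
for _ in range(n_steps):
    x_ula += eps * target_score(x_ula) + np.sqrt(2 * eps) * rng.normal(size=n)

# Noise-free probability-flow update dx/dt = -grad V(x) - grad log rho_t(x),
# with grad log rho_t estimated under a Gaussian ansatz for the ensemble.
x_flow = rng.normal(loc=5.0, size=n)
for _ in range(n_steps):
    m, v = x_flow.mean(), x_flow.var()
    score_rho = -(x_flow - m) / v        # Gaussian-ansatz score of rho_t
    x_flow += eps * (target_score(x_flow) - score_rho)

print("ULA  mean/var:", x_ula.mean().round(3), x_ula.var().round(3))
print("flow mean/var:", x_flow.mean().round(3), x_flow.var().round(3))
```

The deterministic ensemble reaches the target moments without injected noise, which is the qualitative behaviour the convergence analysis quantifies for the actual kernel-based scheme.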
{"title":"Convergence of Noise-Free Sampling Algorithms with Regularized Wasserstein Proximals","authors":"Fuqun Han, Stanley Osher, Wuchen Li","doi":"arxiv-2409.01567","DOIUrl":"https://doi.org/arxiv-2409.01567","url":null,"abstract":"In this work, we investigate the convergence properties of the backward\u0000regularized Wasserstein proximal (BRWP) method for sampling a target\u0000distribution. The BRWP approach can be shown as a semi-implicit time\u0000discretization for a probability flow ODE with the score function whose density\u0000satisfies the Fokker-Planck equation of the overdamped Langevin dynamics.\u0000Specifically, the evolution of the score function is computed using a kernel\u0000formula derived from the regularized Wasserstein proximal operator. By applying\u0000the Laplace method to obtain the asymptotic expansion of this kernel formula,\u0000we establish guaranteed convergence in terms of the Kullback-Leibler divergence\u0000for the BRWP method towards a strongly log-concave target distribution. Our\u0000analysis also identifies the optimal and maximum step sizes for convergence.\u0000Furthermore, we demonstrate that the deterministic and semi-implicit BRWP\u0000scheme outperforms many classical Langevin Monte Carlo methods, such as the\u0000Unadjusted Langevin Algorithm (ULA), by offering faster convergence and reduced\u0000bias. Numerical experiments further validate the convergence analysis of the\u0000BRWP method.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multivariate Inference of Network Moments by Subsampling
Mingyu Qi, Tianxi Li, Wen Zhou
arXiv:2409.01599
In this paper, we study the characterization of a network population by analyzing a single observed network, focusing on the counts of multiple network motifs or their corresponding multivariate network moments. We introduce an algorithm based on node subsampling to approximate the nontrivial joint distribution of the network moments, and prove its asymptotic accuracy. By examining the joint distribution of these moments, our approach captures complex dependencies among network motifs, making a significant advancement over earlier methods that rely on individual motifs marginally. This enables more accurate and robust network inference. Through real-world applications, such as comparing coexpression networks of distinct gene sets and analyzing collaboration patterns within the statistical community, we demonstrate that the multivariate inference of network moments provides deeper insights than marginal approaches, thereby enhancing our understanding of network mechanisms.
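A bare-bones version of node subsampling for a joint moment distribution might look as follows; the motif pair (edge density, triangle density) and the Erdős–Rényi stand-in for the observed network are illustrative choices, not the paper's estimator or its centering and scaling.

```python
import numpy as np

rng = np.random.default_rng(2)

def motif_moments(A):
    """Edge density and triangle density of adjacency matrix A."""
    n = A.shape[0]
    edges = A.sum() / 2
    triangles = np.trace(A @ A @ A) / 6
    return edges / (n * (n - 1) / 2), triangles / (n * (n - 1) * (n - 2) / 6)

# One observed network; an Erdős–Rényi graph stands in for the data.
n, p = 300, 0.05
A = rng.random((n, n)) < p
A = np.triu(A, 1)
A = (A + A.T).astype(float)

# Node subsampling: approximate the joint distribution of the two moments.
b, n_rep = 100, 500               # subsample size and number of replicates
samples = np.empty((n_rep, 2))
for r in range(n_rep):
    idx = rng.choice(n, size=b, replace=False)
    samples[r] = motif_moments(A[np.ix_(idx, idx)])

print("subsampled mean:", samples.mean(axis=0))
print("subsampled cov:\n", np.cov(samples.T))  # captures motif dependence
```

The off-diagonal entries of the estimated covariance are exactly what marginal, one-motif-at-a-time methods discard.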
{"title":"Multivariate Inference of Network Moments by Subsampling","authors":"Mingyu Qi, Tianxi Li, Wen Zhou","doi":"arxiv-2409.01599","DOIUrl":"https://doi.org/arxiv-2409.01599","url":null,"abstract":"In this paper, we study the characterization of a network population by\u0000analyzing a single observed network, focusing on the counts of multiple network\u0000motifs or their corresponding multivariate network moments. We introduce an\u0000algorithm based on node subsampling to approximate the nontrivial joint\u0000distribution of the network moments, and prove its asymptotic accuracy. By\u0000examining the joint distribution of these moments, our approach captures\u0000complex dependencies among network motifs, making a significant advancement\u0000over earlier methods that rely on individual motifs marginally. This enables\u0000more accurate and robust network inference. Through real-world applications,\u0000such as comparing coexpression networks of distinct gene sets and analyzing\u0000collaboration patterns within the statistical community, we demonstrate that\u0000the multivariate inference of network moments provides deeper insights than\u0000marginal approaches, thereby enhancing our understanding of network mechanisms.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simulation-calibration testing for inference in Lasso regressions
Matthieu Pluntz, Cyril Dalmasso, Pascale Tubert-Bitter, Ismail Ahmed
arXiv:2409.02269
We propose a test of the significance of a variable appearing on the Lasso path and use it in a procedure for selecting one of the models of the Lasso path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a set A of already selected variables and states that it contains all the active variables. We focus on the regularization parameter value from which a first variable outside A is selected. As the test statistic, we use this quantity's conditional p-value, which we define conditional on the non-penalized estimated coefficients of the model restricted to A. We estimate this by simulating outcome vectors and then calibrating them on the observed outcome's estimated coefficients. We adapt the calibration heuristically to the case of generalized linear models in which it turns into an iterative stochastic procedure. We prove that the test controls the risk of selecting a false positive in linear models, both under the null hypothesis and, under a correlation condition, when A does not contain all active variables. We assess the performance of our procedure through extensive simulation studies. We also illustrate it in the detection of exposures associated with drug-induced liver injuries in the French pharmacovigilance database.
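A heavily simplified sketch of the simulation-calibration idea in the linear-model case is given below. The conditioning step is reduced here to plugging in the OLS fit on A with a fixed residual scale, and the iterative calibration for generalized linear models is omitted entirely.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, lasso_path

rng = np.random.default_rng(3)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:2] = 1.0                               # two truly active variables
y = X @ beta + rng.normal(size=n)

A = [0, 1]                                   # already-selected set A

def first_entry_lambda(X, y, A):
    """Largest lambda at which a variable outside A enters the Lasso path."""
    alphas, coefs, _ = lasso_path(X, y)      # alphas are decreasing
    outside = [j for j in range(X.shape[1]) if j not in A]
    for k, a in enumerate(alphas):
        if np.any(coefs[outside, k] != 0):
            return a
    return 0.0

t_obs = first_entry_lambda(X, y, A)

# Simulate null outcomes from the model restricted to A, then compare the
# observed entry point with its simulated null distribution.
ols = LinearRegression().fit(X[:, A], y)
mu = ols.predict(X[:, A])
sigma = np.std(y - mu)
t_null = [first_entry_lambda(X, mu + sigma * rng.normal(size=n), A)
          for _ in range(200)]
p_value = np.mean(np.array(t_null) >= t_obs)
print(f"entry lambda = {t_obs:.4f}, simulation p-value = {p_value:.3f}")
```

A small p-value indicates that the next variable enters the path earlier (at larger lambda) than expected if A already contained all active variables.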
{"title":"Simulation-calibration testing for inference in Lasso regressions","authors":"Matthieu Pluntz, Cyril Dalmasso, Pascale Tubert-Bitter, Ismail Ahmed","doi":"arxiv-2409.02269","DOIUrl":"https://doi.org/arxiv-2409.02269","url":null,"abstract":"We propose a test of the significance of a variable appearing on the Lasso\u0000path and use it in a procedure for selecting one of the models of the Lasso\u0000path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a\u0000set A of already selected variables and states that it contains all the active\u0000variables. We focus on the regularization parameter value from which a first\u0000variable outside A is selected. As the test statistic, we use this quantity's\u0000conditional p-value, which we define conditional on the non-penalized estimated\u0000coefficients of the model restricted to A. We estimate this by simulating\u0000outcome vectors and then calibrating them on the observed outcome's estimated\u0000coefficients. We adapt the calibration heuristically to the case of generalized\u0000linear models in which it turns into an iterative stochastic procedure. We\u0000prove that the test controls the risk of selecting a false positive in linear\u0000models, both under the null hypothesis and, under a correlation condition, when\u0000A does not contain all active variables. We assess the performance of our\u0000procedure through extensive simulation studies. We also illustrate it in the\u0000detection of exposures associated with drug-induced liver injuries in the\u0000French pharmacovigilance database.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"61 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Pinsker bound of inner product kernel regression in large dimensions
Weihao Lu, Jialin Ding, Haobo Zhang, Qian Lin
arXiv:2409.00915
Building on recent studies of large-dimensional kernel regression, particularly those involving inner product kernels on the sphere $\mathbb{S}^{d}$, we investigate the Pinsker bound for inner product kernel regression in such settings. Specifically, we address the scenario where the sample size $n$ is given by $\alpha d^{\gamma}(1+o_{d}(1))$ for some $\alpha, \gamma>0$. We determine the exact minimax risk for kernel regression in this setting, identifying not only the minimax rate but also the exact constant, known as the Pinsker constant, associated with the excess risk.
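For readers unfamiliar with the setting, the sketch below sets up inner-product kernel ridge regression on the unit sphere with $n = \alpha d^{\gamma}$ samples. The kernel $h(t) = e^{t}$ and the toy target function are arbitrary choices; the snippet illustrates the regression problem itself, not the minimax analysis.

```python
import numpy as np

rng = np.random.default_rng(4)
d, gamma, alpha_c = 50, 1.0, 2.0
n = int(alpha_c * d**gamma)                  # n = alpha * d^gamma regime

# Uniform samples on the unit sphere via normalized Gaussians.
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
f_star = lambda Z: Z[:, 0] + Z[:, 1] * Z[:, 2]   # toy target function
y = f_star(X) + 0.1 * rng.normal(size=n)

# Inner-product kernel k(x, z) = h(<x, z>), here h(t) = exp(t).
K = np.exp(X @ X.T)
lam = 1e-2
c = np.linalg.solve(K + lam * np.eye(n), y)      # kernel ridge regression

X_test = rng.normal(size=(1000, d))
X_test /= np.linalg.norm(X_test, axis=1, keepdims=True)
pred = np.exp(X_test @ X.T) @ c
print("test excess risk ~", np.mean((pred - f_star(X_test))**2).round(4))
```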
{"title":"On the Pinsker bound of inner product kernel regression in large dimensions","authors":"Weihao Lu, Jialin Ding, Haobo Zhang, Qian Lin","doi":"arxiv-2409.00915","DOIUrl":"https://doi.org/arxiv-2409.00915","url":null,"abstract":"Building on recent studies of large-dimensional kernel regression,\u0000particularly those involving inner product kernels on the sphere\u0000$mathbb{S}^{d}$, we investigate the Pinsker bound for inner product kernel\u0000regression in such settings. Specifically, we address the scenario where the\u0000sample size $n$ is given by $alpha d^{gamma}(1+o_{d}(1))$ for some $alpha,\u0000gamma>0$. We have determined the exact minimax risk for kernel regression in\u0000this setting, not only identifying the minimax rate but also the exact\u0000constant, known as the Pinsker constant, associated with the excess risk.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On tail inference in iid settings with nonnegative extreme value index
Taku Moriyama
arXiv:2409.00906
A fundamental problem in extreme value inference is how extreme the target value must be for extreme value theory to apply. In iid settings, this study compares, both theoretically and numerically, tail estimators based on extreme value theory, on nonparametric smoothing, or on both, considering tail probability estimation and mean excess function estimation. We assume that the extreme value index of the underlying distribution is nonnegative; specifically, the Hall class or the Weibull class of distributions is assumed in order to obtain the convergence rates of the estimators. We investigate nonparametric kernel-type estimators, estimators obtained by fitting the generalized Pareto distribution, and plug-in estimators of the Hall distribution proposed by Hall and Weissman (1997). Simulation studies compare the mean squared errors of the estimators in finite-sample settings.
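As a rough illustration of the peaks-over-threshold fit that such tail estimators are compared against, the snippet below estimates an extreme tail probability for a Pareto sample; the threshold and sample size are arbitrary, and the kernel-type and Hall-class plug-in estimators studied in the paper are not reproduced.

```python
import numpy as np
from scipy.stats import genpareto, pareto

rng = np.random.default_rng(5)
x = pareto.rvs(b=2.0, size=5000, random_state=rng)  # heavy-tailed iid sample
t = 10.0                                            # extreme target level

# Peaks-over-threshold: fit a generalized Pareto to exceedances over u.
u = np.quantile(x, 0.95)
exc = x[x > u] - u
xi, _, sc = genpareto.fit(exc, floc=0.0)
p_gpd = (x > u).mean() * genpareto.sf(t - u, xi, loc=0.0, scale=sc)

p_emp = (x > t).mean()                              # empirical tail estimate
p_true = pareto.sf(t, b=2.0)
print(f"true {p_true:.5f}  empirical {p_emp:.5f}  GPD/POT {p_gpd:.5f}")
```

The empirical estimate degenerates as $t$ moves beyond the sample range, which is precisely where the trade-off between smoothing-based and extreme-value-based estimators becomes the central question.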
{"title":"On tail inference in iid settings with nonnegative extreme value index","authors":"Taku Moriyama","doi":"arxiv-2409.00906","DOIUrl":"https://doi.org/arxiv-2409.00906","url":null,"abstract":"In extreme value inference it is a fundamental problem how the target value\u0000is required to be extreme by the extreme value theory. In iid settings this\u0000study both theoretically and numerically compares tail estimators, which are\u0000based on either or both of the extreme value theory and the nonparametric\u0000smoothing. This study considers tail probability estimation and mean excess\u0000function estimation. This study assumes that the extreme value index of the underlying\u0000distribution is nonnegative. Specifically, the Hall class or the Weibull class\u0000of distributions is supposed in order to obtain the convergence rates of the\u0000estimators. This study investigates the nonparametric kernel type estimators,\u0000the fitting estimators to the generalized Pareto distribution and the plug-in\u0000estimators of the Hall distribution, which was proposed by Hall and Weissman\u0000(1997). In simulation studies the mean squared errors of the estimators in some\u0000finite sample cases are compared.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"61 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Confidence regions for the multidimensional density in the uniform norm based on the recursive Wolverton-Wagner estimation
Maria Rosaria Formica, Eugeny Ostrovsky, Leonid Sirota
arXiv:2409.01451
We construct a confidence region with optimal exponentially decreasing tail for an unknown multidimensional density, in the Lebesgue-Riesz as well as the uniform norm, built from a sample of random vectors via the well-known recursive Wolverton-Wagner density estimator.
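The Wolverton-Wagner estimator itself is easy to state: $f_n(x) = n^{-1}\sum_{i=1}^{n} h_i^{-d} K((x - X_i)/h_i)$ with bandwidths $h_i$ shrinking in the arrival index $i$, so the estimate updates recursively as observations arrive. A minimal sketch follows (Gaussian kernel and the classical bandwidth exponent are assumptions, not the paper's tuning):

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 2, 5000
X = rng.normal(size=(n, d))                 # iid sample, true density N(0, I)

def gauss_kernel(u):
    return np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi) ** (d / 2)

grid = np.stack(np.meshgrid(np.linspace(-3, 3, 61),
                            np.linspace(-3, 3, 61)), axis=-1).reshape(-1, d)

# Wolverton-Wagner recursion: bandwidth h_i depends on the arrival index i,
# so f_n = f_{n-1} + (K_n - f_{n-1}) / n needs only the newest observation.
f = np.zeros(len(grid))
for i, xi in enumerate(X, start=1):
    h = i ** (-1 / (d + 4))                 # classical bandwidth decay rate
    f += (gauss_kernel((grid - xi) / h) / h**d - f) / i

true = gauss_kernel(grid)                   # N(0, I) density on the grid
print("max abs error on grid:", np.abs(f - true).max().round(4))
```

The sup-norm deviation printed at the end is the quantity whose exponential tail the paper's confidence regions control.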
{"title":"Confidence regions for the multidimensional density in the uniform norm based on the recursive Wolverton-Wagner estimation","authors":"Maria Rosaria Formica, Eugeny Ostrovsky, Leonid Sirota","doi":"arxiv-2409.01451","DOIUrl":"https://doi.org/arxiv-2409.01451","url":null,"abstract":"We construct an optimal exponential tail decreasing confidence region for an\u0000unknown density of distribution in the Lebesgue-Riesz as well as in the\u0000uniform} norm, built on the sample of the random vectors based of the famous\u0000recursive Wolverton-Wagner density estimation.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A computational transition for detecting correlated stochastic block models by low-degree polynomials
Guanyi Chen, Jian Ding, Shuyang Gong, Zhangsong Li
arXiv:2409.00966
Detection of correlation in a pair of random graphs is a fundamental statistical and computational problem that has been extensively studied in recent years. In this work, we consider a pair of correlated (sparse) stochastic block models $\mathcal{S}(n,\tfrac{\lambda}{n};k,\epsilon;s)$ that are subsampled from a common parent stochastic block model $\mathcal{S}(n,\tfrac{\lambda}{n};k,\epsilon)$ with $k=O(1)$ symmetric communities, average degree $\lambda=O(1)$, divergence parameter $\epsilon$, and subsampling probability $s$. For the detection problem of distinguishing this model from a pair of independent Erdős-Rényi graphs with the same edge density $\mathcal{G}(n,\tfrac{\lambda s}{n})$, we focus on tests based on low-degree polynomials of the entries of the adjacency matrices, and we determine the threshold that separates the easy and hard regimes. More precisely, we show that this class of tests can distinguish these two models if and only if $s > \min\{\sqrt{\alpha}, \tfrac{1}{\lambda\epsilon^2}\}$, where $\alpha \approx 0.338$ is Otter's constant and $\tfrac{1}{\lambda\epsilon^2}$ is the Kesten-Stigum threshold. Our proof of low-degree hardness is based on a conditional variant of the low-degree likelihood calculation.
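The simplest degree-2 statistic in this class is the number of common edges of the two graphs. The toy sketch below generates a correlated pair by subsampling one parent SBM and compares the edge overlap with an independent Erdős–Rényi pair; near the detection threshold the analysis requires higher-degree polynomials (counts of trees and cycles), which this statistic does not capture.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, lam, eps, s = 2000, 2, 8.0, 0.4, 0.8

def parent_sbm():
    z = rng.integers(k, size=n)                      # community labels
    p_in, p_out = lam / n * (1 + eps), lam / n * (1 - eps)
    P = np.where(np.equal.outer(z, z), p_in, p_out)
    return np.triu(rng.random((n, n)) < P, 1)

def subsample(A):
    return A & (rng.random(A.shape) < s)             # keep each edge w.p. s

# Correlated pair: two independent subsamples of one parent graph.
P0 = parent_sbm()
A, B = subsample(P0), subsample(P0)
overlap_corr = int((A & B).sum())

# Null pair: independent Erdős–Rényi graphs with matching edge density.
q = lam * s / n
E = lambda: np.triu(rng.random((n, n)) < q, 1)
overlap_null = int((E() & E()).sum())

print("common edges, correlated:", overlap_corr, " independent:", overlap_null)
```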
{"title":"A computational transition for detecting correlated stochastic block models by low-degree polynomials","authors":"Guanyi Chen, Jian Ding, Shuyang Gong, Zhangsong Li","doi":"arxiv-2409.00966","DOIUrl":"https://doi.org/arxiv-2409.00966","url":null,"abstract":"Detection of correlation in a pair of random graphs is a fundamental\u0000statistical and computational problem that has been extensively studied in\u0000recent years. In this work, we consider a pair of correlated (sparse)\u0000stochastic block models $mathcal{S}(n,tfrac{lambda}{n};k,epsilon;s)$ that\u0000are subsampled from a common parent stochastic block model $mathcal\u0000S(n,tfrac{lambda}{n};k,epsilon)$ with $k=O(1)$ symmetric communities,\u0000average degree $lambda=O(1)$, divergence parameter $epsilon$, and subsampling\u0000probability $s$. For the detection problem of distinguishing this model from a pair of\u0000independent ErdH{o}s-R'enyi graphs with the same edge density\u0000$mathcal{G}(n,tfrac{lambda s}{n})$, we focus on tests based on\u0000emph{low-degree polynomials} of the entries of the adjacency matrices, and we\u0000determine the threshold that separates the easy and hard regimes. More\u0000precisely, we show that this class of tests can distinguish these two models if\u0000and only if $s> min { sqrt{alpha}, frac{1}{lambda epsilon^2} }$, where\u0000$alphaapprox 0.338$ is the Otter's constant and $frac{1}{lambda\u0000epsilon^2}$ is the Kesten-Stigum threshold. Our proof of low-degree hardness\u0000is based on a conditional variant of the low-degree likelihood calculation.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simultaneous Inference for Non-Stationary Random Fields, with Application to Gridded Data Analysis
Yunyi Zhang, Zhou Zhou
arXiv:2409.01220
Current statistics literature on statistical inference of random fields typically assumes that the fields are stationary or focuses on models of non-stationary Gaussian fields with parametric/semiparametric covariance families, which may not be sufficiently flexible to tackle complex modern-era random field data. This paper performs simultaneous nonparametric statistical inference for a general class of non-stationary and non-Gaussian random fields by modeling the fields as nonlinear systems with location-dependent transformations of an underlying 'shift random field'. Asymptotic results, including concentration inequalities and Gaussian approximation theorems for high-dimensional sparse linear forms of the random field, are derived. A computationally efficient locally weighted multiplier bootstrap algorithm is proposed and theoretically verified as a unified tool for the simultaneous inference of the aforementioned non-stationary non-Gaussian random fields. Simulations and real-life data examples demonstrate the good performance and broad applicability of the proposed algorithm.
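A one-dimensional caricature of a multiplier bootstrap for a simultaneous mean band is sketched below; the iid Gaussian multipliers and simple kernel weights are simplifying assumptions, whereas the paper's locally weighted algorithm is built for dependent, non-Gaussian fields on grids.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 400                                         # 1-D grid of observations
t = np.linspace(0, 1, N)
mu = np.sin(2 * np.pi * t)                      # non-stationary mean
x = mu + (0.5 + 0.5 * t) * rng.normal(size=N)   # location-dependent noise

# Kernel-weighted (Nadaraya-Watson) estimate of the mean surface.
h = 0.05
W = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
W /= W.sum(axis=1, keepdims=True)
mu_hat = W @ x
resid = x - mu_hat

# Multiplier bootstrap: perturb residuals with iid Gaussian multipliers,
# re-smooth, and record the maximal deviation over the grid.
B = 1000
sup_stats = np.empty(B)
for b in range(B):
    xi = rng.normal(size=N)
    sup_stats[b] = np.abs(W @ (resid * xi)).max()
crit = np.quantile(sup_stats, 0.95)
print(f"simultaneous 95% band: mu_hat(t) +/- {crit:.3f}")
```

The quantile of the bootstrapped supremum yields a band that covers the whole mean curve at once, which is the "simultaneous" part that pointwise intervals miss.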
{"title":"Simultaneous Inference for Non-Stationary Random Fields, with Application to Gridded Data Analysis","authors":"Yunyi Zhang, Zhou Zhou","doi":"arxiv-2409.01220","DOIUrl":"https://doi.org/arxiv-2409.01220","url":null,"abstract":"Current statistics literature on statistical inference of random fields\u0000typically assumes that the fields are stationary or focuses on models of\u0000non-stationary Gaussian fields with parametric/semiparametric covariance\u0000families, which may not be sufficiently flexible to tackle complex modern-era\u0000random field data. This paper performs simultaneous nonparametric statistical\u0000inference for a general class of non-stationary and non-Gaussian random fields\u0000by modeling the fields as nonlinear systems with location-dependent\u0000transformations of an underlying `shift random field'. Asymptotic results,\u0000including concentration inequalities and Gaussian approximation theorems for\u0000high dimensional sparse linear forms of the random field, are derived. A\u0000computationally efficient locally weighted multiplier bootstrap algorithm is\u0000proposed and theoretically verified as a unified tool for the simultaneous\u0000inference of the aforementioned non-stationary non-Gaussian random field.\u0000Simulations and real-life data examples demonstrate good performances and broad\u0000applications of the proposed algorithm.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stein transport for Bayesian inference
Nikolas Nüsken
arXiv:2409.01464
We introduce Stein transport, a novel methodology for Bayesian inference designed to efficiently push an ensemble of particles along a predefined curve of tempered probability distributions. The driving vector field is chosen from a reproducing kernel Hilbert space and can be derived either through a suitable kernel ridge regression formulation or as an infinitesimal optimal transport map in the Stein geometry. The update equations of Stein transport resemble those of Stein variational gradient descent (SVGD), but introduce a time-varying score function as well as specific weights attached to the particles. While SVGD relies on convergence in the long-time limit, Stein transport reaches its posterior approximation at finite time $t=1$. Studying the mean-field limit, we discuss the errors incurred by regularisation and finite-particle effects, and we connect Stein transport to birth-death dynamics and Fisher-Rao gradient flows. In a series of experiments, we show that in comparison to SVGD, Stein transport not only often reaches more accurate posterior approximations with a significantly reduced computational budget, but that it also effectively mitigates the variance collapse phenomenon commonly observed in SVGD.
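For reference, the SVGD update that Stein transport modifies is short to write down. The sketch below is plain SVGD on a 1-D Gaussian target with the usual median-heuristic bandwidth; Stein transport would additionally swap in a time-varying score along the tempering path and attach weights to the particles, neither of which is implemented here.

```python
import numpy as np

rng = np.random.default_rng(9)
target_score = lambda x: -(x - 3.0)          # score of N(3, 1)

n, eps, n_steps = 100, 0.2, 1000
x = rng.normal(size=n)                       # initial particles

for _ in range(n_steps):
    diff = x[:, None] - x[None, :]           # pairwise differences
    h = np.median(diff**2) / np.log(n + 1)   # median-trick bandwidth
    K = np.exp(-diff**2 / h)                 # RBF kernel matrix
    grad_K = -2 * diff / h * K               # grad of k in its first argument
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) score(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (K @ target_score(x) + grad_K.sum(axis=0)) / n
    x += eps * phi

print("particle mean/var:", x.mean().round(3), x.var().round(3))
```

With a small particle ensemble the reported variance tends to fall below the target's, which is the variance collapse phenomenon the abstract refers to.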
{"title":"Stein transport for Bayesian inference","authors":"Nikolas Nüsken","doi":"arxiv-2409.01464","DOIUrl":"https://doi.org/arxiv-2409.01464","url":null,"abstract":"We introduce $textit{Stein transport}$, a novel methodology for Bayesian\u0000inference designed to efficiently push an ensemble of particles along a\u0000predefined curve of tempered probability distributions. The driving vector\u0000field is chosen from a reproducing kernel Hilbert space and can be derived\u0000either through a suitable kernel ridge regression formulation or as an\u0000infinitesimal optimal transport map in the Stein geometry. The update equations\u0000of Stein transport resemble those of Stein variational gradient descent (SVGD),\u0000but introduce a time-varying score function as well as specific weights\u0000attached to the particles. While SVGD relies on convergence in the long-time\u0000limit, Stein transport reaches its posterior approximation at finite time\u0000$t=1$. Studying the mean-field limit, we discuss the errors incurred by\u0000regularisation and finite-particle effects, and we connect Stein transport to\u0000birth-death dynamics and Fisher-Rao gradient flows. In a series of experiments,\u0000we show that in comparison to SVGD, Stein transport not only often reaches more\u0000accurate posterior approximations with a significantly reduced computational\u0000budget, but that it also effectively mitigates the variance collapse phenomenon\u0000commonly observed in SVGD.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"144 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}