We derive a variational representation for the log-normalizing constant of the posterior distribution in Bayesian linear regression with a uniform spherical prior and an i.i.d. Gaussian design. We work under the "proportional" asymptotic regime, where the number of observations and the number of features grow at a proportional rate. This rigorously establishes the Thouless-Anderson-Palmer (TAP) approximation arising from spin glass theory, and proves a conjecture of Krzakala et al. (2014) in the special case of the spherical prior.
{"title":"The TAP free energy for high-dimensional linear regression","authors":"Jia Qiu, Subhabrata Sen","doi":"10.1214/22-aap1874","DOIUrl":"https://doi.org/10.1214/22-aap1874","url":null,"abstract":"We derive a variational representation for the log-normalizing constant of the posterior distribution in Bayesian linear regression with a uniform spherical prior and an i.i.d. Gaussian design. We work under the\"proportional\"asymptotic regime, where the number of observations and the number of features grow at a proportional rate. This rigorously establishes the Thouless-Anderson-Palmer (TAP) approximation arising from spin glass theory, and proves a conjecture of Krzakala et. al. (2014) in the special case of the spherical prior.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44919691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-universal fluctuations of the empirical measure for isotropic stationary fields on S2×R","authors":"D. Marinucci, Maurizia Rossi, Anna Vidotto","doi":"10.1214/20-aap1648","DOIUrl":"https://doi.org/10.1214/20-aap1648","url":null,"abstract":"","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45159704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We obtain several quantitative bounds on the mixing properties of an "ideal" Hamiltonian Monte Carlo (HMC) Markov chain for a strongly log-concave target distribution $\pi$ on $\mathbb{R}^d$. Our main result says that the HMC Markov chain generates a sample with Wasserstein error $\epsilon$ in roughly $O(\kappa \log(1/\epsilon))$ steps, where the condition number $\kappa = M_2/m_2$ is the ratio of the maximum $M_2$ and minimum $m_2$ eigenvalues of the Hessian of $-\log(\pi)$. In particular, this mixing bound does not depend explicitly on the dimension $d$. These results significantly extend and improve previous quantitative bounds on the mixing of ideal HMC, and can be used to analyze more realistic HMC algorithms. The main ingredient of our argument is a proof that initially "parallel" Hamiltonian trajectories contract over much longer steps than would be predicted by previous heuristics based on the Jacobi manifold.
{"title":"Mixing of Hamiltonian Monte Carlo on strongly log-concave distributions: Continuous dynamics","authors":"Oren Mangoubi, Aaron Smith","doi":"10.1214/20-aap1640","DOIUrl":"https://doi.org/10.1214/20-aap1640","url":null,"abstract":"We obtain several quantitative bounds on the mixing properties of an “ideal” Hamiltonian Monte Carlo (HMC) Markov chain for a strongly log-concave target distribution π on R. Our main result says that the HMC Markov chain generates a sample with Wasserstein error in roughly O(κ log(1/ )) steps, where the condition number κ = M2 m2 is the ratio of the maximum M2 and minimum m2 eigenvalues of the Hessian of − log(π). In particular, this mixing bound does not depend explicitly on the dimension d. These results significantly extend and improve previous quantitative bounds on the mixing of ideal HMC, and can be used to analyze more realistic HMC algorithms. The main ingredient of our argument is a proof that initially “parallel” Hamiltonian trajectories contract over much longer steps than would be predicted by previous heuristics based on the Jacobi manifold.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48793816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This work develops new results for stochastic approximation algorithms. The emphasis is on treating algorithms and limits with discontinuities. The main ingredients include differential inclusions, set-valued analysis, non-smooth analysis, and stochastic differential inclusions. Under broad conditions, it is shown that a suitably scaled sequence of the iterates has a differential inclusion limit. In addition, it is shown for the first time that a centered and scaled sequence of the iterates converges weakly to a stochastic differential inclusion limit. The results are then used to treat several application examples, including Markov decision processes, Lasso algorithms, Pegasos algorithms, support vector machine classification, and learning. Some numerical demonstrations are also provided.
{"title":"Stochastic approximation with discontinuous dynamics, differential inclusions, and applications","authors":"N. Nguyen, G. Yin","doi":"10.1214/22-aap1829","DOIUrl":"https://doi.org/10.1214/22-aap1829","url":null,"abstract":"This work develops new results for stochastic approximation algorithms. The emphases are on treating algorithms and limits with discontinuities. The main ingredients include the use of differential inclusions, set-valued analysis, and non-smooth analysis, and stochastic differential inclusions. Under broad conditions, it is shown that a suitably scaled sequence of the iterates has a differential inclusion limit. In addition, it is shown for the first time that a centered and scaled sequence of the iterates converges weakly to a stochastic differential inclusion limit. The results are then used to treat several application examples including Markov decision process, Lasso algorithms, Pegasos algorithms, support vector machine classification, and learning. Some numerical demonstrations are also provided.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47173572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We establish cutoff for a natural random walk (RW) on the set of perfect matchings (PMs). An $n$-PM is a pairing of $2n$ objects. The $k$-PM RW selects $k$ pairs uniformly at random, disassociates the corresponding $2k$ objects, then chooses a new pairing on these $2k$ objects uniformly at random. The equilibrium distribution is uniform over the set of all $n$-PMs. We establish cutoff for the $k$-PM RW whenever $2 \le k \ll n$. If $k \gg 1$, then the mixing time is $\tfrac{n}{k} \log n$ to leading order. The case $k = 2$ was established by Diaconis and Holmes (2002) by relating the $2$-PM RW to the random transpositions card shuffle and also by Ceccherini-Silberstein, Scarabotti and Tolli (2007, 2008) using representation theory. We are the first to handle $k>2$. Our argument builds on previous work of Berestycki, Schramm, Şengül and Zeitouni (2005, 2011, 2019) regarding conjugacy-invariant RWs on the permutation group.
{"title":"Cutoff for rewiring dynamics on perfect matchings","authors":"Sam Olesker-Taylor","doi":"10.1214/22-aap1825","DOIUrl":"https://doi.org/10.1214/22-aap1825","url":null,"abstract":"We establish cutoff for a natural random walk (RW) on the set of perfect matchings (PMs). An $n$-PM is a pairing of $2n$ objects. The $k$-PM RW selects $k$ pairs uniformly at random, disassociates the corresponding $2k$ objects, then chooses a new pairing on these $2k$ objects uniformly at random. The equilibrium distribution is uniform over the set of all $n$-PM. We establish cutoff for the $k$-PM RW whenever $2 le k ll n$. If $k gg 1$, then the mixing time is $tfrac nk log n$ to leading order. The case $k = 2$ was established by Diaconis and Holmes (2002) by relating the $2$-PM RW to the random transpositions card shuffle and also by Ceccherini-Silberstein, Scarabotti and Tolli (2007, 2008) using representation theory. We are the first to handle $k>2$. Our argument builds on previous work of Berestycki, Schramm, c{S}eng\"ul and Zeitouni (2005, 2011, 2019) regarding conjugacy-invariant RWs on the permutation group.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41395173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
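The update rule of the $k$-PM RW described in the abstract above can be illustrated with a short Python sketch. This is only one possible reading of the verbal description, not code from the paper: the function name `k_pm_step`, the representation of a matching as a list of tuples, and the shuffle-then-pair construction of the new uniform pairing are all assumptions made here for illustration.

```python
import random


def k_pm_step(matching, k, rng=random):
    """One step of the k-PM random walk (illustrative sketch).

    `matching` is a list of n disjoint pairs (tuples) covering 2n objects.
    The name, signature, and data layout are hypothetical.
    """
    n = len(matching)
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    # Select k of the n pairs uniformly at random.
    chosen = set(rng.sample(range(n), k))
    kept = [pair for i, pair in enumerate(matching) if i not in chosen]
    # Disassociate the 2k objects belonging to the chosen pairs.
    freed = [x for i in chosen for x in matching[i]]
    # Shuffling uniformly and pairing adjacent entries yields a uniformly
    # random perfect matching of the freed objects, because every matching
    # arises from the same number (k! * 2^k) of orderings.
    rng.shuffle(freed)
    new_pairs = [(freed[2 * j], freed[2 * j + 1]) for j in range(k)]
    return kept + new_pairs
```

Calling `k_pm_step(matching, k)` repeatedly simulates the chain. The new pairing may reproduce some of the old pairs, which is consistent with choosing the new pairing uniformly over all pairings of the $2k$ freed objects.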
We introduce a two-type first passage percolation competition model on infinite connected graphs as follows. Type 1 spreads through the edges of the graph at rate 1 from a single distinguished site, while all other sites are initially vacant. Once a site is occupied by type 1, it converts to type 2 at rate $\rho>0$. Sites occupied by type 2 then spread at rate $\lambda>0$ through vacant sites \emph{and} sites occupied by type 1, whereas type 1 can only spread through vacant sites. If the set of sites occupied by type 1 is non-empty at all times, we say type 1 \emph{survives}. In the case of a regular $d$-ary tree for $d \geq 3$, we show type 1 can survive when it is slower than type 2, provided $\rho$ is small enough. This is in contrast to when the underlying graph is $\mathbb{Z}^d$, where for any $\rho>0$, type 1 dies out almost surely if $\lambda>1$.
{"title":"Coexistence in competing first passage percolation with conversion","authors":"T. Finn, Alexandre O. Stauffer","doi":"10.1214/22-aap1792","DOIUrl":"https://doi.org/10.1214/22-aap1792","url":null,"abstract":"We introduce a two-type first passage percolation competition model on infinite connected graphs as follows. Type 1 spreads through the edges of the graph at rate 1 from a single distinguished site, while all other sites are initially vacant. Once a site is occupied by type 1, it converts to type 2 at rate $rho>0$. Sites occupied by type 2 then spread at rate $lambda>0$ through vacant sites emph{and} sites occupied by type 1, whereas type 1 can only spread through vacant sites. If the set of sites occupied by type 1 is non-empty at all times, we say type 1 emph{survives}. In the case of a regular $d$-ary tree for $dgeq 3$, we show type 1 can survive when it is slower than type 2, provided $rho$ is small enough. This is in contrast to when the underlying graph is $mathbb{Z}^d$, where for any $rho>0$, type 1 dies out almost surely if $lambda>1$.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49356799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the mean-field limit of systems of particles with singular interactions of the type $-\log|x|$ or $|x|^{-s}$, with $0<s<d-2$, and with an additive noise in dimensions $d \geq 3$. We use a modulated-energy approach to prove a quantitative convergence rate to the solution of the corresponding limiting PDE. When $s>0$, the convergence is global in time, and it is the first such result valid for both conservative and gradient flows in a singular setting on $\mathbb{R}^d$. The proof relies on an adaptation of an argument of Carlen-Loss to show a decay rate of the solution to the limiting equation, and on an improvement of the modulated-energy method developed in arXiv:1508.03377, arXiv:1803.08345, arXiv:2107.02592 making it so that all prefactors in the time derivative of the modulated energy are controlled by a decaying bound on the limiting solution.
{"title":"Global-in-time mean-field convergence for singular Riesz-type diffusive flows","authors":"M. Rosenzweig, S. Serfaty","doi":"10.1214/22-aap1833","DOIUrl":"https://doi.org/10.1214/22-aap1833","url":null,"abstract":"We consider the mean-field limit of systems of particles with singular interactions of the type $-log|x|$ or $|x|^{-s}$, with $0<s<d-2$, and with an additive noise in dimensions $d geq 3$. We use a modulated-energy approach to prove a quantitative convergence rate to the solution of the corresponding limiting PDE. When $s>0$, the convergence is global in time, and it is the first such result valid for both conservative and gradient flows in a singular setting on $mathbb{R}^d$. The proof relies on an adaptation of an argument of Carlen-Loss to show a decay rate of the solution to the limiting equation, and on an improvement of the modulated-energy method developed in arXiv:1508.03377, arXiv:1803.08345, arXiv:2107.02592 making it so that all prefactors in the time derivative of the modulated energy are controlled by a decaying bound on the limiting solution.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44266923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the cyclic cellular automaton (CCA) and the Greenberg-Hastings model (GHM) with $\kappa \ge 3$ colors and contact threshold $\theta \ge 2$ on the infinite $(d+1)$-regular tree, $T_d$. When the initial state has the uniform product distribution, we show that these dynamical systems exhibit at least two distinct phases. For sufficiently large $d$, we show that if $\kappa(\theta-1) \le d - O(\sqrt{d\kappa \ln(d)})$, then every vertex almost surely changes its color infinitely often, while if $\kappa\theta \ge d + O(\kappa\sqrt{d\ln(d)})$, then every vertex almost surely changes its color only finitely many times. Roughly, this implies that as $d \to \infty$, there is a phase transition where $\kappa\theta/d = 1$. For the GHM dynamics, in the scenario where every vertex changes color finitely many times, we moreover give an exponential tail bound for the distribution of the time of the last color change at a given vertex.
{"title":"Cyclic cellular automata and Greenberg–Hastings models on regular trees","authors":"J. Bello, David J Sivakoff","doi":"10.1214/22-aap1885","DOIUrl":"https://doi.org/10.1214/22-aap1885","url":null,"abstract":"We study the cyclic cellular automaton (CCA) and the Greenberg-Hastings model (GHM) with $kappage 3$ colors and contact threshold $thetage 2$ on the infinite $(d+1)$-regular tree, $T_d$. When the initial state has the uniform product distribution, we show that these dynamical systems exhibit at least two distinct phases. For sufficiently large $d$, we show that if $kappa(theta-1) le d - O(sqrt{dkappa ln(d)})$, then every vertex almost surely changes its color infinitely often, while if $kappatheta ge d + O(kappasqrt{dln(d)})$, then every vertex almost surely changes its color only finitely many times. Roughly, this implies that as $dto infty$, there is a phase transition where $kappatheta/d = 1$. For the GHM dynamics, in the scenario where every vertex changes color finitely many times, we moreover give an exponential tail bound for the distribution of the time of the last color change at a given vertex.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46716962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we consider a class of {\it conditional McKean-Vlasov SDEs} (CMVSDE for short). Such an SDE can be considered as an extended version of McKean-Vlasov SDEs with common noises, as well as the general version of the so-called {\it conditional mean-field SDEs} (CMFSDE) studied previously by the authors [1, 14], but with some fundamental differences. In particular, due to the lack of compactness of the iterated conditional laws, the existing arguments of Schauder's fixed point theorem do not seem to apply in this situation, and the heavy nonlinearity on the conditional laws caused by change of probability measure adds more technical subtleties. Under some structure assumptions on the coefficients of the observation equation, we prove the well-posedness of the solution in the weak sense along a more direct approach. Our result is the first that deals with McKean-Vlasov type SDEs involving state-dependent conditional laws.
{"title":"A general conditional McKean–Vlasov stochastic differential equation","authors":"R. Buckdahn, Juan Li, Jin Ma","doi":"10.1214/22-aap1858","DOIUrl":"https://doi.org/10.1214/22-aap1858","url":null,"abstract":"In this paper we consider a class of {it conditional McKean-Vlasov SDEs} (CMVSDE for short). Such an SDE can be considered as an extended version of McKean-Vlasov SDEs with common noises, as well as the general version of the so-called {it conditional mean-field SDEs} (CMFSDE) studied previously by the authors [1, 14], but with some fundamental differences. In particular, due to the lack of compactness of the iterated conditional laws, the existing arguments of Schauder's fixed point theorem do not seem to apply in this situation, and the heavy nonlinearity on the conditional laws caused by change of probability measure adds more technical subtleties. Under some structure assumptions on the coefficients of the observation equation, we prove the well-posedness of solution in the weak sense along a more direct approach. Our result is the first that deals with McKean-Vlasov type SDEs involving state-dependent conditional laws.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44679489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We establish a quantitative version of the Tracy--Widom law for the largest eigenvalue of high-dimensional sample covariance matrices. To be precise, we show that the fluctuations of the largest eigenvalue of a sample covariance matrix $X^*X$ converge to its Tracy--Widom limit at a rate nearly $N^{-1/3}$, where $X$ is an $M \times N$ random matrix whose entries are independent real or complex random variables, assuming that both $M$ and $N$ tend to infinity at a constant rate. This result improves the previous estimate $N^{-2/9}$ obtained by Wang [73]. Our proof relies on a Green function comparison method [27] using iterative cumulant expansions, the local laws for the Green function and asymptotic properties of the correlation kernel of the white Wishart ensemble.
{"title":"Convergence rate to the Tracy–Widom laws for the largest eigenvalue of sample covariance matrices","authors":"Kevin Schnelli, Yuanyuan Xu","doi":"10.1214/22-aap1826","DOIUrl":"https://doi.org/10.1214/22-aap1826","url":null,"abstract":"We establish a quantitative version of the Tracy--Widom law for the largest eigenvalue of high dimensional sample covariance matrices. To be precise, we show that the fluctuations of the largest eigenvalue of a sample covariance matrix $X^*X$ converge to its Tracy--Widom limit at a rate nearly $N^{-1/3}$, where $X$ is an $M times N$ random matrix whose entries are independent real or complex random variables, assuming that both $M$ and $N$ tend to infinity at a constant rate. This result improves the previous estimate $N^{-2/9}$ obtained by Wang [73]. Our proof relies on a Green function comparison method [27] using iterative cumulant expansions, the local laws for the Green function and asymptotic properties of the correlation kernel of the white Wishart ensemble.","PeriodicalId":50979,"journal":{"name":"Annals of Applied Probability","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44875604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}