Faster high-accuracy log-concave sampling via algorithmic warm starts
Authors: Jason M. Altschuler, Sinho Chewi
Journal: Journal of the ACM
Publication date: 2024-03-20
DOI: https://doi.org/10.1145/3653446
Abstract
It is a fundamental problem to understand the complexity of high-accuracy sampling from a strongly log-concave density π on \(\mathbb{R}^d\). Indeed, in practice, high-accuracy samplers such as the Metropolis-adjusted Langevin algorithm (MALA) remain the de facto gold standard; and in theory, via the proximal sampler reduction, it is understood that such samplers are key for sampling even beyond log-concavity (in particular, for sampling under isoperimetric assumptions). This paper improves the dimension dependence of this sampling problem to \(\widetilde{O}(d^{1/2})\). The previous best result for MALA was \(\widetilde{O}(d)\). This closes the long line of work on the complexity of MALA, and moreover leads to state-of-the-art guarantees for high-accuracy sampling under strong log-concavity and beyond (thanks to the aforementioned reduction). Our starting point is that the complexity of MALA improves to \(\widetilde{O}(d^{1/2})\), but only under a warm start (an initialization with constant Rényi divergence w.r.t. π). Previous algorithms for finding a warm start took \(O(d)\) time and thus dominated the computational effort of sampling. Our main technical contribution resolves this gap by establishing the first \(\widetilde{O}(d^{1/2})\) Rényi mixing rates for the discretized underdamped Langevin diffusion. For this, we develop new differential-privacy-inspired techniques based on Rényi divergences with Orlicz–Wasserstein shifts, which allow us to sidestep longstanding challenges for proving fast convergence of hypocoercive differential equations.
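For readers unfamiliar with MALA, the following is a minimal sketch of the textbook algorithm: a Langevin proposal followed by a Metropolis-Hastings accept/reject step. It is not the paper's warm-start procedure or its analysis. The standard Gaussian target, the step size `0.5 / sqrt(d)`, and all function and variable names are illustrative assumptions that do not come from the abstract. The chain is started from an exact draw from the target, which is trivially a warm start.

```python
import math
import random


def mala(log_pi, grad_log_pi, x0, step, n_steps, seed=0):
    """Textbook MALA sketch; names and defaults are hypothetical."""
    rng = random.Random(seed)
    x = list(x0)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # Langevin proposal: y = x + step * grad log pi(x) + sqrt(2 * step) * noise
        gx = grad_log_pi(x)
        mean_x = [xi + step * gi for xi, gi in zip(x, gx)]
        y = [m + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0) for m in mean_x]
        # Mean of the reverse proposal, i.e. of the move from y back to x
        gy = grad_log_pi(y)
        mean_y = [yi + step * gi for yi, gi in zip(y, gy)]
        # Gaussian proposal log-densities, up to a shared additive constant
        log_q_fwd = -sum((yi - m) ** 2 for yi, m in zip(y, mean_x)) / (4.0 * step)
        log_q_rev = -sum((xi - m) ** 2 for xi, m in zip(x, mean_y)) / (4.0 * step)
        log_alpha = log_pi(y) - log_pi(x) + log_q_rev - log_q_fwd
        # Accept with probability min(1, exp(log_alpha)); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < log_alpha:
            x = y
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_steps


# Assumed target for illustration: standard Gaussian on R^d (strongly log-concave).
def log_pi(x):
    return -0.5 * sum(xi * xi for xi in x)


def grad_log_pi(x):
    return [-xi for xi in x]


d = 16
init_rng = random.Random(1)
x0 = [init_rng.gauss(0.0, 1.0) for _ in range(d)]  # exact draw from the target
chain, acc_rate = mala(log_pi, grad_log_pi, x0, step=0.5 / math.sqrt(d), n_steps=2000)
print(f"acceptance rate: {acc_rate:.2f}")
```

The accept/reject step is what makes the chain's stationary distribution exactly π, which is why MALA counts as a high-accuracy sampler; the step size here is chosen only so that the toy run accepts most proposals.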
About the journal:
The best indicator of the scope of the journal is provided by the areas covered by its Editorial Board. These areas change from time to time, as the field evolves. The following areas are currently covered by a member of the Editorial Board: Algorithms and Combinatorial Optimization; Algorithms and Data Structures; Algorithms, Combinatorial Optimization, and Games; Artificial Intelligence; Complexity Theory; Computational Biology; Computational Geometry; Computer Graphics and Computer Vision; Computer-Aided Verification; Cryptography and Security; Cyber-Physical, Embedded, and Real-Time Systems; Database Systems and Theory; Distributed Computing; Economics and Computation; Information Theory; Logic and Computation; Logic, Algorithms, and Complexity; Machine Learning and Computational Learning Theory; Networking; Parallel Computing and Architecture; Programming Languages; Quantum Computing; Randomized Algorithms and Probabilistic Analysis of Algorithms; Scientific Computing and High Performance Computing; Software Engineering; Web Algorithms and Data Mining