Reinforcement Learning for Adaptive MCMC

Congye Wang, Wilson Chen, Heishiro Kanagawa, Chris J. Oates
{"title":"Reinforcement Learning for Adaptive MCMC","authors":"Congye Wang, Wilson Chen, Heishiro Kanagawa, Chris. J. Oates","doi":"arxiv-2405.13574","DOIUrl":null,"url":null,"abstract":"An informal observation, made by several authors, is that the adaptive design\nof a Markov transition kernel has the flavour of a reinforcement learning task.\nYet, to-date it has remained unclear how to actually exploit modern\nreinforcement learning technologies for adaptive MCMC. The aim of this paper is\nto set out a general framework, called Reinforcement Learning\nMetropolis--Hastings, that is theoretically supported and empirically\nvalidated. Our principal focus is on learning fast-mixing Metropolis--Hastings\ntransition kernels, which we cast as deterministic policies and optimise via a\npolicy gradient. Control of the learning rate provably ensures conditions for\nergodicity are satisfied. The methodology is used to construct a gradient-free\nsampler that out-performs a popular gradient-free adaptive Metropolis--Hastings\nalgorithm on $\\approx 90 \\%$ of tasks in the PosteriorDB benchmark.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.13574","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

An informal observation, made by several authors, is that the adaptive design of a Markov transition kernel has the flavour of a reinforcement learning task. Yet, to date it has remained unclear how to actually exploit modern reinforcement learning technologies for adaptive MCMC. The aim of this paper is to set out a general framework, called Reinforcement Learning Metropolis–Hastings, that is theoretically supported and empirically validated. Our principal focus is on learning fast-mixing Metropolis–Hastings transition kernels, which we cast as deterministic policies and optimise via a policy gradient. Control of the learning rate provably ensures that conditions for ergodicity are satisfied. The methodology is used to construct a gradient-free sampler that outperforms a popular gradient-free adaptive Metropolis–Hastings algorithm on $\approx 90\%$ of tasks in the PosteriorDB benchmark.
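The central idea, treating the tunable parameters of a Metropolis–Hastings proposal as a policy and adapting them by stochastic gradient ascent on a reward, can be illustrated with a toy sketch. The snippet below is not the authors' Reinforcement Learning Metropolis–Hastings algorithm: it adapts only the log proposal scale of a random-walk Metropolis sampler, uses a REINFORCE-style score-function gradient of the accepted squared jump distance as a stand-in reward, and decays the learning rate so that adaptation diminishes over time (the flavour of condition under which ergodicity of adaptive samplers is usually argued). The function name `rl_adaptive_rwm` and all tuning constants are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rl_adaptive_rwm(log_target, x0, n_iters=5000, seed=0):
    """Toy adaptive random-walk Metropolis sampler.

    The log proposal scale is treated as a one-parameter 'policy'
    and nudged by a score-function (REINFORCE-style) gradient of
    the accepted squared jump distance, with a decaying learning
    rate so that adaptation diminishes over time.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_sigma = 0.0                        # policy parameter: log proposal scale
    samples = np.empty((n_iters, x.size))
    for t in range(n_iters):
        sigma = np.exp(log_sigma)
        eps = rng.standard_normal(x.size)
        y = x + sigma * eps                # random-walk proposal
        log_alpha = log_target(y) - log_target(x)
        accept = np.log(rng.uniform()) < log_alpha
        reward = float(accept) * np.sum((y - x) ** 2)  # accepted squared jump
        if accept:
            x = y
        # Score of the Gaussian proposal w.r.t. log_sigma at the sampled y:
        # d/d(log_sigma) log N(y | x, sigma^2 I) = ||eps||^2 - d.
        score = np.sum(eps ** 2) - x.size
        lr = 0.1 / (1.0 + t) ** 0.6        # decaying (diminishing) learning rate
        log_sigma += lr * reward * score   # high-variance estimator; a baseline
                                           # would reduce variance in practice
        samples[t] = x
    return samples
```

For instance, `samples = rl_adaptive_rwm(lambda z: -0.5 * np.sum(z ** 2), x0=np.zeros(2))` targets a standard bivariate Gaussian. Note that only log-density evaluations of the target are required, consistent with the gradient-free setting described in the abstract.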