{"title":"最大边际似然估计的多尺度视角","authors":"O. Deniz Akyildiz, Iain Souttar, Michela Ottobre","doi":"arxiv-2406.04187","DOIUrl":null,"url":null,"abstract":"In this paper, we provide a multiscale perspective on the problem of maximum\nmarginal likelihood estimation. We consider and analyse a diffusion-based\nmaximum marginal likelihood estimation scheme using ideas from multiscale\ndynamics. Our perspective is based on stochastic averaging; we make an explicit\nconnection between ideas in applied probability and parameter inference in\ncomputational statistics. In particular, we consider a general class of coupled\nLangevin diffusions for joint inference of latent variables and parameters in\nstatistical models, where the latent variables are sampled from a fast Langevin\nprocess (which acts as a sampler), and the parameters are updated using a slow\nLangevin process (which acts as an optimiser). We show that the resulting\nsystem of stochastic differential equations (SDEs) can be viewed as a two-time\nscale system. To demonstrate the utility of such a perspective, we show that\nthe averaged parameter dynamics obtained in the limit of scale separation can\nbe used to estimate the optimal parameter, within the strongly convex setting.\nWe do this by using recent uniform-in-time non-asymptotic averaging bounds.\nFinally, we conclude by showing that the slow-fast algorithm we consider here,\ntermed Slow-Fast Langevin Algorithm, performs on par with state-of-the-art\nmethods on a variety of examples. We believe that the stochastic averaging\napproach we provide in this paper enables us to look at these algorithms from a\nfresh angle, as well as unlocking the path to develop and analyse new methods\nusing well-established averaging principles.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Multiscale Perspective on Maximum Marginal Likelihood Estimation\",\"authors\":\"O. Deniz Akyildiz, Iain Souttar, Michela Ottobre\",\"doi\":\"arxiv-2406.04187\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we provide a multiscale perspective on the problem of maximum\\nmarginal likelihood estimation. We consider and analyse a diffusion-based\\nmaximum marginal likelihood estimation scheme using ideas from multiscale\\ndynamics. Our perspective is based on stochastic averaging; we make an explicit\\nconnection between ideas in applied probability and parameter inference in\\ncomputational statistics. In particular, we consider a general class of coupled\\nLangevin diffusions for joint inference of latent variables and parameters in\\nstatistical models, where the latent variables are sampled from a fast Langevin\\nprocess (which acts as a sampler), and the parameters are updated using a slow\\nLangevin process (which acts as an optimiser). We show that the resulting\\nsystem of stochastic differential equations (SDEs) can be viewed as a two-time\\nscale system. 
To demonstrate the utility of such a perspective, we show that\\nthe averaged parameter dynamics obtained in the limit of scale separation can\\nbe used to estimate the optimal parameter, within the strongly convex setting.\\nWe do this by using recent uniform-in-time non-asymptotic averaging bounds.\\nFinally, we conclude by showing that the slow-fast algorithm we consider here,\\ntermed Slow-Fast Langevin Algorithm, performs on par with state-of-the-art\\nmethods on a variety of examples. We believe that the stochastic averaging\\napproach we provide in this paper enables us to look at these algorithms from a\\nfresh angle, as well as unlocking the path to develop and analyse new methods\\nusing well-established averaging principles.\",\"PeriodicalId\":501215,\"journal\":{\"name\":\"arXiv - STAT - Computation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Computation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2406.04187\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2406.04187","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Multiscale Perspective on Maximum Marginal Likelihood Estimation
In this paper, we provide a multiscale perspective on the problem of maximum
marginal likelihood estimation. We consider and analyse a diffusion-based
maximum marginal likelihood estimation scheme using ideas from multiscale
dynamics. Our perspective is based on stochastic averaging; we make an explicit
connection between ideas in applied probability and parameter inference in
computational statistics. In particular, we consider a general class of coupled
Langevin diffusions for joint inference of latent variables and parameters in
statistical models, where the latent variables are sampled from a fast Langevin
process (which acts as a sampler), and the parameters are updated using a slow
Langevin process (which acts as an optimiser). We show that the resulting
system of stochastic differential equations (SDEs) can be viewed as a
two-time-scale system.
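To make the slow-fast structure concrete, one way such a coupled pair can be
written is sketched below; the scale-separation parameter epsilon, the inverse
temperature beta, and the exact drift and noise scalings are our own
illustrative choices, not necessarily the paper's.

```latex
% Illustrative slow-fast Langevin pair: X_t are the latent variables
% (fast), theta_t the parameters (slow), y the observed data, and
% epsilon > 0 the scale-separation parameter.
\begin{align*}
  \mathrm{d}X_t      &= \tfrac{1}{\varepsilon}\,
                        \nabla_x \log p_{\theta_t}(X_t, y)\,\mathrm{d}t
                        + \sqrt{\tfrac{2}{\varepsilon}}\,\mathrm{d}B^x_t,\\
  \mathrm{d}\theta_t &= \nabla_\theta \log p_{\theta_t}(X_t, y)\,\mathrm{d}t
                        + \sqrt{2\beta^{-1}}\,\mathrm{d}B^\theta_t.
\end{align*}
```

As epsilon tends to zero the fast component equilibrates, for frozen theta, to
the posterior p_theta(x | y); by Fisher's identity the averaged slow drift then
becomes the gradient of log p_theta(y), i.e. gradient ascent on the marginal
likelihood, which is what makes the averaged parameter dynamics a sensible
optimiser.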
To demonstrate the utility of such a perspective, we show that the averaged
parameter dynamics obtained in the limit of scale separation can be used to
estimate the optimal parameter within the strongly convex setting. We do this
by using recent uniform-in-time non-asymptotic averaging bounds.
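As a concrete illustration, the sketch below discretises a coupled pair of the
slow-fast form above with Euler-Maruyama steps. It is a minimal reading of such
a scheme under our assumptions, not the paper's reference implementation: the
function name sfla, the step sizes, and the inverse temperatures are all
hypothetical choices.

```python
import numpy as np

def sfla(grad_x, grad_theta, x0, theta0, *, eps=0.05, h=1e-3,
         beta_x=1.0, beta_theta=100.0, n_iters=10_000, rng=None):
    """Euler-Maruyama sketch of a slow-fast Langevin pair (hypothetical API).

    grad_x(x, theta):     gradient of log p_theta(x, y) in the latent x.
    grad_theta(x, theta): gradient of log p_theta(x, y) in the parameter theta.
    eps:                  scale-separation parameter; the fast chain takes
                          steps of size h / eps.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    theta = np.asarray(theta0, dtype=float).copy()
    h_fast = h / eps
    for _ in range(n_iters):
        # Fast chain: Langevin step on the latent variables (the "sampler").
        x = x + h_fast * grad_x(x, theta) \
              + np.sqrt(2.0 * h_fast / beta_x) * rng.standard_normal(x.shape)
        # Slow chain: Langevin step on the parameters (the "optimiser");
        # a large beta_theta keeps the parameter noise small.
        theta = theta + h * grad_theta(x, theta) \
                      + np.sqrt(2.0 * h / beta_theta) * rng.standard_normal(theta.shape)
    return x, theta
```

Two design points worth noting: the fast chain takes effectively larger steps
(h / eps), which is one simple way of realising the time-scale separation in
discrete time, and the parameter noise is kept deliberately small so that the
slow chain behaves like a perturbed gradient ascent.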
Finally, we show that the slow-fast algorithm considered here, termed the
Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods
on a variety of examples. We believe that the stochastic averaging approach
we provide in this paper enables us to look at these algorithms from a fresh
angle and unlocks a path to developing and analysing new methods using
well-established averaging principles.
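As a sanity check of the sketch above, one can run it on a toy
conjugate-Gaussian model where the marginal likelihood is available in closed
form and the problem is strongly convex; the model below is our own
illustrative choice, not an example taken from the paper.

```python
# Toy check (our own illustrative model, not from the paper):
# prior x ~ N(theta, 1), likelihood y ~ N(x, 1), one observation y.
# Marginally y ~ N(theta, 2), so the maximum marginal likelihood
# estimate is theta* = y; the problem is strongly convex in theta.
y = 1.5
grad_x = lambda x, theta: (theta - x) + (y - x)    # d/dx log p_theta(x, y)
grad_theta = lambda x, theta: x - theta            # d/dtheta log p_theta(x, y)

_, theta_hat = sfla(grad_x, grad_theta, x0=0.0, theta0=0.0, rng=0)
print(float(theta_hat))  # should land close to theta* = 1.5
```

For this model the averaged slow drift is (y - theta) / 2, so the parameter
chain contracts towards theta* = y like a noisy Ornstein-Uhlenbeck process,
which is exactly the strongly convex regime in which uniform-in-time averaging
bounds of the kind invoked above are designed to apply.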