On the curved exponential family in the Stochastic Approximation Expectation Maximization Algorithm

IF 0.6 · Zone 4 (Mathematics) · Q4 STATISTICS & PROBABILITY · Esaim-Probability and Statistics · Pub Date: 2021-01-01 · DOI: 10.1051/ps/2021015

S. Allassonnière, Vianney Debavelaere


Citations: 7

Abstract

The Expectation-Maximization (EM) algorithm is a widely used method for computing maximum likelihood estimates in models involving latent variables. When the Expectation step cannot be computed easily, one can use stochastic versions of EM such as the Stochastic Approximation EM (SAEM). This algorithm, however, has the drawback of requiring the joint likelihood to belong to the curved exponential family. To overcome this problem, [16] introduced a rewriting of the model which “exponentializes” it by treating the parameter as an additional latent variable following a Normal distribution centered on the newly defined parameters and with fixed variance. The likelihood of this new exponentialized model now belongs to the curved exponential family. Although this technique is often used, there is no guarantee that the estimated mean is close to the maximum likelihood estimate of the initial model. In this paper, we quantify the error made in this estimation when considering the exponentialized model instead of the initial one. By verifying those results on an example, we see that a trade-off must be made between the speed of convergence and the tolerated error. Finally, we propose a new algorithm that reduces the bias, allowing a better estimation of the parameter in a reasonable computation time.
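
The construction described in the abstract can be sketched in formulas. This is a sketch in assumed notation, not taken from the paper itself: $y$ denotes the observations, $z$ the latent variables, $\theta \in \mathbb{R}^p$ the original parameter, $\bar\theta$ the new parameter (the mean of the Normal distribution), $\sigma^2$ the fixed variance, $\gamma_k$ a decreasing step-size sequence, and $\psi$, $\phi$, $S$ the usual ingredients of a curved exponential family.

```latex
% Curved exponential family: the form SAEM requires of the joint likelihood
f(y, z; \theta) = \exp\bigl( -\psi(\theta) + \langle S(y, z), \phi(\theta) \rangle \bigr)

% One SAEM iteration k (simulation, stochastic approximation, maximization)
z_k \sim p(\,\cdot \mid y; \theta_{k-1})
S_k = S_{k-1} + \gamma_k \bigl( S(y, z_k) - S_{k-1} \bigr)
\theta_k = \arg\max_{\theta} \bigl\{ -\psi(\theta) + \langle S_k, \phi(\theta) \rangle \bigr\}

% Exponentialized model: \theta becomes a latent variable, \bar\theta the parameter
\theta \sim \mathcal{N}(\bar\theta, \sigma^2 I_p), \qquad (y, z) \mid \theta \sim f(y, z; \theta)

f_\sigma(y, z, \theta; \bar\theta)
  = f(y, z; \theta) \, (2\pi\sigma^2)^{-p/2}
    \exp\Bigl( -\tfrac{1}{2\sigma^2} \lVert \theta - \bar\theta \rVert^2 \Bigr)

% As a function of \bar\theta this is of exponential form with sufficient statistic \theta
\log f_\sigma(y, z, \theta; \bar\theta)
  = c(y, z, \theta) + \tfrac{1}{\sigma^2} \langle \theta, \bar\theta \rangle
    - \tfrac{1}{2\sigma^2} \lVert \bar\theta \rVert^2
```

Under these assumptions the maximization step for $\bar\theta$ is explicit: it returns the current stochastic approximation of the conditional mean of $\theta$ given $y$. As $\sigma \to 0$ the Normal distribution concentrates at $\bar\theta$ and the exponentialized model collapses back onto the initial one. This makes $\sigma$ the natural quantity governing the approximation error that the abstract trades off against convergence speed.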


Source journal

Esaim-Probability and Statistics (STATISTICS & PROBABILITY)

CiteScore: 1.00

Self-citation rate: 0.00%

Articles published:

Review time: >12 weeks

Journal description: The journal publishes original research and survey papers in the area of Probability and Statistics. It covers theoretical and practical aspects, in any field of these domains. Of particular interest are methodological developments with application in other scientific areas, for example Biology and Genetics, Information Theory, Finance, Bioinformatics, Random structures and Random graphs, Econometrics, Physics. Long papers are very welcome. Indeed, we intend to develop the journal in the direction of applications and to open it to various fields where random mathematical modelling is important. In particular we will call (survey) papers in these areas, in order to make the random community aware of important problems of both theoretical and practical interest. We all know that many recent fascinating developments in Probability and Statistics are coming from "the outside" and we think that ESAIM: P&S should be a good entry point for such exchanges. Of course this does not mean that the journal will be only devoted to practical aspects.