Importance sampling for model-based reinforcement learning

2012 20th Signal Processing and Communications Applications Conference (SIU). Published: 2012-04-18. DOI: 10.1109/SIU.2012.6204703

Orhan Sonmez, A. Cemgil

Citations: 1

Abstract

Most state-of-the-art reinforcement learning algorithms are based on Bellman equations and use fixed-point iteration methods to converge to suboptimal solutions. However, some recent approaches transform the reinforcement learning problem into an equivalent likelihood maximization problem by using appropriate graphical models, which allows the adoption of probabilistic inference methods. Here, we propose an expectation-maximization method that employs importance sampling in its E-step to estimate the likelihood and then determine the optimal policy.
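
As a generic illustration of the importance sampling idea mentioned in the abstract (not the authors' algorithm, whose details are not given here), the following minimal sketch estimates an expected reward under a target action distribution using samples drawn from a different proposal distribution. The function name `importance_sampling_estimate`, the three-action toy problem, and all numeric values are assumptions made up for this example.

```python
import random


def importance_sampling_estimate(rewards, target_probs, proposal_probs,
                                 n_samples=20000, seed=0):
    """Estimate E_target[reward] from actions sampled under the proposal.

    Hypothetical helper for illustration only: each sampled action is
    reweighted by target_probs[a] / proposal_probs[a].
    """
    rng = random.Random(seed)
    # Draw actions from the proposal distribution, not the target.
    actions = rng.choices(range(len(rewards)), weights=proposal_probs,
                          k=n_samples)
    total = 0.0
    for a in actions:
        # The importance weight corrects for sampling from the wrong distribution.
        weight = target_probs[a] / proposal_probs[a]
        total += weight * rewards[a]
    return total / n_samples


# Toy one-step problem with three actions (assumed values).
rewards = [1.0, 0.0, 5.0]
target = [0.1, 0.2, 0.7]       # distribution whose expected reward we want
proposal = [1 / 3, 1 / 3, 1 / 3]  # distribution we can actually sample from

exact = sum(p * r for p, r in zip(target, rewards))
estimate = importance_sampling_estimate(rewards, target, proposal)
print(f"exact = {exact:.3f}, importance sampling estimate = {estimate:.3f}")
```

The estimate should approach the exact value as the number of samples grows. In an EM setting, an estimator of this general kind could stand in for an expectation that is hard to compute exactly, though how the paper constructs its proposal and weights is not stated in the abstract.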
