Information-Theoretic Generalization Bounds for Batch Reinforcement Learning.

Entropy (IF 2.1; Zone 3, Physics and Astrophysics; Q2, Physics, Multidisciplinary) Pub Date: 2024-11-18 DOI: 10.3390/e26110995

Xingtu Liu

Citations: 0

Abstract

We analyze the generalization properties of batch reinforcement learning (batch RL) with value function approximation from an information-theoretic perspective. We derive generalization bounds for batch RL using (conditional) mutual information. In addition, we demonstrate how to establish a connection between certain structural assumptions on the value function space and conditional mutual information. As a by-product, we derive a high-probability generalization bound via conditional mutual information, which was left open and may be of independent interest.

Source journal: Entropy (Physics, Multidisciplinary)

CiteScore: 4.90

Self-citation rate: 11.10%

Articles published: 1580

Review time: 21.05 days

Journal description: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers, and short notes. Its aim is to encourage scientists to publish their theoretical and experimental results in as much detail as possible. There is no restriction on the length of papers. If the work involves computation or experiments, the details must be provided so that the results can be reproduced.