Quantitative propagation of chaos for mean field Markov decision process with common noise

IF 1.3, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Electronic Journal of Probability. Pub Date: 2022-07-26. DOI: 10.1214/23-ejp978
Médéric Motte, H. Pham
Citations: 3

Abstract

We investigate propagation of chaos for mean field Markov decision processes with common noise (CMKV-MDP), when the optimization is performed over randomized open-loop controls on an infinite horizon. We first state a rate of convergence of order $M_N^\gamma$ for the value functions of the $N$-agent control problem with asymmetric open-loop controls towards the value function of the CMKV-MDP, where $M_N$ is the mean rate of convergence in Wasserstein distance of the empirical measure and $\gamma \in (0,1]$ is an explicit constant. Furthermore, we show how to explicitly construct $(\epsilon+\mathcal{O}(M_N^\gamma))$-optimal policies for the $N$-agent model from $\epsilon$-optimal policies for the CMKV-MDP. Our approach relies on a sharp comparison between the Bellman operators of the $N$-agent problem and of the CMKV-MDP, and on a fine coupling of empirical measures.