Unified continuous-time q-learning for mean-field game and mean-field control problems

arXiv - QuantFin - Computational Finance, Pub Date: 2024-07-05, DOI: arxiv-2407.04521

Xiaoli Wei, Xiang Yu, Fengyi Yuan



Abstract

This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To address the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, the decoupled Iq-function can be employed in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the linear-quadratic (LQ) framework, we obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.


