Learning equilibrium mean-variance strategy

IF 1.6 | Zone 3 (Economics) | JCR Q3, BUSINESS, FINANCE | Mathematical Finance | Pub Date: 2023-06-04 | DOI: 10.1111/mafi.12402
Min Dai, Yuchao Dong, Yanwei Jia
Mathematical Finance, vol. 33, no. 4, pp. 1166-1212. Journal article, not open access. Full text: https://onlinelibrary.wiley.com/doi/10.1111/mafi.12402
Citations: 7

Abstract

We study a dynamic mean-variance portfolio optimization problem under the reinforcement learning framework, where an entropy regularizer is introduced to induce exploration. Due to the time–inconsistency involved in a mean-variance criterion, we aim to learn an equilibrium policy. Under an incomplete market setting, we obtain a semi-analytical, exploratory, equilibrium mean-variance policy that turns out to follow a Gaussian distribution. We then focus on a Gaussian mean return model and propose a reinforcement learning algorithm to find the equilibrium policy. Thanks to a thoroughly designed policy iteration procedure in our algorithm, we prove the convergence of our algorithm under mild conditions, despite that dynamic programming principle and the usual policy improvement theorem failing to hold for an equilibrium policy. Numerical experiments are given to demonstrate our algorithm. The design and implementation of our reinforcement learning algorithm apply to a general market setup.
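The abstract states that the exploratory equilibrium policy follows a Gaussian distribution and that an entropy regularizer induces exploration. As a rough illustration only, the sketch below samples risky-asset allocations from a Gaussian policy and evaluates the entropy term that rewards a wider policy. The function names, the mean and standard deviation, and the temperature `lam` are hypothetical placeholders; this is not the paper's semi-analytical solution or its learning algorithm.

```python
import math
import random


def gaussian_entropy(std: float) -> float:
    """Differential entropy of a one-dimensional Gaussian with standard deviation std."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std ** 2)


def sample_allocation(mean: float, std: float, rng: random.Random) -> float:
    """Draw one allocation from the Gaussian policy N(mean, std^2)."""
    return rng.gauss(mean, std)


def entropy_bonus(std: float, lam: float) -> float:
    """Entropy regularization term: temperature lam times the policy entropy."""
    return lam * gaussian_entropy(std)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are reproducible
    mean, std, lam = 0.5, 0.2, 0.1  # hypothetical values, chosen for illustration
    draws = [sample_allocation(mean, std, rng) for _ in range(5)]
    print("sampled allocations:", draws)
    print("entropy bonus:", entropy_bonus(std, lam))
```

A larger `std` gives a larger entropy bonus, which is the sense in which the regularizer encourages the policy to keep exploring rather than collapse to a single deterministic allocation.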

Source journal
Mathematical Finance (Mathematics: Interdisciplinary Applications)
CiteScore: 4.10
Self-citation rate: 6.20%
Articles published: 27
Review time: >12 weeks
About the journal: Mathematical Finance seeks to publish original research articles focused on the development and application of novel mathematical and statistical methods for the analysis of financial problems. The journal welcomes contributions on new statistical methods for the analysis of financial problems. Empirical results will be appropriate to the extent that they illustrate a statistical technique, validate a model or provide insight into a financial problem. Papers whose main contribution rests on empirical results derived with standard approaches will not be considered.