Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market

IF 1.9 · CAS Tier 3 (Economics) · JCR Q2 (Economics) · Journal of Economic Dynamics & Control · Pub Date: 2023-11-10 · DOI: 10.1016/j.jedc.2023.104787
Bo Wu, Lingfei Li
{"title":"Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market","authors":"Bo Wu,&nbsp;Lingfei Li","doi":"10.1016/j.jedc.2023.104787","DOIUrl":null,"url":null,"abstract":"<div><p><span>We propose a reinforcement learning (RL) approach to solve the continuous-time mean-variance portfolio selection problem in a regime-switching market, where the market regime is unobservable. To encourage exploration for learning, we formulate an exploratory stochastic control problem with an entropy-regularized mean-variance objective. We obtain semi-analytical representations of the optimal value function and optimal policy, which involve unknown solutions to two linear parabolic </span>partial differential equations<span> (PDEs). We utilize these representations to parametrize the value function and policy for learning with the unknown solutions to the PDEs approximated based on polynomials. We develop an actor-critic RL algorithm to learn the optimal policy through interactions with the market environment. The algorithm carries out filtering to obtain the belief probability of the market regime and performs policy evaluation and policy gradient updates alternately. Empirical results demonstrate the advantages of our RL algorithm in relatively long-term investment problems over the classical control approach and an RL algorithm developed for the continuous-time mean-variance problem without considering regime switches.</span></p></div>","PeriodicalId":48314,"journal":{"name":"Journal of Economic Dynamics & Control","volume":null,"pages":null},"PeriodicalIF":1.9000,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Economic Dynamics & Control","FirstCategoryId":"96","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0165188923001938","RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ECONOMICS","Score":null,"Total":0}
Citations: 0

Abstract

We propose a reinforcement learning (RL) approach to solve the continuous-time mean-variance portfolio selection problem in a regime-switching market, where the market regime is unobservable. To encourage exploration for learning, we formulate an exploratory stochastic control problem with an entropy-regularized mean-variance objective. We obtain semi-analytical representations of the optimal value function and optimal policy, which involve unknown solutions to two linear parabolic partial differential equations (PDEs). We utilize these representations to parametrize the value function and policy for learning, with the unknown PDE solutions approximated by polynomials. We develop an actor-critic RL algorithm to learn the optimal policy through interactions with the market environment. The algorithm carries out filtering to obtain the belief probability of the market regime and performs policy evaluation and policy gradient updates alternately. Empirical results demonstrate the advantages of our RL algorithm in relatively long-term investment problems over the classical control approach and an RL algorithm developed for the continuous-time mean-variance problem without considering regime switches.
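The abstract describes a loop of filtering, acting, policy evaluation, and policy-gradient updates. The sketch below illustrates that loop in a minimal form; it is not the paper's algorithm. The market parameters (MU, SIGMA, Q, TEMP) are hypothetical, the linear-in-belief parametrizations stand in for the paper's polynomial approximations of the PDE solutions, the raw one-step P&L reward stands in for the entropy-regularized mean-variance objective, and the TD(0)/score-function updates are a generic actor-critic rather than the paper's exact update rules.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# --- Hypothetical two-regime market (illustrative parameters, not the paper's) ---
MU = np.array([0.10, -0.02])        # risky-asset drift in each hidden regime
SIGMA = 0.20                        # volatility, common to both regimes
Q = np.array([[-0.5, 0.5],          # generator of the hidden regime chain
              [0.3, -0.3]])
DT = 1.0 / 252                      # daily time step
TEMP = 0.1                          # entropy-regularization temperature

TRANS = expm(Q * DT)                # one-step regime transition kernel

def filter_step(belief, log_return):
    """Discretized Wonham-style filter: propagate the belief through the
    regime transition kernel, then Bayes-update with the Gaussian
    likelihood of the observed log-return."""
    predicted = TRANS.T @ belief
    lik = np.exp(-(log_return - (MU - 0.5 * SIGMA**2) * DT) ** 2
                 / (2.0 * SIGMA**2 * DT))
    posterior = predicted * lik
    return posterior / posterior.sum()

def sample_action(belief, theta):
    """Entropy-regularized optimal policies in this class of problems are
    Gaussian; here the mean is linear in the belief (a crude stand-in for
    the paper's polynomial parametrization) and the exploration scale is
    driven by the temperature."""
    mean = theta @ belief
    std = np.sqrt(TEMP) / SIGMA
    return rng.normal(mean, std), mean, std

def episode(theta, w, n_steps=252, alpha=0.01, beta=0.01):
    """One actor-critic episode: simulate the hidden regime and the asset,
    filter the belief, and alternate a TD(0) critic update with a
    score-function policy-gradient step on the actor."""
    regime = rng.integers(2)
    belief = np.array([0.5, 0.5])
    wealth = 1.0
    for _ in range(n_steps):
        action, mean, std = sample_action(belief, theta)   # fraction in risky asset
        p = TRANS[regime] / TRANS[regime].sum()            # hidden regime evolves
        regime = rng.choice(2, p=p)
        log_ret = ((MU[regime] - 0.5 * SIGMA**2) * DT
                   + SIGMA * np.sqrt(DT) * rng.normal())
        reward = wealth * action * (np.exp(log_ret) - 1.0)  # one-step P&L as a
        wealth += reward                                    # stand-in reward
        new_belief = filter_step(belief, log_ret)
        # Critic: TD(0) on a value function linear in the belief.
        td_err = reward + w @ new_belief - w @ belief
        w += beta * td_err * belief
        # Actor: score-function gradient of the Gaussian policy mean.
        theta += alpha * td_err * (action - mean) / std**2 * belief
        belief = new_belief
    return wealth

theta = np.zeros(2)   # actor parameters (policy mean, linear in belief)
w = np.zeros(2)       # critic parameters (value, linear in belief)
for _ in range(200):
    episode(theta, w)
```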

Source journal: Journal of Economic Dynamics & Control
CiteScore: 3.10 · Self-citation rate: 10.50% · Articles published: 199
About the journal: The journal provides an outlet for publication of research concerning all theoretical and empirical aspects of economic dynamics and control as well as the development and use of computational methods in economics and finance. Contributions regarding computational methods may include, but are not restricted to, artificial intelligence, databases, decision support systems, genetic algorithms, modelling languages, neural networks, numerical algorithms for optimization, control and equilibria, parallel computing and qualitative reasoning.
Latest articles in this journal:
Closed-form approximations of moments and densities of continuous–time Markov models
Capital misallocation and economic development in a dynamic open economy
Commodity prices and production networks in small open economies
How do households respond to income shocks?
Unconventional policies in state-dependent liquidity traps