Objective Driven Portfolio Construction Using Reinforcement Learning

Tina Wang, Jithin Pradeep, Jerry Zikun Chen
Proceedings of the Third ACM International Conference on AI in Finance, published 2022-10-26
DOI: 10.1145/3533271.3561764 (https://doi.org/10.1145/3533271.3561764)
Citations: 1

Abstract

Recent advances in reinforcement learning have enabled robust, data-driven direct optimization of an investor's objectives, without first estimating stock movements as in the traditional two-step approach [8]. Because investment styles are diverse, a single trading strategy cannot serve every investor objective. We propose an objective function formulation that augments the direct optimization approach in AlphaPortfolio (Cong et al. [6]). In addition to the simple baseline Sharpe ratio used in AlphaPortfolio, we add three investor objectives: (i) achieving excess alpha by maximizing the information ratio; (ii) mitigating downside risk by optimizing the maximum drawdown-adjusted return; and (iii) reducing transaction costs by restricting the turnover rate. We also introduce four new features to the framework: momentum, short-term reversal, drawdown, and maximum drawdown. Our objective function formulation allows control of the trade-off between realized return and each of maximum drawdown and turnover, creating flexible trading strategies for various risk appetites. The maximum drawdown efficient frontier, derived using a range of values of the hyper-parameter α, shows a concave relationship similar to the one observed in the theoretical study by Chekhlov et al. [5]. To improve the interpretability of the deep neural network and derive insights into traditional factor investing, we further explore the drivers behind the top- and bottom-performing firms by running a regression analysis using Random Forest, which achieves an R² of approximately 0.8 in reproducing the winner scores of our model. Finally, to uncover the balance between profit and diversification, we investigate the impact of trading size on strategy behavior.
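
The quantities named in the abstract (Sharpe ratio, information ratio, maximum drawdown, turnover) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact objective, so the additive penalty form in `penalized_objective`, the turnover weight `gamma`, the zero risk-free rate, and all function names are assumptions made for illustration. Only the role of α as the hyper-parameter governing the drawdown trade-off comes from the text.

```python
import numpy as np


def sharpe_ratio(returns):
    """Mean over sample standard deviation of per-period returns.

    Assumption: the risk-free rate is taken as 0 and nothing is annualized.
    """
    r = np.asarray(returns, dtype=float)
    return float(r.mean() / r.std(ddof=1))


def information_ratio(returns, benchmark):
    """Mean active return (portfolio minus benchmark) over tracking error."""
    active = np.asarray(returns, dtype=float) - np.asarray(benchmark, dtype=float)
    return float(active.mean() / active.std(ddof=1))


def max_drawdown(returns):
    """Largest peak-to-trough loss of the wealth curve, as a positive fraction."""
    wealth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    wealth = np.concatenate(([1.0], wealth))  # include the starting wealth of 1
    running_peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / running_peak))


def turnover(weights):
    """Average one-way turnover across rebalances.

    `weights` has shape (periods, assets); each rebalance contributes
    0.5 * sum(|w_t - w_{t-1}|).
    """
    w = np.asarray(weights, dtype=float)
    return float(np.mean(0.5 * np.abs(np.diff(w, axis=0)).sum(axis=1)))


def penalized_objective(returns, weights, alpha=0.0, gamma=0.0):
    """Hypothetical combined objective (an assumption, not the paper's formula).

    Mean return minus alpha * maximum drawdown minus gamma * turnover.
    Larger alpha or gamma favors lower drawdown or lower turnover.
    """
    r = np.asarray(returns, dtype=float)
    return float(r.mean() - alpha * max_drawdown(r) - gamma * turnover(weights))
```

For example, `penalized_objective(returns, weights, alpha=0.5)` scores a return series with half of its maximum drawdown subtracted from the mean return. Sweeping `alpha` over a range of values and recording the resulting return and drawdown of each trained strategy is one way to trace a drawdown efficient frontier of the kind the abstract describes.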