A Deep Reinforcement Learning Approach for Portfolio Management in Non-Short-Selling Market

IF 1.1 | CAS Tier 4, Engineering & Technology | JCR Q4, ENGINEERING, ELECTRICAL & ELECTRONIC | IET Signal Processing | Pub Date: 2024-07-18 | DOI: 10.1049/2024/5399392
Ruidan Su, Chun Chi, Shikui Tu, Lei Xu
{"title":"A Deep Reinforcement Learning Approach for Portfolio Management in Non-Short-Selling Market","authors":"Ruidan Su,&nbsp;Chun Chi,&nbsp;Shikui Tu,&nbsp;Lei Xu","doi":"10.1049/2024/5399392","DOIUrl":null,"url":null,"abstract":"<div>\n <p>Reinforcement learning (RL) has been applied to financial portfolio management in recent years. Current studies mostly focus on profit accumulation without much consideration of risk. Some risk-return balanced studies extract features from price and volume data only, which is highly correlated and missing representation of risk features. To tackle these problems, we propose a weight control unit (WCU) to effectively manage the position of portfolio management in different market statuses. A loss penalty term is also designed in the reward function to prevent sharp drawdown during trading. Moreover, stock spatial interrelation representing the correlation between two different stocks is captured by a graph convolution network based on fundamental data. Temporal interrelation is also captured by a temporal convolutional network based on new factors designed with price and volume data. Both spatial and temporal interrelation work for better feature extraction from historical data and also make the model more interpretable. Finally, a deep deterministic policy gradient actor–critic RL is applied to explore optimal policy in portfolio management. We conduct our approach in a challenging non-short-selling market, and the experiment results show that our method outperforms the state-of-the-art methods in both profit and risk criteria. Specifically, with 6.72% improvement on an annualized rate of return, 7.72% decrease in maximum drawdown, and a better annualized Sharpe ratio of 0.112. 
Also, the loss penalty and WCU provide new aspects for future work in risk control.</p>\n </div>","PeriodicalId":56301,"journal":{"name":"IET Signal Processing","volume":null,"pages":null},"PeriodicalIF":1.1000,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/5399392","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/2024/5399392","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Reinforcement learning (RL) has been applied to financial portfolio management in recent years. Current studies mostly focus on profit accumulation without much consideration of risk. Some risk-return balanced studies extract features from price and volume data only; these features are highly correlated and lack any representation of risk. To tackle these problems, we propose a weight control unit (WCU) to effectively manage the portfolio's position in different market states. A loss penalty term is also designed into the reward function to prevent sharp drawdowns during trading. Moreover, the spatial interrelation between stocks, i.e., the correlation between two different stocks, is captured by a graph convolutional network based on fundamental data, while temporal interrelation is captured by a temporal convolutional network based on new factors constructed from price and volume data. Both the spatial and temporal interrelations enable better feature extraction from historical data and also make the model more interpretable. Finally, a deep deterministic policy gradient (DDPG) actor–critic RL algorithm is applied to explore the optimal policy for portfolio management. We evaluate our approach in a challenging non-short-selling market, and the experimental results show that our method outperforms state-of-the-art methods on both profit and risk criteria, with a 6.72% improvement in annualized rate of return, a 7.72% decrease in maximum drawdown, and an improved annualized Sharpe ratio of 0.112. The loss penalty and the WCU also suggest new directions for future work in risk control.
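The two mechanisms the abstract names, a non-short-selling constraint on portfolio weights and a loss-penalty term in the reward, can be sketched minimally as follows. The function and parameter names (`project_long_only`, `reward`, `lam`) and the exact penalty form are illustrative assumptions, not the paper's implementation: weights are made non-negative via a softmax, and negative log-returns are scaled by a penalty factor to discourage drawdowns.

```python
import math

def project_long_only(scores):
    """Softmax over raw actor scores: returns non-negative portfolio
    weights that sum to 1, enforcing the non-short-selling constraint."""
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def reward(prev_value, value, lam=2.0):
    """Log-return reward with an asymmetric loss penalty: negative
    returns are scaled by lam > 1, so losses hurt more than equal
    gains help, steering the policy away from sharp drawdowns."""
    r = math.log(value / prev_value)
    return r if r >= 0 else lam * r
```

The softmax projection is a common way to keep an actor's output inside the long-only simplex without clipping; any differentiable map onto non-negative weights summing to one would serve the same role.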


Source Journal
IET Signal Processing
Category: Engineering & Technology (Engineering: Electrical & Electronic)
CiteScore: 3.80
Self-citation rate: 5.90%
Articles per year: 83
Review time: 9.5 months
Journal introduction: IET Signal Processing publishes research on a diverse range of signal processing and machine learning topics, covering a variety of applications, disciplines, modalities, and techniques in detection, estimation, inference, and classification problems. The research published includes advances in algorithm design for the analysis of single and high-multi-dimensional data, sparsity, linear and non-linear systems, recursive and non-recursive digital filters and multi-rate filter banks, as well as a range of topics that span from sensor array processing and deep convolutional neural network based approaches to the application of chaos theory, and far more. Topics covered by the scope include, but are not limited to:
- advances in single and multi-dimensional filter design and implementation
- linear and nonlinear, fixed and adaptive digital filters and multirate filter banks
- statistical signal processing techniques and analysis
- classical, parametric, and higher-order spectral analysis
- signal transformation and compression techniques, including time-frequency analysis
- system modelling and adaptive identification techniques
- machine learning based approaches to signal processing
- Bayesian methods for signal processing, including Monte-Carlo Markov-chain and particle filtering techniques
- theory and application of blind and semi-blind signal separation techniques
- signal processing techniques for analysis, enhancement, coding, synthesis, and recognition of speech signals
- direction-finding and beamforming techniques for audio and electromagnetic signals
- analysis techniques for biomedical signals
- baseband signal processing techniques for transmission and reception of communication signals
- signal processing techniques for data hiding and audio watermarking
- sparse signal processing and compressive sensing
Special Issue Call for Papers: Intelligent Deep Fuzzy Model for Signal Processing - https://digital-library.theiet.org/files/IET_SPR_CFP_IDFMSP.pdf