Authors: Yifu Jiang, Jose Olmo, Majed Atwi
Journal: Global Finance Journal, Volume 62, Article 101016
Publication date: 2024-07-14
DOI: 10.1016/j.gfj.2024.101016
URL: https://www.sciencedirect.com/science/article/pii/S1044028324000887
Deep reinforcement learning for portfolio selection
This study proposes a model-free deep reinforcement learning (DRL) framework to construct optimal portfolio strategies in dynamic, complex, and high-dimensional financial markets. Investors' risk aversion and transaction cost constraints are embedded in an extended Markowitz mean-variance reward function, and the resulting problem is solved with a twin-delayed deep deterministic policy gradient (TD3) algorithm. The study designs a DRL-TD3-based portfolio that is sensitive to risk and transaction costs and that combines advanced exploration strategies with dynamic policy updates. The proposed method addresses the challenges posed by high-dimensional state and action spaces in complex financial markets. By flexibly controlling transaction and risk costs, the methodology yields two optimal portfolios, built from (i) the constituents of the Dow Jones Industrial Average and (ii) the constituents of the S&P 100 index. Results show that the proposed DRL portfolio performs strongly relative to several competitors from the traditional and DRL literature.
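The abstract does not state the exact form of the extended mean-variance reward, so the following is only a minimal sketch of how a reward that penalizes risk and transaction costs might look. It assumes a one-period reward equal to the portfolio return, minus a risk-aversion multiple of the portfolio variance, minus a proportional cost on turnover. The function name and the parameters `risk_aversion` and `cost_rate` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_variance_reward(
    weights: Sequence[float],
    prev_weights: Sequence[float],
    returns: Sequence[float],
    cov: Sequence[Sequence[float]],
    risk_aversion: float = 1.0,   # hypothetical risk-aversion coefficient
    cost_rate: float = 0.001,     # hypothetical proportional transaction cost
) -> float:
    """One-step reward: return minus a variance penalty and a turnover penalty.

    This is an assumed functional form, not the paper's specification.
    """
    n = len(weights)
    # Realized portfolio return for the period: w . r
    port_return = sum(w * r for w, r in zip(weights, returns))
    # Portfolio variance: w' Sigma w
    variance = sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    # Turnover: L1 distance between the new and previous weights
    turnover = sum(abs(w - p) for w, p in zip(weights, prev_weights))
    return port_return - risk_aversion * variance - cost_rate * turnover


if __name__ == "__main__":
    reward = mean_variance_reward(
        weights=[0.5, 0.5],
        prev_weights=[1.0, 0.0],
        returns=[0.02, 0.04],
        cov=[[0.04, 0.0], [0.0, 0.09]],
        risk_aversion=2.0,
        cost_rate=0.01,
    )
    print(round(reward, 6))
```

In a TD3 setting, a reward of this kind would be returned by the environment after the actor proposes new weights. Raising `risk_aversion` or `cost_rate` in this sketch makes the agent's objective more conservative or less trade-happy, which is the kind of flexible control over risk and transaction costs that the abstract describes.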
About the journal:
Global Finance Journal provides a forum for the exchange of ideas and techniques among academicians and practitioners and thereby advances applied research in global financial management. Global Finance Journal publishes original, creative, scholarly research that integrates theory and practice and addresses a readership in both business and academia. Articles reflecting pragmatic research are sought in areas such as financial management, investment, banking and financial services, accounting, and taxation. Global Finance Journal welcomes contributions from scholars in both the business and academic communities and encourages collaborative research from this broad base worldwide.