Addressing Non-Stationarity in FX Trading with Online Model Selection of Offline RL Experts
Antonio Riva, L. Bisi, P. Liotet, Luca Sabbioni, Edoardo Vittori, Marco Pinciroli, Michele Trapletti, Marcello Restelli
Proceedings of the Third ACM International Conference on AI in Finance, 26 October 2022. DOI: https://doi.org/10.1145/3533271.3561780
Reinforcement learning has proven successful in obtaining profitable trading policies; however, the effectiveness of such strategies is strongly conditioned on market stationarity. This hypothesis is challenged by the regime switches frequently experienced by practitioners; thus, when many models are available, validation may become a difficult task. We propose to overcome this issue by explicitly modeling the trading task as a non-stationary reinforcement learning problem. Nevertheless, state-of-the-art RL algorithms for this setting usually require the task distribution or dynamics to be predictable, an assumption that can hardly hold in the financial setting. In this work, we propose, instead, a method for the dynamic selection of the best RL agent, driven only by profit performance. Our modular two-layer approach chooses the best strategy among a set of RL models through an online-learning algorithm. While any combination of algorithms could be selected in principle, our solution employs two state-of-the-art algorithms: Fitted Q-Iteration (FQI) for the RL layer and Optimistic Adapt ML-Prod (OAMP) for the online-learning layer. The proposed approach is tested on two simulated FX trading tasks, using actual historical data for the AUS/USD and GBP/USD currency pairs.
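To make the two-layer idea concrete, below is a minimal, hypothetical sketch of online selection over pre-trained trading "experts", with expert weights updated from realized per-step profits. It uses a plain Hedge-style multiplicative-weights update as a stand-in for OAMP, and stubbed signal functions in place of offline-trained FQI policies; the names (ExpertSelector, eta) and the synthetic market data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical sketch (not the paper's OAMP/FQI implementation):
# online selection among pre-trained RL "experts", re-weighting them
# from realized profits with a Hedge-style multiplicative-weights update.

class ExpertSelector:
    def __init__(self, n_experts, eta=0.1):
        self.weights = np.ones(n_experts) / n_experts  # uniform prior over experts
        self.eta = eta                                  # fixed learning rate (OAMP adapts this online)

    def choose(self):
        # Follow the currently best-weighted expert.
        return int(np.argmax(self.weights))

    def update(self, profits):
        # profits: per-expert profit on the last step, assumed observable
        # for every expert (full-information online learning).
        losses = -np.asarray(profits, dtype=float)
        self.weights *= np.exp(-self.eta * losses)
        self.weights /= self.weights.sum()

# Toy usage with stubbed experts (each maps market features to a position in [-1, 1]).
experts = [lambda s, k=k: float(np.sign(np.sin(k + s.sum()))) for k in range(3)]
selector = ExpertSelector(n_experts=len(experts))
rng = np.random.default_rng(0)
total = 0.0
for t in range(100):
    state = rng.normal(size=4)       # placeholder market features
    ret = rng.normal(scale=1e-3)     # placeholder one-step price return
    chosen = selector.choose()
    profits = [e(state) * ret for e in experts]
    total += profits[chosen]         # profit of the strategy actually followed
    selector.update(profits)
print(f"cumulative profit: {total:.6f}, final weights: {selector.weights}")
```

In the paper's setting, the experts would be FQI policies trained offline on historical FX data, and the selection layer would use OAMP's adaptive, optimistic weight updates rather than a fixed learning rate.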