Combining supervised and unsupervised learning methods to predict financial market movements

Gabriel Rodrigues Palma, Mariusz Skoczeń, Phil Maguire
arXiv - QuantFin - Statistical Finance. Publication date: 2024-08-19. DOI: arxiv-2409.03762 (https://doi.org/arxiv-2409.03762)

Abstract

The decisions traders make to buy or sell an asset depend on various analyses, and expertise is required to identify patterns that can be exploited for profit. In this paper we identify novel features extracted from emergent and well-established financial markets using linear models and Gaussian Mixture Models (GMM), with the aim of finding profitable opportunities. We used approximately six months of minute-candle data from the Bitcoin, Pepecoin, and Nasdaq markets to derive the proposed novel features and compare them with commonly used ones. The features were extracted from the previous 59 minutes for each market and used to generate predictions for the hour ahead. We explored the performance of various machine learning strategies, such as Random Forests (RF) and K-Nearest Neighbours (KNN), in classifying market movements. A naive random approach to selecting trading decisions, with outcomes assumed to be equally likely, was used as a benchmark. We evaluated the learning algorithms' performance using a temporal cross-validation approach with test sets of 40%, 30% and 20% of total hours. Our results showed that filtering the time series facilitates the algorithms' generalisation. With the GMM filtering approach, the KNN and RF algorithms produced higher average returns than the random algorithm.
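The evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `temporal_split` and `random_benchmark`, the "buy"/"sell" labels, and the figure of 1000 hours are assumptions made purely for illustration, and the paper's exact cross-validation scheme may differ. The sketch holds out the most recent 40%, 30% or 20% of hours as a test set and draws equally likely random decisions as the naive benchmark.

```python
import random


def temporal_split(n_hours, test_fraction):
    """Chronological hold-out: the last `test_fraction` of hours form the test set."""
    n_test = round(n_hours * test_fraction)
    split = n_hours - n_test
    # Training indices always precede test indices, so no future data leaks backwards.
    return list(range(split)), list(range(split, n_hours))


def random_benchmark(n_decisions, seed=0):
    """Naive benchmark: each decision is drawn with equal probability.

    The two labels are an assumption; the abstract does not list the classes.
    """
    rng = random.Random(seed)
    return [rng.choice(["buy", "sell"]) for _ in range(n_decisions)]


n_hours = 1000  # hypothetical number of hourly observations
for fraction in (0.4, 0.3, 0.2):
    train_idx, test_idx = temporal_split(n_hours, fraction)
    decisions = random_benchmark(len(test_idx))
    print(fraction, len(train_idx), len(test_idx), decisions[:3])
```

Splitting by position rather than shuffling matters for time series: a classifier such as KNN or RF would be fitted only on hours that precede every hour it is scored on.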