探索在基于机器学习的股票交易中使用多粒度级别的数据

Jacopo Fior, Luca Cagliero
{"title":"探索在基于机器学习的股票交易中使用多粒度级别的数据","authors":"Jacopo Fior, Luca Cagliero","doi":"10.1109/ICDMW51313.2020.00053","DOIUrl":null,"url":null,"abstract":"In the last decade the Artificial Intelligence and Data Science communities have paid an increasing attention to the problem of forecasting stock market movements. The abundance of stock-related data, including price series, news articles, financial reports, and social content has leveraged the use of Machine Learning techniques to drive quantitative stock trading. In this field, a huge body of work has been devoted to identifying the most predictive features and to select the best performing algorithms. However, since algorithm performance is heavily affected by the granularity of the analyzed time series as well as by the amount of historical data used to train the ML models, identifying the most appropriate time granularity and ML pipeline can be challenging. This paper studies the relationship between the granularity of time series data and ML performance. It compares also the performance of established ML pipelines in order to evaluate the pros and cons of periodically retraining the ML models. Furthermore, it performs a step beyond towards the integration of ML into real trading systems by studying how to conveniently set up the most established trading system characteristics. The results provide preliminary empirical evidences on how to profitably trade U.S. NASDAQ-100 stocks and leave room for further investigations.","PeriodicalId":426846,"journal":{"name":"2020 International Conference on Data Mining Workshops (ICDMW)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Exploring the Use of Data at Multiple Granularity Levels in Machine Learning-Based Stock Trading\",\"authors\":\"Jacopo Fior, Luca Cagliero\",\"doi\":\"10.1109/ICDMW51313.2020.00053\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the last decade the Artificial Intelligence and Data Science communities have paid an increasing attention to the problem of forecasting stock market movements. The abundance of stock-related data, including price series, news articles, financial reports, and social content has leveraged the use of Machine Learning techniques to drive quantitative stock trading. In this field, a huge body of work has been devoted to identifying the most predictive features and to select the best performing algorithms. However, since algorithm performance is heavily affected by the granularity of the analyzed time series as well as by the amount of historical data used to train the ML models, identifying the most appropriate time granularity and ML pipeline can be challenging. This paper studies the relationship between the granularity of time series data and ML performance. It compares also the performance of established ML pipelines in order to evaluate the pros and cons of periodically retraining the ML models. Furthermore, it performs a step beyond towards the integration of ML into real trading systems by studying how to conveniently set up the most established trading system characteristics. The results provide preliminary empirical evidences on how to profitably trade U.S. NASDAQ-100 stocks and leave room for further investigations.\",\"PeriodicalId\":426846,\"journal\":{\"name\":\"2020 International Conference on Data Mining Workshops (ICDMW)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Data Mining Workshops (ICDMW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW51313.2020.00053\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW51313.2020.00053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

在过去的十年中,人工智能和数据科学界越来越关注预测股市走势的问题。大量的股票相关数据,包括价格序列、新闻文章、财务报告和社交内容,利用机器学习技术来推动定量股票交易。在这个领域,已经有大量的工作致力于识别最具预测性的特征并选择表现最好的算法。然而,由于算法性能受到分析时间序列的粒度以及用于训练机器学习模型的历史数据量的严重影响,因此确定最合适的时间粒度和机器学习管道可能具有挑战性。本文研究了时间序列数据粒度与机器学习性能之间的关系。它还比较了已建立的机器学习管道的性能,以评估定期重新训练机器学习模型的利弊。此外,它通过研究如何方便地设置最成熟的交易系统特征,向将ML集成到真实的交易系统迈出了一步。研究结果为如何交易美国纳斯达克100指数成分股获利提供了初步的实证证据,并为进一步研究留下了空间。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Exploring the Use of Data at Multiple Granularity Levels in Machine Learning-Based Stock Trading
In the last decade the Artificial Intelligence and Data Science communities have paid an increasing attention to the problem of forecasting stock market movements. The abundance of stock-related data, including price series, news articles, financial reports, and social content has leveraged the use of Machine Learning techniques to drive quantitative stock trading. In this field, a huge body of work has been devoted to identifying the most predictive features and to select the best performing algorithms. However, since algorithm performance is heavily affected by the granularity of the analyzed time series as well as by the amount of historical data used to train the ML models, identifying the most appropriate time granularity and ML pipeline can be challenging. This paper studies the relationship between the granularity of time series data and ML performance. It compares also the performance of established ML pipelines in order to evaluate the pros and cons of periodically retraining the ML models. Furthermore, it performs a step beyond towards the integration of ML into real trading systems by studying how to conveniently set up the most established trading system characteristics. The results provide preliminary empirical evidences on how to profitably trade U.S. NASDAQ-100 stocks and leave room for further investigations.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Synthetic Data by Principal Component Analysis Deep Contextualized Word Embedding for Text-based Online User Profiling to Detect Social Bots on Twitter Integration of Fuzzy and Deep Learning in Three-Way Decisions Mining Heterogeneous Data for Formulation Design Restructuring of Hoeffding Trees for Trapezoidal Data Streams
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1