Denoised Labels for Financial Time Series Data via Self-Supervised Learning

Proceedings of the Third ACM International Conference on AI in Finance. Pub Date: 2021-12-19. DOI: 10.1145/3533271.3561687

Yanqing Ma, Carmine Ventre, M. Polukarov


Citations: 4

Abstract

The introduction of electronic trading platforms effectively changed the organisation of traditional systemic trading from quote-driven markets into order-driven markets. Its convenience led to an exponentially increasing amount of financial data, which is, however, hard to use for predicting future prices because of the low signal-to-noise ratio and the non-stationarity of financial time series. Simpler classification tasks, where the goal is to predict the direction of future price movements via supervised learning algorithms, need sufficiently reliable labels to generalise well. Labelling financial data is, however, less well defined than in other domains: did the price go up because of noise or a signal? Existing labelling methods offer limited countermeasures against noise and limited benefit to learning algorithms. This work takes inspiration from image classification in trading [6] and the success of self-supervised learning in computer vision (e.g., [16]). We investigate applying these techniques to financial time series to reduce noise exposure and hence generate correct labels. We treat label generation as the pretext task of a self-supervised learning approach and compare the naive (and noisy) labels commonly used in the literature with labels generated by a denoising autoencoder on the same downstream classification task. Our results demonstrate that these denoised labels improve the performance of the downstream learning algorithm, for both small and large datasets, while preserving market trends. These findings suggest that, with our proposed techniques, self-supervised learning constitutes a powerful framework for generating “better” financial labels that are useful for studying the underlying patterns of the market.
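To make the contrast between naive labels and autoencoder-derived labels concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the architecture, inputs, or labelling rule, so everything below is an assumption chosen for illustration. That includes the one-hidden-layer tanh autoencoder, the window width, the Gaussian corruption, the rule "label 1 if the reconstructed last return in a window is positive", and all function names (`naive_labels`, `make_windows`, `train_denoising_autoencoder`, `denoised_labels`). The price series is synthetic.

```python
import numpy as np


def naive_labels(prices, horizon=1):
    """Naive direction labels: 1 if the price rises over `horizon` steps, else 0."""
    prices = np.asarray(prices, dtype=float)
    return (prices[horizon:] > prices[:-horizon]).astype(int)


def make_windows(series, width):
    """Stack overlapping windows of length `width` as rows of a 2-D array."""
    series = np.asarray(series, dtype=float)
    return np.stack([series[i:i + width] for i in range(len(series) - width + 1)])


def train_denoising_autoencoder(windows, hidden=4, noise_std=0.5,
                                epochs=300, lr=0.01, seed=0):
    """Fit a small autoencoder that maps corrupted windows back to the originals.

    Assumed architecture (not from the paper): tanh encoder, linear decoder,
    trained by full-batch gradient descent on squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, d = windows.shape
    w1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        noisy = windows + rng.normal(0.0, noise_std, windows.shape)  # corrupt the input
        h = np.tanh(noisy @ w1 + b1)         # encoder
        err = (h @ w2 + b2) - windows        # reconstruction minus uncorrupted target
        dh = (err @ w2.T) * (1.0 - h ** 2)   # backpropagate through tanh
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (noisy.T @ dh) / n
        b1 -= lr * dh.mean(axis=0)
    return w1, b1, w2, b2


def denoised_labels(prices, width=8, **train_kwargs):
    """Label each window by the sign of its reconstructed last return (assumed rule)."""
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(prices)
    scaled = returns / (returns.std() + 1e-12)  # rescale only, so signs are preserved
    windows = make_windows(scaled, width)
    w1, b1, w2, b2 = train_denoising_autoencoder(windows, **train_kwargs)
    recon = np.tanh(windows @ w1 + b1) @ w2 + b2
    return (recon[:, -1] > 0).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic prices: small upward drift plus Gaussian noise (illustration only).
    prices = 100.0 + np.cumsum(0.05 + rng.normal(0.0, 1.0, 200))
    width = 8
    naive = naive_labels(prices)[width - 1:]  # align with the last return of each window
    clean = denoised_labels(prices, width=width)
    print("share of labels changed by denoising:", np.mean(naive != clean))
```

In this sketch the two label sets can be fed to the same downstream classifier and compared, which mirrors the comparison described in the abstract; the actual gains reported by the authors depend on their own architecture and data.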


