Junshu Jiang, Jordan Richards, Raphaël Huser, David Bolin
In econometrics, the Efficient Market Hypothesis posits that asset prices reflect all available information in the market. Several empirical investigations show that market efficiency drops when the market undergoes extreme events. Many models for multivariate extremes focus on positive dependence, making them unsuitable for studying extremal dependence in financial markets, where data often exhibit both positive and negative extremal dependence. To this end, we construct regular variation models on the entirety of $\mathbb{R}^d$ and develop a bivariate measure for asymmetry in the strength of extremal dependence between adjacent orthants. Our directional tail dependence (DTD) measure allows us to define the Efficient Tail Hypothesis (ETH) -- an analogue of the Efficient Market Hypothesis -- for the extremal behaviour of the market. Asymptotic results for estimators of the DTD are described, and we discuss testing of the ETH via permutation-based methods and present novel tools for visualization. An empirical study of China's futures market leads to a rejection of the ETH, and we identify potentially profitable investment opportunities. To promote research on the microstructure of China's derivatives market, we open-source our high-frequency data, which are being collected continuously from multiple derivative exchanges.
{"title":"The Efficient Tail Hypothesis: An Extreme Value Perspective on Market Efficiency","authors":"Junshu Jiang, Jordan Richards, Raphaël Huser, David Bolin","doi":"arxiv-2408.06661","DOIUrl":"https://doi.org/arxiv-2408.06661","url":null,"abstract":"In econometrics, the Efficient Market Hypothesis posits that asset prices\u0000reflect all available information in the market. Several empirical\u0000investigations show that market efficiency drops when it undergoes extreme\u0000events. Many models for multivariate extremes focus on positive dependence,\u0000making them unsuitable for studying extremal dependence in financial markets\u0000where data often exhibit both positive and negative extremal dependence. To\u0000this end, we construct regular variation models on the entirety of\u0000$mathbb{R}^d$ and develop a bivariate measure for asymmetry in the strength of\u0000extremal dependence between adjacent orthants. Our directional tail dependence\u0000(DTD) measure allows us to define the Efficient Tail Hypothesis (ETH) -- an\u0000analogue of the Efficient Market Hypothesis -- for the extremal behaviour of\u0000the market. Asymptotic results for estimators of DTD are described, and we\u0000discuss testing of the ETH via permutation-based methods and present novel\u0000tools for visualization. Empirical study of China's futures market leads to a\u0000rejection of the ETH and we identify potential profitable investment\u0000opportunities. To promote the research of microstructure in China's derivatives\u0000market, we open-source our high-frequency data, which are being collected\u0000continuously from multiple derivative exchanges.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate stock market predictions following earnings reports are crucial for investors. Traditional methods, particularly classical machine learning models, struggle with these predictions because they cannot effectively process and interpret the extensive textual data contained in earnings reports and often overlook nuances that influence market movements. This paper introduces an approach that instruction fine-tunes Large Language Models (LLMs) with a novel combination of instruction-based techniques and quantized low-rank adaptation (QLoRA) compression. Our methodology integrates 'base factors', such as financial metric growth and earnings transcripts, with 'external factors', including recent market index performance and analyst grades, to create a rich, supervised dataset. This comprehensive dataset enables our models to achieve superior predictive performance in terms of accuracy, weighted F1, and Matthews correlation coefficient (MCC), which is especially evident in comparisons with benchmarks such as GPT-4. We specifically highlight the efficacy of the llama-3-8b-Instruct-4bit model, which shows significant improvements over baseline models. The paper also discusses the potential of expanding the output to include a 'Hold' option and extending the prediction horizon, aiming to accommodate various investment styles and time frames. This study not only demonstrates the power of integrating cutting-edge AI with fine-tuned financial data but also paves the way for future research in enhancing AI-driven financial analysis tools.
{"title":"Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach","authors":"Haowei Ni, Shuchen Meng, Xupeng Chen, Ziqing Zhao, Andi Chen, Panfeng Li, Shiyao Zhang, Qifu Yin, Yuanqing Wang, Yuxi Chan","doi":"arxiv-2408.06634","DOIUrl":"https://doi.org/arxiv-2408.06634","url":null,"abstract":"Accurate stock market predictions following earnings reports are crucial for\u0000investors. Traditional methods, particularly classical machine learning models,\u0000struggle with these predictions because they cannot effectively process and\u0000interpret extensive textual data contained in earnings reports and often\u0000overlook nuances that influence market movements. This paper introduces an\u0000advanced approach by employing Large Language Models (LLMs) instruction\u0000fine-tuned with a novel combination of instruction-based techniques and\u0000quantized low-rank adaptation (QLoRA) compression. Our methodology integrates\u0000'base factors', such as financial metric growth and earnings transcripts, with\u0000'external factors', including recent market indices performances and analyst\u0000grades, to create a rich, supervised dataset. This comprehensive dataset\u0000enables our models to achieve superior predictive performance in terms of\u0000accuracy, weighted F1, and Matthews correlation coefficient (MCC), especially\u0000evident in the comparison with benchmarks such as GPT-4. We specifically\u0000highlight the efficacy of the llama-3-8b-Instruct-4bit model, which showcases\u0000significant improvements over baseline models. The paper also discusses the\u0000potential of expanding the output capabilities to include a 'Hold' option and\u0000extending the prediction horizon, aiming to accommodate various investment\u0000styles and time frames. This study not only demonstrates the power of\u0000integrating cutting-edge AI with fine-tuned financial data but also paves the\u0000way for future research in enhancing AI-driven financial analysis tools.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional quantitative investment research is encountering diminishing returns alongside rising labor and time costs. To overcome these challenges, we introduce the Large Investment Model (LIM), a novel research paradigm designed to enhance both performance and efficiency at scale. LIM employs end-to-end learning and universal modeling to create an upstream foundation model capable of autonomously learning comprehensive signal patterns from diverse financial data spanning multiple exchanges, instruments, and frequencies. These "global patterns" are subsequently transferred to downstream strategy modeling, optimizing performance for specific tasks. We detail the system architecture design of LIM, address the technical challenges inherent in this approach, and outline potential directions for future research. The advantages of LIM are demonstrated through a series of numerical experiments on cross-instrument prediction for commodity futures trading, leveraging insights from stock markets.
{"title":"Large Investment Model","authors":"Jian Guo, Heung-Yeung Shum","doi":"arxiv-2408.10255","DOIUrl":"https://doi.org/arxiv-2408.10255","url":null,"abstract":"Traditional quantitative investment research is encountering diminishing\u0000returns alongside rising labor and time costs. To overcome these challenges, we\u0000introduce the Large Investment Model (LIM), a novel research paradigm designed\u0000to enhance both performance and efficiency at scale. LIM employs end-to-end\u0000learning and universal modeling to create an upstream foundation model capable\u0000of autonomously learning comprehensive signal patterns from diverse financial\u0000data spanning multiple exchanges, instruments, and frequencies. These \"global\u0000patterns\" are subsequently transferred to downstream strategy modeling,\u0000optimizing performance for specific tasks. We detail the system architecture\u0000design of LIM, address the technical challenges inherent in this approach, and\u0000outline potential directions for future research. The advantages of LIM are\u0000demonstrated through a series of numerical experiments on cross-instrument\u0000prediction for commodity futures trading, leveraging insights from stock\u0000markets.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a novel data-driven network framework for forecasting problems related to E-mini S&P 500 and CBOE Volatility Index futures, in which products with different expirations act as distinct nodes. We provide visual demonstrations of the correlation structures of these products in terms of their returns, realized volatility, and trading volume. The resulting networks offer insight into the contemporaneous movements across the different products, illustrating how inherently connected the movements of the futures products belonging to these two classes are. These networks are further utilized by a multi-channel Graph Convolutional Network to enhance the predictive power of a Long Short-Term Memory network, allowing for the propagation of forecasts of highly correlated quantities and combining the temporal with the spatial aspect of the term structure.
{"title":"A GCN-LSTM Approach for ES-mini and VX Futures Forecasting","authors":"Nikolas Michael, Mihai Cucuringu, Sam Howison","doi":"arxiv-2408.05659","DOIUrl":"https://doi.org/arxiv-2408.05659","url":null,"abstract":"We propose a novel data-driven network framework for forecasting problems\u0000related to E-mini S&P 500 and CBOE Volatility Index futures, in which products\u0000with different expirations act as distinct nodes. We provide visual\u0000demonstrations of the correlation structures of these products in terms of\u0000their returns, realized volatility, and trading volume. The resulting networks\u0000offer insights into the contemporaneous movements across the different\u0000products, illustrating how inherently connected the movements of the future\u0000products belonging to these two classes are. These networks are further\u0000utilized by a multi-channel Graph Convolutional Network to enhance the\u0000predictive power of a Long Short-Term Memory network, allowing for the\u0000propagation of forecasts of highly correlated quantities, combining the\u0000temporal with the spatial aspect of the term structure.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yaoyue Tang, Karina Arias-Calluari, Michael S. Harré
This paper compares and contrasts stationarity in the conventional stock market and in cryptocurrency. The datasets used for the analysis are the intraday price indices of the S&P500 from 1996 to 2023 and the intraday Bitcoin indices from 2019 to 2023, both in USD. We adopt the definition of `wide sense stationary', which constrains the time independence of the first and second moments of a time series. The testing method used in this paper follows the Wiener-Khinchin Theorem, i.e., that for a wide sense stationary process, the power spectral density and the autocorrelation are a Fourier transform pair. We demonstrate that localized stationarity can be achieved by truncating the time series into segments; for each segment, the price returns must be detrended and normalized. These results show that the S&P500 price return can achieve stationarity for the full 28-year period with a detrending window of 12 months and a constrained normalization window of 10 minutes. With truncated segments, a larger normalization window can be used to establish stationarity, indicating that the data within a segment are more homogeneous. For the Bitcoin price return, the segment with higher volatility exhibits stationarity with a normalization window of 60 minutes, whereas stationarity cannot be established in the other segments.
{"title":"Comparative analysis of stationarity for Bitcoin and the S&P500","authors":"Yaoyue Tang, Karina Arias-Calluari, Michael S. Harré","doi":"arxiv-2408.02973","DOIUrl":"https://doi.org/arxiv-2408.02973","url":null,"abstract":"This paper compares and contrasts stationarity between the conventional stock\u0000market and cryptocurrency. The dataset used for the analysis is the intraday\u0000price indices of the S&P500 from 1996 to 2023 and the intraday Bitcoin indices\u0000from 2019 to 2023, both in USD. We adopt the definition of `wide sense\u0000stationary', which constrains the time independence of the first and second\u0000moments of a time series. The testing method used in this paper follows the\u0000Wiener-Khinchin Theorem, i.e., that for a wide sense stationary process, the\u0000power spectral density and the autocorrelation are a Fourier transform pair. We\u0000demonstrate that localized stationarity can be achieved by truncating the time\u0000series into segments, and for each segment, detrending and normalizing the\u0000price return are required. These results show that the S&P500 price return can\u0000achieve stationarity for the full 28-year period with a detrending window of 12\u0000months and a constrained normalization window of 10 minutes. With truncated\u0000segments, a larger normalization window can be used to establish stationarity,\u0000indicating that within the segment the data is more homogeneous. For Bitcoin\u0000price return, the segment with higher volatility presents stationarity with a\u0000normalization window of 60 minutes, whereas stationarity cannot be established\u0000in other segments.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The use of machine learning for statistical modeling (and thus, generative modeling) has grown in popularity with the proliferation of time series models, text-to-image models, and especially large language models. Fundamentally, the goal of classical factor modeling is statistical modeling of stock returns, and in this work, we explore using deep generative modeling to enhance classical factor models. Prior work has explored the use of deep generative models to model hundreds of stocks, leading to accurate risk forecasting and alpha portfolio construction; however, that specific model does not allow for easy factor-model interpretation, in that the factor exposures cannot be deduced. In this work, we introduce NeuralFactors, a novel machine-learning-based approach to factor analysis in which a neural network outputs factor exposures and factor returns, trained using the same methodology as variational autoencoders. We show that this model outperforms prior approaches in terms of both log-likelihood performance and computational efficiency. Further, we show that this method is competitive with prior work in generating realistic synthetic data, covariance estimation, risk analysis (e.g., value at risk, or VaR, of portfolios), and portfolio optimization. Finally, due to the connection to classical factor analysis, we analyze how the factors our model learns cluster together and show that the factor exposures could be used for embedding stocks.
{"title":"NeuralFactors: A Novel Factor Learning Approach to Generative Modeling of Equities","authors":"Achintya Gopal","doi":"arxiv-2408.01499","DOIUrl":"https://doi.org/arxiv-2408.01499","url":null,"abstract":"The use of machine learning for statistical modeling (and thus, generative\u0000modeling) has grown in popularity with the proliferation of time series models,\u0000text-to-image models, and especially large language models. Fundamentally, the\u0000goal of classical factor modeling is statistical modeling of stock returns, and\u0000in this work, we explore using deep generative modeling to enhance classical\u0000factor models. Prior work has explored the use of deep generative models in\u0000order to model hundreds of stocks, leading to accurate risk forecasting and\u0000alpha portfolio construction; however, that specific model does not allow for\u0000easy factor modeling interpretation in that the factor exposures cannot be\u0000deduced. In this work, we introduce NeuralFactors, a novel machine-learning\u0000based approach to factor analysis where a neural network outputs factor\u0000exposures and factor returns, trained using the same methodology as variational\u0000autoencoders. We show that this model outperforms prior approaches both in\u0000terms of log-likelihood performance and computational efficiency. Further, we\u0000show that this method is competitive to prior work in generating realistic\u0000synthetic data, covariance estimation, risk analysis (e.g., value at risk, or\u0000VaR, of portfolios), and portfolio optimization. Finally, due to the connection\u0000to classical factor analysis, we analyze how the factors our model learns\u0000cluster together and show that the factor exposures could be used for embedding\u0000stocks.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"193 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional approaches to estimating beta in finance often involve rigid assumptions and fail to adequately capture beta dynamics, limiting their effectiveness in use cases such as hedging. To address these limitations, we have developed a novel neural-network-based method called NeuralBeta, which is capable of handling both univariate and multivariate scenarios and of tracking the dynamic behavior of beta. To address the issue of interpretability, we introduce a new output layer inspired by regularized weighted linear regression, which provides transparency into the model's decision-making process. We conducted extensive experiments on both synthetic and market data, demonstrating NeuralBeta's superior performance compared to benchmark methods across various scenarios, especially in instances where beta is highly time-varying, e.g., during regime shifts in the market. This model not only represents an advancement in the field of beta estimation but also shows potential for applications in other financial contexts that assume linear relationships.
{"title":"NeuralBeta: Estimating Beta Using Deep Learning","authors":"Yuxin Liu, Jimin Lin, Achintya Gopal","doi":"arxiv-2408.01387","DOIUrl":"https://doi.org/arxiv-2408.01387","url":null,"abstract":"Traditional approaches to estimating beta in finance often involve rigid\u0000assumptions and fail to adequately capture beta dynamics, limiting their\u0000effectiveness in use cases like hedging. To address these limitations, we have\u0000developed a novel method using neural networks called NeuralBeta, which is\u0000capable of handling both univariate and multivariate scenarios and tracking the\u0000dynamic behavior of beta. To address the issue of interpretability, we\u0000introduce a new output layer inspired by regularized weighted linear\u0000regression, which provides transparency into the model's decision-making\u0000process. We conducted extensive experiments on both synthetic and market data,\u0000demonstrating NeuralBeta's superior performance compared to benchmark methods\u0000across various scenarios, especially instances where beta is highly\u0000time-varying, e.g., during regime shifts in the market. This model not only\u0000represents an advancement in the field of beta estimation, but also shows\u0000potential for applications in other financial contexts that assume linear\u0000relationships.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"189 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Financial stock return correlations have been studied through the prism of random matrix theory to distinguish the signal from the "noise". Eigenvalues of the correlation matrix that lie above the rescaled Marchenko-Pastur distribution can be interpreted as collective modes, while those below are usually considered noise. In this analysis we use complex network analysis to simulate the "noise" and the "market" components of the return correlations, by introducing meaningful correlations into simulated geometric Brownian motions for the stocks. We find that the return correlation matrix is dominated by stocks with high eigenvector centrality and clustering in the network. We then use simulated "market" random walks to build an optimal portfolio and find that the overall return outperforms one based on historical mean-variance data, by up to 50% on short time scales.
{"title":"Inferring financial stock returns correlation from complex network analysis","authors":"Ixandra Achitouv","doi":"arxiv-2407.20380","DOIUrl":"https://doi.org/arxiv-2407.20380","url":null,"abstract":"Financial stock returns correlations have been studied in the prism of random\u0000matrix theory, to distinguish the signal from the \"noise\". Eigenvalues of the\u0000matrix that are above the rescaled Marchenko Pastur distribution can be\u0000interpreted as collective modes behavior while the modes under are usually\u0000considered as noise. In this analysis we use complex network analysis to\u0000simulate the \"noise\" and the \"market\" component of the return correlations, by\u0000introducing some meaningful correlations in simulated geometric Brownian motion\u0000for the stocks. We find that the returns correlation matrix is dominated by\u0000stocks with high eigenvector centrality and clustering found in the network. We\u0000then use simulated \"market\" random walks to build an optimal portfolio and find\u0000that the overall return performs better than using the historical mean-variance\u0000data, up to 50% on short time scale.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"76 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Representation learning has emerged as a powerful paradigm for extracting valuable latent features from complex, high-dimensional data. In financial domains, learning informative representations for assets can be used for tasks like sector classification and risk management. However, the complex and stochastic nature of financial markets poses unique challenges. We propose a novel contrastive learning framework to generate asset embeddings from financial time series data. Our approach leverages the similarity of asset returns over many subwindows to generate informative positive and negative samples, using a statistical sampling strategy based on hypothesis testing to address the noisy nature of financial data. We explore various contrastive loss functions that capture the relationships between assets in different ways in order to learn a discriminative representation space. Experiments on real-world datasets demonstrate the effectiveness of the learned asset embeddings on benchmark industry classification and portfolio optimization tasks. In each case, our novel approaches significantly outperform existing baselines, highlighting the potential of contrastive learning to capture meaningful and actionable relationships in financial data.
{"title":"Contrastive Learning of Asset Embeddings from Financial Time Series","authors":"Rian Dolphin, Barry Smyth, Ruihai Dong","doi":"arxiv-2407.18645","DOIUrl":"https://doi.org/arxiv-2407.18645","url":null,"abstract":"Representation learning has emerged as a powerful paradigm for extracting\u0000valuable latent features from complex, high-dimensional data. In financial\u0000domains, learning informative representations for assets can be used for tasks\u0000like sector classification, and risk management. However, the complex and\u0000stochastic nature of financial markets poses unique challenges. We propose a\u0000novel contrastive learning framework to generate asset embeddings from\u0000financial time series data. Our approach leverages the similarity of asset\u0000returns over many subwindows to generate informative positive and negative\u0000samples, using a statistical sampling strategy based on hypothesis testing to\u0000address the noisy nature of financial data. We explore various contrastive loss\u0000functions that capture the relationships between assets in different ways to\u0000learn a discriminative representation space. Experiments on real-world datasets\u0000demonstrate the effectiveness of the learned asset embeddings on benchmark\u0000industry classification and portfolio optimization tasks. In each case our\u0000novel approaches significantly outperform existing baselines highlighting the\u0000potential for contrastive learning to capture meaningful and actionable\u0000relationships in financial data.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We investigate whether an LLM can successfully perform financial statement analysis in a way similar to a professional human analyst. We provide standardized and anonymized financial statements to GPT-4 and instruct the model to analyze them to determine the direction of future earnings. Even without any narrative or industry-specific information, the LLM outperforms financial analysts in its ability to predict earnings changes. The LLM exhibits a relative advantage over human analysts in situations where analysts tend to struggle. Furthermore, we find that the prediction accuracy of the LLM is on par with that of a narrowly trained state-of-the-art ML model. The LLM's predictions do not stem from its training memory; instead, we find that the LLM generates useful narrative insights about a company's future performance. Lastly, trading strategies based on GPT's predictions yield higher Sharpe ratios and alphas than strategies based on other models. Taken together, our results suggest that LLMs may take a central role in decision-making.
{"title":"Financial Statement Analysis with Large Language Models","authors":"Alex Kim, Maximilian Muhn, Valeri Nikolaev","doi":"arxiv-2407.17866","DOIUrl":"https://doi.org/arxiv-2407.17866","url":null,"abstract":"We investigate whether an LLM can successfully perform financial statement\u0000analysis in a way similar to a professional human analyst. We provide\u0000standardized and anonymous financial statements to GPT4 and instruct the model\u0000to analyze them to determine the direction of future earnings. Even without any\u0000narrative or industry-specific information, the LLM outperforms financial\u0000analysts in its ability to predict earnings changes. The LLM exhibits a\u0000relative advantage over human analysts in situations when the analysts tend to\u0000struggle. Furthermore, we find that the prediction accuracy of the LLM is on\u0000par with the performance of a narrowly trained state-of-the-art ML model. LLM\u0000prediction does not stem from its training memory. Instead, we find that the\u0000LLM generates useful narrative insights about a company's future performance.\u0000Lastly, our trading strategies based on GPT's predictions yield a higher Sharpe\u0000ratio and alphas than strategies based on other models. Taken together, our\u0000results suggest that LLMs may take a central role in decision-making.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141774179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}