Junshu Jiang, Jordan Richards, Raphaël Huser, David Bolin
In econometrics, the Efficient Market Hypothesis posits that asset prices reflect all available information in the market. Several empirical investigations show that market efficiency drops when the market undergoes extreme events. Many models for multivariate extremes focus on positive dependence, making them unsuitable for studying extremal dependence in financial markets, where data often exhibit both positive and negative extremal dependence. To this end, we construct regular variation models on the entirety of $\mathbb{R}^d$ and develop a bivariate measure for asymmetry in the strength of extremal dependence between adjacent orthants. Our directional tail dependence (DTD) measure allows us to define the Efficient Tail Hypothesis (ETH) -- an analogue of the Efficient Market Hypothesis -- for the extremal behaviour of the market. Asymptotic results for estimators of the DTD are described, and we discuss testing of the ETH via permutation-based methods and present novel tools for visualization. An empirical study of China's futures market leads to a rejection of the ETH, and we identify potentially profitable investment opportunities. To promote research on the microstructure of China's derivatives market, we open-source our high-frequency data, which are being collected continuously from multiple derivative exchanges.
{"title":"The Efficient Tail Hypothesis: An Extreme Value Perspective on Market Efficiency","authors":"Junshu Jiang, Jordan Richards, Raphaël Huser, David Bolin","doi":"arxiv-2408.06661","DOIUrl":"https://doi.org/arxiv-2408.06661","url":null,"abstract":"In econometrics, the Efficient Market Hypothesis posits that asset prices\u0000reflect all available information in the market. Several empirical\u0000investigations show that market efficiency drops when it undergoes extreme\u0000events. Many models for multivariate extremes focus on positive dependence,\u0000making them unsuitable for studying extremal dependence in financial markets\u0000where data often exhibit both positive and negative extremal dependence. To\u0000this end, we construct regular variation models on the entirety of\u0000$mathbb{R}^d$ and develop a bivariate measure for asymmetry in the strength of\u0000extremal dependence between adjacent orthants. Our directional tail dependence\u0000(DTD) measure allows us to define the Efficient Tail Hypothesis (ETH) -- an\u0000analogue of the Efficient Market Hypothesis -- for the extremal behaviour of\u0000the market. Asymptotic results for estimators of DTD are described, and we\u0000discuss testing of the ETH via permutation-based methods and present novel\u0000tools for visualization. Empirical study of China's futures market leads to a\u0000rejection of the ETH and we identify potential profitable investment\u0000opportunities. To promote the research of microstructure in China's derivatives\u0000market, we open-source our high-frequency data, which are being collected\u0000continuously from multiple derivative exchanges.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate stock market predictions following earnings reports are crucial for investors. Traditional methods, particularly classical machine learning models, struggle with these predictions because they cannot effectively process and interpret the extensive textual data contained in earnings reports and often overlook nuances that influence market movements. This paper introduces an approach that instruction fine-tunes Large Language Models (LLMs) with a novel combination of instruction-based techniques and quantized low-rank adaptation (QLoRA) compression. Our methodology integrates 'base factors', such as financial metric growth and earnings transcripts, with 'external factors', including recent market index performance and analyst grades, to create a rich, supervised dataset. This comprehensive dataset enables our models to achieve superior predictive performance in terms of accuracy, weighted F1, and Matthews correlation coefficient (MCC), which is especially evident in comparisons with benchmarks such as GPT-4. We specifically highlight the efficacy of the llama-3-8b-Instruct-4bit model, which shows significant improvements over baseline models. The paper also discusses the potential of expanding the output to include a 'Hold' option and extending the prediction horizon, aiming to accommodate various investment styles and time frames. This study not only demonstrates the power of integrating cutting-edge AI with fine-tuned financial data but also paves the way for future research in enhancing AI-driven financial analysis tools.
{"title":"Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach","authors":"Haowei Ni, Shuchen Meng, Xupeng Chen, Ziqing Zhao, Andi Chen, Panfeng Li, Shiyao Zhang, Qifu Yin, Yuanqing Wang, Yuxi Chan","doi":"arxiv-2408.06634","DOIUrl":"https://doi.org/arxiv-2408.06634","url":null,"abstract":"Accurate stock market predictions following earnings reports are crucial for\u0000investors. Traditional methods, particularly classical machine learning models,\u0000struggle with these predictions because they cannot effectively process and\u0000interpret extensive textual data contained in earnings reports and often\u0000overlook nuances that influence market movements. This paper introduces an\u0000advanced approach by employing Large Language Models (LLMs) instruction\u0000fine-tuned with a novel combination of instruction-based techniques and\u0000quantized low-rank adaptation (QLoRA) compression. Our methodology integrates\u0000'base factors', such as financial metric growth and earnings transcripts, with\u0000'external factors', including recent market indices performances and analyst\u0000grades, to create a rich, supervised dataset. This comprehensive dataset\u0000enables our models to achieve superior predictive performance in terms of\u0000accuracy, weighted F1, and Matthews correlation coefficient (MCC), especially\u0000evident in the comparison with benchmarks such as GPT-4. We specifically\u0000highlight the efficacy of the llama-3-8b-Instruct-4bit model, which showcases\u0000significant improvements over baseline models. The paper also discusses the\u0000potential of expanding the output capabilities to include a 'Hold' option and\u0000extending the prediction horizon, aiming to accommodate various investment\u0000styles and time frames. This study not only demonstrates the power of\u0000integrating cutting-edge AI with fine-tuned financial data but also paves the\u0000way for future research in enhancing AI-driven financial analysis tools.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional quantitative investment research is encountering diminishing returns alongside rising labor and time costs. To overcome these challenges, we introduce the Large Investment Model (LIM), a novel research paradigm designed to enhance both performance and efficiency at scale. LIM employs end-to-end learning and universal modeling to create an upstream foundation model capable of autonomously learning comprehensive signal patterns from diverse financial data spanning multiple exchanges, instruments, and frequencies. These "global patterns" are subsequently transferred to downstream strategy modeling, optimizing performance for specific tasks. We detail the system architecture design of LIM, address the technical challenges inherent in this approach, and outline potential directions for future research. The advantages of LIM are demonstrated through a series of numerical experiments on cross-instrument prediction for commodity futures trading, leveraging insights from stock markets.
{"title":"Large Investment Model","authors":"Jian Guo, Heung-Yeung Shum","doi":"arxiv-2408.10255","DOIUrl":"https://doi.org/arxiv-2408.10255","url":null,"abstract":"Traditional quantitative investment research is encountering diminishing\u0000returns alongside rising labor and time costs. To overcome these challenges, we\u0000introduce the Large Investment Model (LIM), a novel research paradigm designed\u0000to enhance both performance and efficiency at scale. LIM employs end-to-end\u0000learning and universal modeling to create an upstream foundation model capable\u0000of autonomously learning comprehensive signal patterns from diverse financial\u0000data spanning multiple exchanges, instruments, and frequencies. These \"global\u0000patterns\" are subsequently transferred to downstream strategy modeling,\u0000optimizing performance for specific tasks. We detail the system architecture\u0000design of LIM, address the technical challenges inherent in this approach, and\u0000outline potential directions for future research. The advantages of LIM are\u0000demonstrated through a series of numerical experiments on cross-instrument\u0000prediction for commodity futures trading, leveraging insights from stock\u0000markets.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a novel data-driven network framework for forecasting problems related to E-mini S&P 500 and CBOE Volatility Index futures, in which products with different expirations act as distinct nodes. We provide visual demonstrations of the correlation structures of these products in terms of their returns, realized volatility, and trading volume. The resulting networks offer insight into the contemporaneous movements across the different products, illustrating how inherently connected the movements of the futures products belonging to these two classes are. These networks are further utilized by a multi-channel Graph Convolutional Network to enhance the predictive power of a Long Short-Term Memory network, allowing for the propagation of forecasts of highly correlated quantities and combining the temporal with the spatial aspect of the term structure.
{"title":"A GCN-LSTM Approach for ES-mini and VX Futures Forecasting","authors":"Nikolas Michael, Mihai Cucuringu, Sam Howison","doi":"arxiv-2408.05659","DOIUrl":"https://doi.org/arxiv-2408.05659","url":null,"abstract":"We propose a novel data-driven network framework for forecasting problems\u0000related to E-mini S&P 500 and CBOE Volatility Index futures, in which products\u0000with different expirations act as distinct nodes. We provide visual\u0000demonstrations of the correlation structures of these products in terms of\u0000their returns, realized volatility, and trading volume. The resulting networks\u0000offer insights into the contemporaneous movements across the different\u0000products, illustrating how inherently connected the movements of the future\u0000products belonging to these two classes are. These networks are further\u0000utilized by a multi-channel Graph Convolutional Network to enhance the\u0000predictive power of a Long Short-Term Memory network, allowing for the\u0000propagation of forecasts of highly correlated quantities, combining the\u0000temporal with the spatial aspect of the term structure.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yaoyue Tang, Karina Arias-Calluari, Michael S. Harré
This paper compares and contrasts stationarity in the conventional stock market and in cryptocurrency. The datasets used for the analysis are the intraday price indices of the S&P500 from 1996 to 2023 and the intraday Bitcoin indices from 2019 to 2023, both in USD. We adopt the definition of `wide sense stationary', which constrains the time independence of the first and second moments of a time series. The testing method used in this paper follows the Wiener-Khinchin Theorem, i.e., that for a wide sense stationary process, the power spectral density and the autocorrelation are a Fourier transform pair. We demonstrate that localized stationarity can be achieved by truncating the time series into segments; for each segment, the price returns must be detrended and normalized. These results show that the S&P500 price return can achieve stationarity for the full 28-year period with a detrending window of 12 months and a constrained normalization window of 10 minutes. With truncated segments, a larger normalization window can be used to establish stationarity, indicating that the data within a segment are more homogeneous. For the Bitcoin price return, the segment with higher volatility exhibits stationarity with a normalization window of 60 minutes, whereas stationarity cannot be established in the other segments.
{"title":"Comparative analysis of stationarity for Bitcoin and the S&P500","authors":"Yaoyue Tang, Karina Arias-Calluari, Michael S. Harré","doi":"arxiv-2408.02973","DOIUrl":"https://doi.org/arxiv-2408.02973","url":null,"abstract":"This paper compares and contrasts stationarity between the conventional stock\u0000market and cryptocurrency. The dataset used for the analysis is the intraday\u0000price indices of the S&P500 from 1996 to 2023 and the intraday Bitcoin indices\u0000from 2019 to 2023, both in USD. We adopt the definition of `wide sense\u0000stationary', which constrains the time independence of the first and second\u0000moments of a time series. The testing method used in this paper follows the\u0000Wiener-Khinchin Theorem, i.e., that for a wide sense stationary process, the\u0000power spectral density and the autocorrelation are a Fourier transform pair. We\u0000demonstrate that localized stationarity can be achieved by truncating the time\u0000series into segments, and for each segment, detrending and normalizing the\u0000price return are required. These results show that the S&P500 price return can\u0000achieve stationarity for the full 28-year period with a detrending window of 12\u0000months and a constrained normalization window of 10 minutes. With truncated\u0000segments, a larger normalization window can be used to establish stationarity,\u0000indicating that within the segment the data is more homogeneous. For Bitcoin\u0000price return, the segment with higher volatility presents stationarity with a\u0000normalization window of 60 minutes, whereas stationarity cannot be established\u0000in other segments.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The use of machine learning for statistical modeling (and thus, generative modeling) has grown in popularity with the proliferation of time series models, text-to-image models, and especially large language models. Fundamentally, the goal of classical factor modeling is statistical modeling of stock returns, and in this work, we explore using deep generative modeling to enhance classical factor models. Prior work has explored the use of deep generative models to model hundreds of stocks, leading to accurate risk forecasting and alpha portfolio construction; however, that specific model does not allow for easy factor-model interpretation, in that the factor exposures cannot be deduced. In this work, we introduce NeuralFactors, a novel machine-learning-based approach to factor analysis in which a neural network outputs factor exposures and factor returns, trained using the same methodology as variational autoencoders. We show that this model outperforms prior approaches in terms of both log-likelihood performance and computational efficiency. Further, we show that this method is competitive with prior work in generating realistic synthetic data, covariance estimation, risk analysis (e.g., value at risk, or VaR, of portfolios), and portfolio optimization. Finally, due to the connection to classical factor analysis, we analyze how the factors our model learns cluster together and show that the factor exposures could be used for embedding stocks.
{"title":"NeuralFactors: A Novel Factor Learning Approach to Generative Modeling of Equities","authors":"Achintya Gopal","doi":"arxiv-2408.01499","DOIUrl":"https://doi.org/arxiv-2408.01499","url":null,"abstract":"The use of machine learning for statistical modeling (and thus, generative\u0000modeling) has grown in popularity with the proliferation of time series models,\u0000text-to-image models, and especially large language models. Fundamentally, the\u0000goal of classical factor modeling is statistical modeling of stock returns, and\u0000in this work, we explore using deep generative modeling to enhance classical\u0000factor models. Prior work has explored the use of deep generative models in\u0000order to model hundreds of stocks, leading to accurate risk forecasting and\u0000alpha portfolio construction; however, that specific model does not allow for\u0000easy factor modeling interpretation in that the factor exposures cannot be\u0000deduced. In this work, we introduce NeuralFactors, a novel machine-learning\u0000based approach to factor analysis where a neural network outputs factor\u0000exposures and factor returns, trained using the same methodology as variational\u0000autoencoders. We show that this model outperforms prior approaches both in\u0000terms of log-likelihood performance and computational efficiency. Further, we\u0000show that this method is competitive to prior work in generating realistic\u0000synthetic data, covariance estimation, risk analysis (e.g., value at risk, or\u0000VaR, of portfolios), and portfolio optimization. Finally, due to the connection\u0000to classical factor analysis, we analyze how the factors our model learns\u0000cluster together and show that the factor exposures could be used for embedding\u0000stocks.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"193 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional approaches to estimating beta in finance often involve rigid assumptions and fail to adequately capture beta dynamics, limiting their effectiveness in use cases such as hedging. To address these limitations, we have developed a novel neural-network-based method called NeuralBeta, which is capable of handling both univariate and multivariate scenarios and of tracking the dynamic behavior of beta. To address the issue of interpretability, we introduce a new output layer inspired by regularized weighted linear regression, which provides transparency into the model's decision-making process. We conducted extensive experiments on both synthetic and market data, demonstrating NeuralBeta's superior performance compared to benchmark methods across various scenarios, especially in instances where beta is highly time-varying, e.g., during regime shifts in the market. This model not only represents an advancement in the field of beta estimation but also shows potential for applications in other financial contexts that assume linear relationships.
{"title":"NeuralBeta: Estimating Beta Using Deep Learning","authors":"Yuxin Liu, Jimin Lin, Achintya Gopal","doi":"arxiv-2408.01387","DOIUrl":"https://doi.org/arxiv-2408.01387","url":null,"abstract":"Traditional approaches to estimating beta in finance often involve rigid\u0000assumptions and fail to adequately capture beta dynamics, limiting their\u0000effectiveness in use cases like hedging. To address these limitations, we have\u0000developed a novel method using neural networks called NeuralBeta, which is\u0000capable of handling both univariate and multivariate scenarios and tracking the\u0000dynamic behavior of beta. To address the issue of interpretability, we\u0000introduce a new output layer inspired by regularized weighted linear\u0000regression, which provides transparency into the model's decision-making\u0000process. We conducted extensive experiments on both synthetic and market data,\u0000demonstrating NeuralBeta's superior performance compared to benchmark methods\u0000across various scenarios, especially instances where beta is highly\u0000time-varying, e.g., during regime shifts in the market. This model not only\u0000represents an advancement in the field of beta estimation, but also shows\u0000potential for applications in other financial contexts that assume linear\u0000relationships.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"189 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141933530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Financial stock return correlations have been studied through the prism of random matrix theory to distinguish the signal from the "noise". Eigenvalues of the correlation matrix that lie above the rescaled Marchenko-Pastur distribution can be interpreted as collective modes, while those below are usually considered noise. In this analysis we use complex network analysis to simulate the "noise" and the "market" components of the return correlations, by introducing meaningful correlations into simulated geometric Brownian motions for the stocks. We find that the return correlation matrix is dominated by stocks with high eigenvector centrality and clustering in the network. We then use simulated "market" random walks to build an optimal portfolio and find that the overall return outperforms one based on historical mean-variance data, by up to 50% on short time scales.
{"title":"Inferring financial stock returns correlation from complex network analysis","authors":"Ixandra Achitouv","doi":"arxiv-2407.20380","DOIUrl":"https://doi.org/arxiv-2407.20380","url":null,"abstract":"Financial stock returns correlations have been studied in the prism of random\u0000matrix theory, to distinguish the signal from the \"noise\". Eigenvalues of the\u0000matrix that are above the rescaled Marchenko Pastur distribution can be\u0000interpreted as collective modes behavior while the modes under are usually\u0000considered as noise. In this analysis we use complex network analysis to\u0000simulate the \"noise\" and the \"market\" component of the return correlations, by\u0000introducing some meaningful correlations in simulated geometric Brownian motion\u0000for the stocks. We find that the returns correlation matrix is dominated by\u0000stocks with high eigenvector centrality and clustering found in the network. We\u0000then use simulated \"market\" random walks to build an optimal portfolio and find\u0000that the overall return performs better than using the historical mean-variance\u0000data, up to 50% on short time scale.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"76 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Representation learning has emerged as a powerful paradigm for extracting valuable latent features from complex, high-dimensional data. In financial domains, learning informative representations for assets can be used for tasks like sector classification and risk management. However, the complex and stochastic nature of financial markets poses unique challenges. We propose a novel contrastive learning framework to generate asset embeddings from financial time series data. Our approach leverages the similarity of asset returns over many subwindows to generate informative positive and negative samples, using a statistical sampling strategy based on hypothesis testing to address the noisy nature of financial data. We explore various contrastive loss functions that capture the relationships between assets in different ways in order to learn a discriminative representation space. Experiments on real-world datasets demonstrate the effectiveness of the learned asset embeddings on benchmark industry classification and portfolio optimization tasks. In each case, our novel approaches significantly outperform existing baselines, highlighting the potential of contrastive learning to capture meaningful and actionable relationships in financial data.
{"title":"Contrastive Learning of Asset Embeddings from Financial Time Series","authors":"Rian Dolphin, Barry Smyth, Ruihai Dong","doi":"arxiv-2407.18645","DOIUrl":"https://doi.org/arxiv-2407.18645","url":null,"abstract":"Representation learning has emerged as a powerful paradigm for extracting\u0000valuable latent features from complex, high-dimensional data. In financial\u0000domains, learning informative representations for assets can be used for tasks\u0000like sector classification, and risk management. However, the complex and\u0000stochastic nature of financial markets poses unique challenges. We propose a\u0000novel contrastive learning framework to generate asset embeddings from\u0000financial time series data. Our approach leverages the similarity of asset\u0000returns over many subwindows to generate informative positive and negative\u0000samples, using a statistical sampling strategy based on hypothesis testing to\u0000address the noisy nature of financial data. We explore various contrastive loss\u0000functions that capture the relationships between assets in different ways to\u0000learn a discriminative representation space. Experiments on real-world datasets\u0000demonstrate the effectiveness of the learned asset embeddings on benchmark\u0000industry classification and portfolio optimization tasks. In each case our\u0000novel approaches significantly outperform existing baselines highlighting the\u0000potential for contrastive learning to capture meaningful and actionable\u0000relationships in financial data.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141866155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We investigate whether an LLM can successfully perform financial statement analysis in a way similar to a professional human analyst. We provide standardized and anonymized financial statements to GPT-4 and instruct the model to analyze them to determine the direction of future earnings. Even without any narrative or industry-specific information, the LLM outperforms financial analysts in its ability to predict earnings changes. The LLM exhibits a relative advantage over human analysts in situations where analysts tend to struggle. Furthermore, we find that the prediction accuracy of the LLM is on par with that of a narrowly trained state-of-the-art ML model. The LLM's predictions do not stem from its training memory; instead, we find that the LLM generates useful narrative insights about a company's future performance. Lastly, trading strategies based on GPT's predictions yield higher Sharpe ratios and alphas than strategies based on other models. Taken together, our results suggest that LLMs may take a central role in decision-making.
{"title":"Financial Statement Analysis with Large Language Models","authors":"Alex Kim, Maximilian Muhn, Valeri Nikolaev","doi":"arxiv-2407.17866","DOIUrl":"https://doi.org/arxiv-2407.17866","url":null,"abstract":"We investigate whether an LLM can successfully perform financial statement\u0000analysis in a way similar to a professional human analyst. We provide\u0000standardized and anonymous financial statements to GPT4 and instruct the model\u0000to analyze them to determine the direction of future earnings. Even without any\u0000narrative or industry-specific information, the LLM outperforms financial\u0000analysts in its ability to predict earnings changes. The LLM exhibits a\u0000relative advantage over human analysts in situations when the analysts tend to\u0000struggle. Furthermore, we find that the prediction accuracy of the LLM is on\u0000par with the performance of a narrowly trained state-of-the-art ML model. LLM\u0000prediction does not stem from its training memory. Instead, we find that the\u0000LLM generates useful narrative insights about a company's future performance.\u0000Lastly, our trading strategies based on GPT's predictions yield a higher Sharpe\u0000ratio and alphas than strategies based on other models. Taken together, our\u0000results suggest that LLMs may take a central role in decision-making.","PeriodicalId":501139,"journal":{"name":"arXiv - QuantFin - Statistical Finance","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141774179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}