Alberto Manzano, Emanuele Nastasi, Andrea Pallavicini, Carlos Vázquez
In this article, we analyze two modeling approaches for the pricing of derivative contracts on a commodity index. The first is a microscopic approach, where the components of the index are modeled individually and the index price is derived from their combination. The second is a macroscopic approach, where the index is modeled directly. While the microscopic approach offers greater flexibility, its calibration proves more challenging, leading practitioners to favor the macroscopic approach. However, in the macroscopic model, the lack of explicit futures curve dynamics raises questions about its ability to accurately capture the behavior of the index and its sensitivities. To investigate this, we calibrate both models using derivatives of the S&P GSCI Crude Oil excess-return index and compare their pricing and sensitivities on path-dependent options, such as autocallable contracts. This research provides insights into the suitability of macroscopic models for pricing and hedging purposes in real scenarios.
{"title":"Evaluating Microscopic and Macroscopic Models for Derivative Contracts on Commodity Indices","authors":"Alberto Manzano, Emanuele Nastasi, Andrea Pallavicini, Carlos Vázquez","doi":"arxiv-2408.00784","DOIUrl":"https://doi.org/arxiv-2408.00784","url":null,"abstract":"In this article, we analyze two modeling approaches for the pricing of\u0000derivative contracts on a commodity index. The first one is a microscopic\u0000approach, where the components of the index are modeled individually, and the\u0000index price is derived from their combination. The second one is a macroscopic\u0000approach, where the index is modeled directly. While the microscopic approach\u0000offers greater flexibility, its calibration results to be more challenging,\u0000thus leading practitioners to favor the macroscopic approach. However, in the\u0000macroscopic model, the lack of explicit futures curve dynamics raises questions\u0000about its ability to accurately capture the behavior of the index and its\u0000sensitivities. In order to investigate this, we calibrate both models using\u0000derivatives of the S&P GSCI Crude Oil excess-return index and compare their\u0000pricing and sensitivities on path-dependent options, such as autocallable\u0000contracts. This research provides insights into the suitability of macroscopic\u0000models for pricing and hedging purposes in real scenarios.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141932409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic knowledge graphs (DKGs) are popular structures to express different types of connections between objects over time. They can also serve as an efficient mathematical tool to represent information extracted from complex unstructured data sources, such as text or images. Within financial applications, DKGs could be used to detect trends for strategic thematic investing, based on information obtained from financial news articles. In this work, we explore the properties of large language models (LLMs) as dynamic knowledge graph generators, proposing a novel open-source fine-tuned LLM for this purpose, called the Integrated Contextual Knowledge Graph Generator (ICKG). We use ICKG to produce a novel open-source DKG from a corpus of financial news articles, called FinDKG, and we propose an attention-based GNN architecture for analysing it, called KGTransformer. We test the performance of the proposed model on benchmark datasets and FinDKG, demonstrating superior performance on link prediction tasks. Additionally, we evaluate the performance of the KGTransformer on FinDKG for thematic investing, showing it can outperform existing thematic ETFs.
{"title":"FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets","authors":"Xiaohui Victor Li, Francesco Sanna Passino","doi":"arxiv-2407.10909","DOIUrl":"https://doi.org/arxiv-2407.10909","url":null,"abstract":"Dynamic knowledge graphs (DKGs) are popular structures to express different\u0000types of connections between objects over time. They can also serve as an\u0000efficient mathematical tool to represent information extracted from complex\u0000unstructured data sources, such as text or images. Within financial\u0000applications, DKGs could be used to detect trends for strategic thematic\u0000investing, based on information obtained from financial news articles. In this\u0000work, we explore the properties of large language models (LLMs) as dynamic\u0000knowledge graph generators, proposing a novel open-source fine-tuned LLM for\u0000this purpose, called the Integrated Contextual Knowledge Graph Generator\u0000(ICKG). We use ICKG to produce a novel open-source DKG from a corpus of\u0000financial news articles, called FinDKG, and we propose an attention-based GNN\u0000architecture for analysing it, called KGTransformer. We test the performance of\u0000the proposed model on benchmark datasets and FinDKG, demonstrating superior\u0000performance on link prediction tasks. Additionally, we evaluate the performance\u0000of the KGTransformer on FinDKG for thematic investing, showing it can\u0000outperform existing thematic ETFs.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"2012 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To predict the future movements of stock markets, numerous studies concentrate on daily data and employ various machine learning (ML) models as benchmarks, which often vary and lack standardization across research works. This paper approaches the problem from a fresh standpoint by aiming to predict weekly movements and by introducing a novel benchmark of random traders. This benchmark is independent of any ML model, making it more objective and potentially allowing it to serve as a commonly recognized standard. During training, apart from basic features such as technical indicators, scaling laws and directional changes are introduced as additional features; furthermore, the training datasets are adjusted by assigning varying weights to different samples, allowing the models to emphasize specific samples. In back-testing, several trained models show good performance, with the multi-layer perceptron (MLP) demonstrating stability and robustness across extensive and comprehensive data covering upward, downward, and cyclic trends. The unique perspective of this work, which focuses on weekly movements, incorporates new features, and creates an objective benchmark, contributes to the existing literature on stock market prediction.
{"title":"Machine learning in weekly movement prediction","authors":"Han Gui","doi":"arxiv-2407.09831","DOIUrl":"https://doi.org/arxiv-2407.09831","url":null,"abstract":"To predict the future movements of stock markets, numerous studies\u0000concentrate on daily data and employ various machine learning (ML) models as\u0000benchmarks that often vary and lack standardization across different research\u0000works. This paper tries to solve the problem from a fresh standpoint by aiming\u0000to predict the weekly movements, and introducing a novel benchmark of random\u0000traders. This benchmark is independent of any ML model, thus making it more\u0000objective and potentially serving as a commonly recognized standard. During\u0000training process, apart from the basic features such as technical indicators,\u0000scaling laws and directional changes are introduced as additional features,\u0000furthermore, the training datasets are also adjusted by assigning varying\u0000weights to different samples, the weighting approach allows the models to\u0000emphasize specific samples. On back-testing, several trained models show good\u0000performance, with the multi-layer perception (MLP) demonstrating stability and\u0000robustness across extensive and comprehensive data that include upward,\u0000downward and cyclic trends. The unique perspective of this work that focuses on\u0000weekly movements, incorporates new features and creates an objective benchmark,\u0000contributes to the existing literature on stock market prediction.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past few decades, machine learning models have been extremely successful. Axiomatic attribution methods have made it possible to explain feature contributions more clearly and rigorously. There are, however, few studies that have examined domain knowledge in conjunction with these axioms. In this study, we examine asset pricing in finance, a field closely related to risk management. Consequently, when applying machine learning models, we must ensure that the attribution methods reflect the underlying risks accurately. In this work, we present and study several axioms derived from asset pricing domain knowledge. It is shown that while the Shapley value and Integrated Gradients preserve most of these axioms, neither satisfies all of them. Using extensive analytical and empirical examples, we demonstrate how attribution methods can reflect risks and when they should not be used.
{"title":"Attribution Methods in Asset Pricing: Do They Account for Risk?","authors":"Dangxing Chen, Yuan Gao","doi":"arxiv-2407.08953","DOIUrl":"https://doi.org/arxiv-2407.08953","url":null,"abstract":"Over the past few decades, machine learning models have been extremely\u0000successful. As a result of axiomatic attribution methods, feature contributions\u0000have been explained more clearly and rigorously. There are, however, few\u0000studies that have examined domain knowledge in conjunction with the axioms. In\u0000this study, we examine asset pricing in finance, a field closely related to\u0000risk management. Consequently, when applying machine learning models, we must\u0000ensure that the attribution methods reflect the underlying risks accurately. In\u0000this work, we present and study several axioms derived from asset pricing\u0000domain knowledge. It is shown that while Shapley value and Integrated Gradients\u0000preserve most axioms, neither can satisfy all axioms. Using extensive\u0000analytical and empirical examples, we demonstrate how attribution methods can\u0000reflect risks and when they should not be used.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"54 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141719421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (the decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we obtain the exact parameterization of the decoupled Iq-functions and the value functions, and illustrate our algorithm from the representative agent's perspective with satisfactory performance.
{"title":"Unified continuous-time q-learning for mean-field game and mean-field control problems","authors":"Xiaoli Wei, Xiang Yu, Fengyi Yuan","doi":"arxiv-2407.04521","DOIUrl":"https://doi.org/arxiv-2407.04521","url":null,"abstract":"This paper studies the continuous-time q-learning in the mean-field\u0000jump-diffusion models from the representative agent's perspective. To overcome\u0000the challenge when the population distribution may not be directly observable,\u0000we introduce the integrated q-function in decoupled form (decoupled\u0000Iq-function) and establish its martingale characterization together with the\u0000value function, which provides a unified policy evaluation rule for both\u0000mean-field game (MFG) and mean-field control (MFC) problems. Moreover,\u0000depending on the task to solve the MFG or MFC problem, we can employ the\u0000decoupled Iq-function by different means to learn the mean-field equilibrium\u0000policy or the mean-field optimal policy respectively. As a result, we devise a\u0000unified q-learning algorithm for both MFG and MFC problems by utilizing all\u0000test policies stemming from the mean-field interactions. For several examples\u0000in the jump-diffusion setting, within and beyond the LQ framework, we can\u0000obtain the exact parameterization of the decoupled Iq-functions and the value\u0000functions, and illustrate our algorithm from the representative agent's\u0000perspective with satisfactory performance.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141573213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning techniques for predicting stock market prices are a popular topic in the field of data science. Customized feature engineering is typically required as a pre-processing step for each stock market dataset. In this paper, we present a graph-neural-network-based convolutional neural network (CNN) model that can be applied to diverse sources of data to extract features for predicting the trends of the S&P 500, NASDAQ, DJI, NYSE, and RUSSELL indices.
{"title":"GraphCNNpred: A stock market indices prediction using a Graph based deep learning system","authors":"Yuhui Jin","doi":"arxiv-2407.03760","DOIUrl":"https://doi.org/arxiv-2407.03760","url":null,"abstract":"Deep learning techniques for predicting stock market prices is an popular\u0000topic in the field of data science. Customized feature engineering arises as\u0000pre-processing tools of different stock market dataset. In this paper, we give\u0000a graph neural network based convolutional neural network (CNN) model, that can\u0000be applied on diverse source of data, in the attempt to extract features to\u0000predict the trends of indices of text{S}&text{P} 500, NASDAQ, DJI, NYSE, and\u0000RUSSEL.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141573476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This research dissects financial equity research reports (ERRs) by mapping their content into categories. There is insufficient empirical analysis of the questions answered in ERRs. In particular, it is not well understood how frequently certain information appears, what information is considered essential, and what information requires human judgment to distill into an ERR. The study analyzes 72 ERRs sentence by sentence, classifying their 4940 sentences into 169 unique question archetypes. We did not predefine the questions but derived them solely from the statements in the ERRs. This approach provides an unbiased view of the content of the observed ERRs. Subsequently, we used public corporate reports to classify each question's potential for automation. A question was labeled "text-extractable" if its answer was accessible in corporate reports. 78.7% of the questions in ERRs can be automated. These automatable questions consist of 48.2% text-extractable questions (suited to processing by large language models, LLMs) and 30.5% database-extractable questions. Only 21.3% of questions require human judgment to answer. We empirically validate, using Llama-3-70B and GPT-4-turbo-2024-04-09, that recent advances in language generation and information extraction enable the automation of approximately 80% of the statements in ERRs. Surprisingly, the models complement each other's strengths and weaknesses well. The research confirms that the current writing process of ERRs can likely benefit from additional automation, improving quality and efficiency. The research thus allows us to quantify the potential impact of introducing large language models in the ERR writing process. The full question list, including the archetypes and their frequency, will be made available online after peer review.
{"title":"The Structure of Financial Equity Research Reports -- Identification of the Most Frequently Asked Questions in Financial Analyst Reports to Automate Equity Research Using Llama 3 and GPT-4","authors":"Adria Pop, Jan Spörer, Siegfried Handschuh","doi":"arxiv-2407.18327","DOIUrl":"https://doi.org/arxiv-2407.18327","url":null,"abstract":"This research dissects financial equity research reports (ERRs) by mapping\u0000their content into categories. There is insufficient empirical analysis of the questions answered in ERRs.\u0000In particular, it is not understood how frequently certain information appears,\u0000what information is considered essential, and what information requires human\u0000judgment to distill into an ERR. The study analyzes 72 ERRs sentence-by-sentence, classifying their 4940\u0000sentences into 169 unique question archetypes. We did not predefine the\u0000questions but derived them solely from the statements in the ERRs. This\u0000approach provides an unbiased view of the content of the observed ERRs.\u0000Subsequently, we used public corporate reports to classify the questions'\u0000potential for automation. Answers were labeled \"text-extractable\" if the\u0000answers to the question were accessible in corporate reports. 78.7% of the questions in ERRs can be automated. Those automatable question\u0000consist of 48.2% text-extractable (suited to processing by large language\u0000models, LLMs) and 30.5% database-extractable questions. Only 21.3% of questions\u0000require human judgment to answer. We empirically validate using Llama-3-70B and GPT-4-turbo-2024-04-09 that\u0000recent advances in language generation and information extraction enable the\u0000automation of approximately 80% of the statements in ERRs. Surprisingly, the\u0000models complement each other's strengths and weaknesses well. The research confirms that the current writing process of ERRs can likely\u0000benefit from additional automation, improving quality and efficiency. The\u0000research thus allows us to quantify the potential impacts of introducing large\u0000language models in the ERR writing process. The full question list, including the archetypes and their frequency, will be\u0000made available online after peer review.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The integration of Large Language Models (LLMs) into financial analysis has garnered significant attention in the NLP community. This paper presents our solution to the IJCAI-2024 FinLLM challenge, investigating the capabilities of LLMs within three critical areas of financial tasks: financial classification, financial text summarization, and single stock trading. We adopted Llama3-8B and Mistral-7B as base models, fine-tuning them through Parameter-Efficient Fine-Tuning (PEFT) and Low-Rank Adaptation (LoRA) approaches. To enhance model performance, we combine the datasets from Task 1 and Task 2 for data fusion. Our approach aims to tackle these diverse tasks in a comprehensive and integrated manner, showcasing LLMs' capacity to address diverse and complex financial tasks with improved accuracy and decision-making capabilities.
{"title":"CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications","authors":"Yupeng Cao, Zhiyuan Yao, Zhi Chen, Zhiyang Deng","doi":"arxiv-2407.01953","DOIUrl":"https://doi.org/arxiv-2407.01953","url":null,"abstract":"The integration of Large Language Models (LLMs) into financial analysis has\u0000garnered significant attention in the NLP community. This paper presents our\u0000solution to IJCAI-2024 FinLLM challenge, investigating the capabilities of LLMs\u0000within three critical areas of financial tasks: financial classification,\u0000financial text summarization, and single stock trading. We adopted Llama3-8B\u0000and Mistral-7B as base models, fine-tuning them through Parameter Efficient\u0000Fine-Tuning (PEFT) and Low-Rank Adaptation (LoRA) approaches. To enhance model\u0000performance, we combine datasets from task 1 and task 2 for data fusion. Our\u0000approach aims to tackle these diverse tasks in a comprehensive and integrated\u0000manner, showcasing LLMs' capacity to address diverse and complex financial\u0000tasks with improved accuracy and decision-making capabilities.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141520787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We process private equity transactions to predict public market behavior with a logit model. Specifically, we estimate our model to predict quarterly returns for both the broad market and individual sectors. Our hypothesis is that private equity investments, in aggregate, carry predictive signal about publicly traded securities. The key source of this predictive signal is the fact that, during their diligence process, private equity fund managers are privy to valuable company information that may not yet be reflected in the public markets at the time of their investment. Thus, we posit that we can discover investors' collective near-term insight via detailed analysis of the timing and nature of the deals they execute. We evaluate the accuracy of the estimated model by applying it to test data where we know the correct output value. Remarkably, our model performs consistently better than a null model based simply on return statistics, while showing a predictive accuracy of up to 70% in sectors such as Consumer Services, Communications, and Non-Energy Minerals.
{"title":"Predicting public market behavior from private equity deals","authors":"Paolo Barucca, Flaviano Morone","doi":"arxiv-2407.01818","DOIUrl":"https://doi.org/arxiv-2407.01818","url":null,"abstract":"We process private equity transactions to predict public market behavior with\u0000a logit model. Specifically, we estimate our model to predict quarterly returns\u0000for both the broad market and for individual sectors. Our hypothesis is that\u0000private equity investments (in aggregate) carry predictive signal about\u0000publicly traded securities. The key source of such predictive signal is the\u0000fact that, during their diligence process, private equity fund managers are\u0000privy to valuable company information that may not yet be reflected in the\u0000public markets at the time of their investment. Thus, we posit that we can\u0000discover investors' collective near-term insight via detailed analysis of the\u0000timing and nature of the deals they execute. We evaluate the accuracy of the\u0000estimated model by applying it to test data where we know the correct output\u0000value. Remarkably, our model performs consistently better than a null model\u0000simply based on return statistics, while showing a predictive accuracy of up to\u000070% in sectors such as Consumer Services, Communications, and Non Energy\u0000Minerals.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"136 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141520786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hao Shi, Cuicui Luo, Weili Song, Xinting Zhang, Xiang Ao
The variability and low signal-to-noise ratio in financial data, combined with the necessity for interpretability, make the alpha factor mining workflow a crucial component of quantitative investment. Transitioning from early manual extraction to genetic programming, the most advanced approach in this domain currently employs reinforcement learning to mine a set of combination factors with fixed weights. However, the performance of the resulting alpha factors is inconsistent, and the inflexibility of fixed factor weights proves insufficient for adapting to the dynamic nature of financial markets. To address this issue, this paper proposes AlphaForge, a two-stage formulaic alpha generating framework for alpha factor mining and factor combination. The framework employs a generative-predictive neural network to generate factors, leveraging the robust spatial exploration capabilities inherent in deep learning while preserving diversity. The combination model within the framework incorporates the temporal performance of factors for selection and dynamically adjusts the weights assigned to each component alpha factor. Experiments conducted on real-world datasets demonstrate that our proposed model outperforms contemporary benchmarks in formulaic alpha factor mining. Furthermore, our model exhibits a notable enhancement in portfolio returns within the realm of quantitative investment.
{"title":"AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors","authors":"Hao Shi, Cuicui Luo, Weili Song, Xinting Zhang, Xiang Ao","doi":"arxiv-2406.18394","DOIUrl":"https://doi.org/arxiv-2406.18394","url":null,"abstract":"The variability and low signal-to-noise ratio in financial data, combined\u0000with the necessity for interpretability, make the alpha factor mining workflow\u0000a crucial component of quantitative investment. Transitioning from early manual\u0000extraction to genetic programming, the most advanced approach in this domain\u0000currently employs reinforcement learning to mine a set of combination factors\u0000with fixed weights. However, the performance of resultant alpha factors\u0000exhibits inconsistency, and the inflexibility of fixed factor weights proves\u0000insufficient in adapting to the dynamic nature of financial markets. To address\u0000this issue, this paper proposes a two-stage formulaic alpha generating\u0000framework AlphaForge, for alpha factor mining and factor combination. This\u0000framework employs a generative-predictive neural network to generate factors,\u0000leveraging the robust spatial exploration capabilities inherent in deep\u0000learning while concurrently preserving diversity. The combination model within\u0000the framework incorporates the temporal performance of factors for selection\u0000and dynamically adjusts the weights assigned to each component alpha factor.\u0000Experiments conducted on real-world datasets demonstrate that our proposed\u0000model outperforms contemporary benchmarks in formulaic alpha factor mining.\u0000Furthermore, our model exhibits a notable enhancement in portfolio returns\u0000within the realm of quantitative investment.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141501637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}