Haoren Zhu, Pengfei Zhao, Wilfred Siu Hung NG, Dik Lun Lee
Financial assets exhibit complex dependency structures, which are crucial for investors to create diversified portfolios to mitigate risk in volatile financial markets. To explore the dynamics of financial asset dependencies, we propose a novel approach that models the dependencies of assets as an Asset Dependency Matrix (ADM) and treats ADM sequences as image sequences. This allows us to leverage deep learning-based video prediction methods to capture the spatiotemporal dependencies among assets. However, unlike images, where neighboring pixels exhibit explicit spatiotemporal dependencies due to the natural continuity of object movements, assets in an ADM do not have a natural order. This makes it challenging to arrange related assets so that the spatiotemporal dependencies among neighboring assets are better revealed for ADM forecasting. To tackle these challenges, we propose the Asset Dependency Neural Network (ADNN), which employs the Convolutional Long Short-Term Memory (ConvLSTM) network, a highly successful method for video prediction. ADNN can employ static and dynamic transformation functions to optimize the representations of the ADM. Through extensive experiments, we demonstrate that our proposed framework consistently outperforms the baselines in ADM prediction and downstream application tasks. This research contributes to understanding and predicting asset dependencies, offering valuable insights for financial market participants.
{"title":"Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns","authors":"Haoren Zhu, Pengfei Zhao, Wilfred Siu Hung NG, Dik Lun Lee","doi":"arxiv-2406.11886","DOIUrl":"https://doi.org/arxiv-2406.11886","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-06-13","platform":"Semanticscholar","paperid":"141520791"}
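To make the idea of an ADM sequence concrete, the following is a minimal sketch that builds one dependency matrix per rolling window of returns, so that the resulting list of matrices can be read like video frames. It assumes Pearson correlation as the dependency measure; the abstract does not say which measure the authors use, and the names `pearson`, `adm_sequence`, and `window` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def adm_sequence(returns, window):
    """Build one correlation matrix per rolling window.

    returns: list of per-asset return series, all of the same length.
    window:  number of observations per matrix (hypothetical parameter).
    """
    n_assets, n_obs = len(returns), len(returns[0])
    frames = []
    for start in range(n_obs - window + 1):
        cols = [series[start:start + window] for series in returns]
        frames.append([[pearson(cols[i], cols[j]) for j in range(n_assets)]
                       for i in range(n_assets)])
    return frames


# Toy data: three assets, five return observations each (made-up numbers).
returns = [
    [0.01, -0.02, 0.03, 0.00, 0.02],
    [0.02, -0.01, 0.02, 0.01, 0.01],
    [-0.01, 0.02, -0.03, 0.01, -0.02],
]
frames = adm_sequence(returns, window=4)
print(len(frames), "frames of size", len(frames[0]), "x", len(frames[0][0]))
```

Reordering the assets permutes the rows and columns of every frame, which is why the asset ordering matters once a convolutional model is applied to the sequence.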
The accurate prediction of stock movements is crucial for investment strategies. Stock prices are subject to the influence of various forms of information, including financial indicators, sentiment analysis, news documents, and relational structures. Predominant analytical approaches, however, tend to address only unimodal or bimodal sources, neglecting the complexity of multimodal data. Further complicating the landscape are the issues of data sparsity and semantic conflicts between these modalities, which are frequently overlooked by current models, leading to unstable performance and limiting practical applicability. To address these shortcomings, this study introduces a novel architecture, named Multimodal Stable Fusion with Gated Cross-Attention (MSGCA), designed to robustly integrate multimodal input for stock movement prediction. The MSGCA framework consists of three integral components: (1) a trimodal encoding module, responsible for processing indicator sequences, dynamic documents, and a relational graph, and standardizing their feature representations; (2) a cross-feature fusion module, where primary and consistent features guide the multimodal fusion of the three modalities via a pair of gated cross-attention networks; and (3) a prediction module, which refines the fused features through temporal and dimensional reduction to execute precise movement forecasting. Empirical evaluations demonstrate that the MSGCA framework exceeds current leading methods, achieving performance gains of 8.1%, 6.1%, 21.7% and 31.6% on four multimodal datasets, respectively, attributed to its enhanced multimodal fusion stability.
{"title":"Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism","authors":"Chang Zong, Jian Shao, Weiming Lu, Yueting Zhuang","doi":"arxiv-2406.06594","DOIUrl":"https://doi.org/arxiv-2406.06594","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-06-06","platform":"Semanticscholar","paperid":"141501642"}
Recent advances in deep learning architectures, together with abundant financial data and powerful computers, are transforming finance and motivate us to develop an advanced method for predicting future stock prices. However, the accessibility of investment and trading at everyone's fingertips has made stock markets increasingly intricate and prone to volatility. This increased complexity and volatility have driven demand for models that effectively capture the high volatility and non-linear behavior of stock prices. This study explores gated recurrent neural network (GRNN) algorithms such as LSTM (long short-term memory) and GRU (gated recurrent unit), and hybrid models such as GRU-LSTM and LSTM-GRU, with Tree-structured Parzen Estimator (TPE) Bayesian optimization for hyperparameter optimization (TPE-GRNN). The aim is to improve the prediction accuracy of the next day's closing price of the NIFTY 50 index, a prominent Indian stock market index, using TPE-GRNN. A combination of eight influential factors is carefully chosen from fundamental stock data, technical indicators, crude oil price, and macroeconomic data to train the models to capture changes in the index price together with factors of the broader economy. Single-layer and multi-layer TPE-GRNN models have been developed. The models' performance is evaluated using standard metrics such as R², MAPE, and RMSE. The analysis of the models' performance reveals the impact of feature selection and hyperparameter optimization (HPO) in enhancing stock index price prediction accuracy. The results show that the MAPE of our proposed TPE-LSTM method is the lowest (best) compared with all the previous models for stock index price prediction.
{"title":"Gated recurrent neural network with TPE Bayesian optimization for enhancing stock index prediction accuracy","authors":"Bivas Dinda","doi":"arxiv-2406.02604","DOIUrl":"https://doi.org/arxiv-2406.02604","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-06-02","platform":"Semanticscholar","paperid":"141550542"}
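The three evaluation metrics named in the abstract have standard definitions, sketched below in plain Python. The function names and the toy price values are illustrative assumptions and are not taken from the paper.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Requires non-zero targets."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up closing prices and predictions, for illustration only.
actual = [100.0, 110.0, 120.0]
predicted = [102.0, 108.0, 123.0]
print(rmse(actual, predicted), mape(actual, predicted), r2(actual, predicted))
```

Lower is better for RMSE and MAPE, while R² is better the closer it is to 1. MAPE is scale-free, which makes it convenient for comparing models across indices with different price levels.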
Binomial trees are widely used in the financial sector for valuing securities with early exercise characteristics, such as American stock options. However, while effective in many scenarios, option pricing with CRR binomial trees has limitations. The major ones are volatility estimation, the constant volatility assumption, subjectivity in parameter choices, and the impracticality of instantaneous delta hedging. This paper presents a novel tree, the Gaussian Recombining Split Tree (GRST), which is recombining and does not need a log-normality or normality market assumption. GRST generates a discrete probability mass function of the market data distribution, which approximates a Gaussian distribution with known parameters at any chosen time interval. GRST Mixture builds upon the GRST concept while being flexible enough to fit a large class of market distributions: given 1-D time series data and the moments of the distributions at each time interval, it fits a Gaussian mixture with the same mixture component probabilities applied at each time interval. A Gaussian Recombining Split Tree Mixture comprises several GRSTs tied together using Gaussian mixture component probabilities at the first node. Our extensive empirical analysis shows that the option prices from the GRST align closely with the market.
{"title":"Gaussian Recombining Split Tree","authors":"Yury Lebedev, Arunava Banerjee","doi":"arxiv-2405.16333","DOIUrl":"https://doi.org/arxiv-2405.16333","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-05-25","platform":"Semanticscholar","paperid":"141166277"}
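For context, the baseline that the abstract contrasts with can be sketched as a standard Cox-Ross-Rubinstein (CRR) binomial tree for an American put. This is the textbook CRR scheme, not the GRST proposed in the paper, and the single constant `sigma` input is exactly the constant-volatility assumption the abstract lists as a limitation. The parameter values in the example call are arbitrary.

```python
from math import exp, sqrt


def crr_american_put(s0, strike, rate, sigma, maturity, steps):
    """Price an American put on a non-dividend-paying stock with a CRR tree."""
    dt = maturity / steps
    u = exp(sigma * sqrt(dt))          # up factor
    d = 1.0 / u                        # down factor, so the tree recombines
    p = (exp(rate * dt) - d) / (u - d) # risk-neutral probability of an up move
    disc = exp(-rate * dt)             # one-step discount factor

    # Payoffs at maturity; index j is the number of up moves.
    values = [max(strike - s0 * u ** j * d ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Backward induction: at each node take the better of exercising now
    # and the discounted expected value of continuing.
    for i in range(steps - 1, -1, -1):
        values = [
            max(strike - s0 * u ** j * d ** (i - j),
                disc * (p * values[j + 1] + (1.0 - p) * values[j]))
            for j in range(i + 1)
        ]
    return values[0]


price = crr_american_put(s0=100.0, strike=100.0, rate=0.05,
                         sigma=0.2, maturity=1.0, steps=200)
print(round(price, 4))
```

The early exercise check inside the backward induction is what distinguishes the American case from the European one, and it is the reason tree methods remain popular for such contracts.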
This paper introduces a novel stochastic control framework to enhance the capabilities of automated investment managers, or robo-advisors, by accurately inferring clients' investment preferences from past activities. Our approach leverages a continuous-time model that incorporates utility functions and a generic discounting scheme of a time-varying rate, tailored to each client's risk tolerance, valuation of daily consumption, and significant life goals. We address the resulting time inconsistency issue through state augmentation and the establishment of the dynamic programming principle and the verification theorem. Additionally, we provide sufficient conditions for the identifiability of client investment preferences. To complement our theoretical developments, we propose a learning algorithm based on maximum likelihood estimation within a discrete-time Markov Decision Process framework, augmented with entropy regularization. We prove that the log-likelihood function is locally concave, facilitating the fast convergence of our proposed algorithm. Practical effectiveness and efficiency are showcased through two numerical examples, including Merton's problem and an investment problem with unhedgeable risks. Our proposed framework not only advances financial technology by improving personalized investment advice but also contributes broadly to other fields such as healthcare, economics, and artificial intelligence, where understanding individual preferences is crucial.
{"title":"Inference of Utilities and Time Preference in Sequential Decision-Making","authors":"Haoyang Cao, Zhengqi Wu, Renyuan Xu","doi":"arxiv-2405.15975","DOIUrl":"https://doi.org/arxiv-2405.15975","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-05-24","platform":"Semanticscholar","paperid":"141166351"}
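As a toy illustration of inferring a preference from observed behavior, consider the classical Merton problem with constant relative risk aversion, where the optimal fraction of wealth held in the risky asset is (mu - r) / (gamma * sigma^2). Inverting that formula recovers gamma from an observed allocation. This is only the simplest closed-form case and is not the maximum likelihood algorithm described in the abstract; the function names and numbers are hypothetical.

```python
def merton_fraction(mu, r, sigma, gamma):
    """Optimal risky-asset weight for a CRRA investor in Merton's problem."""
    return (mu - r) / (gamma * sigma ** 2)


def implied_risk_aversion(mu, r, sigma, observed_fraction):
    """Risk aversion that would make the observed allocation optimal."""
    return (mu - r) / (observed_fraction * sigma ** 2)


# Made-up market parameters: 8% drift, 2% risk-free rate, 20% volatility.
weight = merton_fraction(mu=0.08, r=0.02, sigma=0.2, gamma=3.0)
gamma_hat = implied_risk_aversion(mu=0.08, r=0.02, sigma=0.2,
                                  observed_fraction=weight)
print(weight, gamma_hat)
```

In this special case a single observed allocation identifies gamma exactly. The identifiability conditions mentioned in the abstract address the much harder setting with consumption, life goals, and time-varying discounting.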
Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt
Markov decision processes (MDPs) are used to model a wide variety of applications, ranging from game playing and robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision process. However, a large class of problems does not fit straightforwardly into this framework: non-cumulative Markov decision processes (NCMDPs), where instead of the expected sum of rewards, the expected value of an arbitrary function of the rewards is maximized. Example functions include the maximum of the rewards or their mean divided by their standard deviation. In this work, we introduce a general mapping of NCMDPs to standard MDPs. This allows all techniques developed to find optimal policies for MDPs, such as reinforcement learning or dynamic programming, to be directly applied to the larger class of NCMDPs. Focusing on reinforcement learning, we show applications in a diverse set of tasks, including classical control, portfolio optimization in finance, and discrete optimization problems. With our approach, we can improve both final performance and training time compared to relying on standard MDPs.
{"title":"Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning","authors":"Maximilian Nägele, Jan Olle, Thomas Fösel, Remmy Zen, Florian Marquardt","doi":"arxiv-2405.13609","DOIUrl":"https://doi.org/arxiv-2405.13609","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-05-22","platform":"Semanticscholar","paperid":"141151250"}
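One way to see how a non-cumulative objective can be turned into a cumulative one is the maximum-of-rewards case: if the running maximum is tracked alongside the state, each step can emit only the amount by which the maximum has increased, and those increments sum to the final maximum. The sketch below illustrates that telescoping idea for the max objective only. It is an assumption about how such a mapping can look, not a reproduction of the paper's general construction, and `max_to_cumulative` is a hypothetical name.

```python
def max_to_cumulative(rewards):
    """Rewrite rewards so that their plain sum equals max(rewards).

    Returns the adapted rewards and the final running maximum. In an MDP
    wrapper, the running maximum would be appended to the observation so
    that the adapted process stays Markov.
    """
    running_max = None
    adapted = []
    for r in rewards:
        if running_max is None:
            # First step: pass the reward through and start tracking.
            adapted.append(r)
            running_max = r
        else:
            # Later steps: emit only the improvement over the best so far.
            gain = max(0.0, r - running_max)
            adapted.append(gain)
            running_max += gain
    return adapted, running_max


adapted, best = max_to_cumulative([1.0, 3.0, 2.0, 5.0, 4.0])
print(adapted, sum(adapted), best)
```

Because the adapted rewards are summed in the usual way, any standard reinforcement learning algorithm can be applied to the wrapped environment without modification.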
We explore brokerage between traders in an online learning framework. At any round $t$, two traders meet to exchange an asset, provided the exchange is mutually beneficial. The broker proposes a trading price, and each trader tries to sell their asset or buy the asset from the other party, depending on whether the price is higher or lower than their private valuations. A trade happens if one trader is willing to sell and the other is willing to buy at the proposed price. Previous work provided guidance to a broker aiming at enhancing traders' total earnings by maximizing the gain from trade, defined as the sum of the traders' net utilities after each interaction. In contrast, we investigate how the broker should behave to maximize the trading volume, i.e., the total number of trades. We model the traders' valuations as an i.i.d. process with an unknown distribution. If the traders' valuations are revealed after each interaction (full-feedback), and the traders' valuations cumulative distribution function (cdf) is continuous, we provide an algorithm achieving logarithmic regret and show its optimality up to constant factors. If only their willingness to sell or buy at the proposed price is revealed after each interaction ($2$-bit feedback), we provide an algorithm achieving poly-logarithmic regret when the traders' valuations cdf is Lipschitz and show that this rate is near-optimal. We complement our results by analyzing the implications of dropping the regularity assumptions on the unknown traders' valuations cdf. If we drop the continuous cdf assumption, the regret rate degrades to $\Theta(\sqrt{T})$ in the full-feedback case, where $T$ is the time horizon. If we drop the Lipschitz cdf assumption, learning becomes impossible in the $2$-bit feedback case.
{"title":"Trading Volume Maximization with Online Learning","authors":"Tommaso Cesari, Roberto Colomboni","doi":"arxiv-2405.13102","DOIUrl":"https://doi.org/arxiv-2405.13102","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-05-21","platform":"Semanticscholar","paperid":"141151260"}
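The trading rule in the abstract implies that a trade occurs when the proposed price lies between the two valuations. For i.i.d. valuations with a continuous cdf $F$, the probability of that event at price $p$ is $2F(p)(1 - F(p))$, which is largest where $F(p) = 1/2$, that is, at the median. The sketch below simulates a full-feedback broker that simply posts the empirical median of all valuations seen so far. This is a plausible illustration under those assumptions, not the authors' algorithm; the tie-handling rule, the starting price, and the uniform valuation distribution are all assumptions.

```python
import random
from statistics import median


def trade_happens(price, v1, v2):
    """A trade occurs when the price lies between the two valuations.

    Ties are counted as trades here, which is an assumption.
    """
    return min(v1, v2) <= price <= max(v1, v2)


def run_broker(rounds, sample_valuation, first_price=0.5):
    """Full-feedback broker that posts the empirical median each round."""
    seen = []
    trades = 0
    price = first_price
    for _ in range(rounds):
        v1, v2 = sample_valuation(), sample_valuation()
        if trade_happens(price, v1, v2):
            trades += 1
        # Full feedback: both valuations are revealed after the round.
        seen.extend((v1, v2))
        price = median(seen)
    return trades, price


rng = random.Random(0)  # fixed seed so the run is reproducible
trades, last_price = run_broker(1000, rng.random)
print(trades, last_price)
```

With uniform valuations on [0, 1] the median is 0.5 and the best achievable trade probability is 1/2, so roughly half of the rounds should end in a trade once the posted price has settled.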
Raeid Saqur, Ken Kato, Nicholas Vinden, Frank Rudzicz
We introduce and make publicly available the NIFTY Financial News Headlines dataset, designed to facilitate and advance research in financial market forecasting using large language models (LLMs). This dataset comprises two distinct versions tailored for different modeling approaches: (i) NIFTY-LM, which targets supervised fine-tuning (SFT) of LLMs with an auto-regressive, causal language-modeling objective, and (ii) NIFTY-RL, formatted specifically for alignment methods (like reinforcement learning from human feedback (RLHF)) to align LLMs via rejection sampling and reward modeling. Each dataset version provides curated, high-quality data incorporating comprehensive metadata, market indices, and deduplicated financial news headlines systematically filtered and ranked to suit modern LLM frameworks. We also include experiments demonstrating some applications of the dataset in tasks such as stock price movement prediction and the role of LLM embeddings in information acquisition/richness. The NIFTY dataset, along with utilities (such as systematically truncating a prompt's context length), is available on Hugging Face at https://huggingface.co/datasets/raeidsaqur/NIFTY.
{"title":"NIFTY Financial News Headlines Dataset","authors":"Raeid Saqur, Ken Kato, Nicholas Vinden, Frank Rudzicz","doi":"arxiv-2405.09747","DOIUrl":"https://doi.org/arxiv-2405.09747","journal":{"name":"arXiv - QuantFin - Computational Finance"},"publicationDate":"2024-05-16","platform":"Semanticscholar","paperid":"141059900"}
Joseph M. Southgate, Katrina Groth, Peter Sandborn, Shapour Azarm
Recent developments in condition-based maintenance (CBM) have helped make it a promising approach to maintenance cost avoidance in engineering systems. By performing maintenance based on the condition of a component with regard to failure or time, there is potential to avoid the large costs of system shutdown and maintenance delays. However, CBM requires a large investment cost compared to other available maintenance strategies. The investment cost is required for research, development, and implementation. Despite the potential to avoid significant maintenance costs, the large investment cost of CBM makes decision makers hesitant to implement it. This study is the first in the literature that attempts to address the problem of conducting a cost-benefit analysis (CBA) for implementing CBM concepts for unmanned systems. This paper proposes a method for conducting a CBA to determine the return on investment (ROI) of potential CBM strategies. The CBA seeks to compare different CBM strategies based on the differences in the various maintenance requirements associated with maintaining a multi-component, unmanned system. The proposed method uses modular dynamic fault tree analysis (MDFTA) with Monte Carlo simulations (MCS) to assess the various maintenance requirements. The proposed method is demonstrated on an unmanned surface vessel (USV) example taken from the literature that consists of 5 subsystems and 71 components. Following this USV example, it is found that selecting different combinations of components for a CBM strategy can have a significant impact on maintenance requirements and ROI by impacting cost avoidances and investment costs.
{"title":"Cost-Benefit Analysis using Modular Dynamic Fault Tree Analysis and Monte Carlo Simulations for Condition-based Maintenance of Unmanned Systems","authors":"Joseph M. Southgate, Katrina Groth, Peter Sandborn, Shapour Azarm","doi":"arxiv-2405.09519","DOIUrl":"https://doi.org/arxiv-2405.09519","url":null,"abstract":"Recent developments in condition-based maintenance (CBM) have helped make it a promising approach to maintenance cost avoidance in engineering systems. By performing maintenance based on conditions of the component with regards to failure or time, there is potential to avoid the large costs of system shutdown and maintenance delays. However, CBM requires a large investment cost compared to other available maintenance strategies. The investment cost is required for research, development, and implementation. Despite the potential to avoid significant maintenance costs, the large investment cost of CBM makes decision makers hesitant to implement. This study is the first in the literature that attempts to address the problem of conducting a cost-benefit analysis (CBA) for implementing CBM concepts for unmanned systems. This paper proposes a method for conducting a CBA to determine the return on investment (ROI) of potential CBM strategies. The CBA seeks to compare different CBM strategies based on the differences in the various maintenance requirements associated with maintaining a multi-component, unmanned system. The proposed method uses modular dynamic fault tree analysis (MDFTA) with Monte Carlo simulations (MCS) to assess the various maintenance requirements. The proposed method is demonstrated on an unmanned surface vessel (USV) example taken from the literature that consists of 5 subsystems and 71 components. Following this USV example, it is found that selecting different combinations of components for a CBM strategy can have a significant impact on maintenance requirements and ROI by impacting cost avoidances and investment costs.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141059899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Ibikunle, B. Moews, K. Rzayev
We design and train machine learning models to capture the nonlinear interactions between financial market dynamics and high-frequency trading (HFT) activity. In doing so, we introduce new metrics to identify liquidity-demanding and -supplying HFT strategies. Both types of HFT strategies increase activity in response to information events and decrease it when trading speed is restricted, with liquidity-supplying strategies demonstrating greater responsiveness. Liquidity-demanding HFT is positively linked with latency arbitrage opportunities, whereas liquidity-supplying HFT is negatively related, aligning with theoretical expectations. Our metrics have implications for understanding the information production process in financial markets.
{"title":"Can machine learning unlock new insights into high-frequency trading?","authors":"G. Ibikunle, B. Moews, K. Rzayev","doi":"arxiv-2405.08101","DOIUrl":"https://doi.org/arxiv-2405.08101","url":null,"abstract":"We design and train machine learning models to capture the nonlinear interactions between financial market dynamics and high-frequency trading (HFT) activity. In doing so, we introduce new metrics to identify liquidity-demanding and -supplying HFT strategies. Both types of HFT strategies increase activity in response to information events and decrease it when trading speed is restricted, with liquidity-supplying strategies demonstrating greater responsiveness. Liquidity-demanding HFT is positively linked with latency arbitrage opportunities, whereas liquidity-supplying HFT is negatively related, aligning with theoretical expectations. Our metrics have implications for understanding the information production process in financial markets.","PeriodicalId":501294,"journal":{"name":"arXiv - QuantFin - Computational Finance","volume":"1198 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141059836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}