
Journal of Finance and Data Science: Latest Articles

Liquidity risk analysis via drawdown-based measures
Q1 Mathematics Pub Date : 2024-09-24 DOI: 10.1016/j.jfds.2024.100138
Trading volumes are key variables in determining the degree of an asset's liquidity. We examine the volume drawdown process and crash recovery measures in rolling-time windows to assess exposure to liquidity risk. The time-varying windows protect our financial indicators from the massive volume of transactions that characterizes the opening and closing of the stock market. The empirical study is carried out for three Nasdaq-listed assets from April to September 2022. First, we fit a weighted-indexed semi-Markov (WISMC) model to each volume time series, along with EGARCH and GJR models for comparison. Next, we calculate drawdown-based risk measures on real data and on synthetic data simulated from each of the econometric models considered. Finally, we employ the Kullback-Leibler divergence to compare real and simulated risk indicators. Results reveal that the WISMC model reproduces all the drawdown-based risk measures better than the EGARCH and GJR models do for all the considered stocks.
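The pipeline described in this abstract (drawdowns of a volume series in rolling windows, then a Kullback-Leibler comparison of real and simulated indicators) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the drawdown definition (running maximum minus current value), the fixed window length, and the histogram binning are all assumptions, and the function names are hypothetical.

```python
import math

def drawdown(series):
    """Drawdown at each step: running maximum minus the current value (assumed definition)."""
    running_max = float("-inf")
    out = []
    for v in series:
        running_max = max(running_max, v)
        out.append(running_max - v)
    return out

def rolling_max_drawdown(series, window):
    """Maximum drawdown inside each rolling window of a fixed length (an assumption)."""
    return [max(drawdown(series[i:i + window]))
            for i in range(len(series) - window + 1)]

def histogram(values, lo, hi, bins):
    """Normalised histogram; assumes every value lies in [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[k] += 1
    return [c / len(values) for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """Discrete Kullback-Leibler divergence D(p || q) in nats, with eps smoothing."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Toy volume series: drawdown([5, 3, 4, 6, 2]) gives [0, 2, 1, 0, 4]
```

In practice one would compute `rolling_max_drawdown` on the real series and on each simulated series, bin both with `histogram` over a common range, and compare them with `kl_divergence`; a smaller value indicates a model that reproduces the indicator more closely.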
Citations: 0
Reinforcement prompting for financial synthetic data generation
Q1 Mathematics Pub Date : 2024-08-03 DOI: 10.1016/j.jfds.2024.100137

The emergence of Large Language Models (LLMs) has unlocked unprecedented potential for comprehending and generating human-like text, fueling advances in the finance domain – a tool that can shape investment strategies and market predictions. Nevertheless, challenges stemming from the necessity for extensive labeled data and the imperative for data privacy remain. The generation of high-quality synthetic data emerges as a promising avenue to circumvent these issues. In this paper, we introduce a novel methodology, named “Reinforcement Prompting”, to address these challenges. Our strategy employs a policy network as a Selector to generate prompts, and an LLM as an Executor to produce financial synthetic data. This synthetic data generation process preserves data privacy and mitigates the dependency on real-world labeled datasets. We validate the effectiveness of our approach through experimental evaluations. Our results indicate that models trained on synthetic data generated via our approach exhibit competitive performance when compared to those trained on actual financial data, thereby bridging the performance gap. This research provides a novel solution to the challenges of data privacy and labeled data scarcity in financial sentiment analysis, offering considerable advancement in the field of financial machine learning.
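The Selector/Executor split can be illustrated with a toy policy-gradient loop. The sketch below is heavily simplified and rests on assumptions not taken from the paper: the Selector is a softmax policy over a fixed list of prompt templates rather than a policy network, the Executor is a stub function standing in for an LLM, and the reward is a made-up keyword check. All names are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stub_executor(prompt):
    """Stand-in for an LLM call (assumption): echoes the prompt as 'synthetic data'."""
    return f"[synthetic text for: {prompt}]"

def stub_reward(sample):
    """Hypothetical reward: 1.0 if the sample mentions earnings, else 0.0."""
    return 1.0 if "earnings" in sample else 0.0

def train_selector(prompts, executor, reward_fn, steps=2000, lr=0.1, seed=0):
    """REINFORCE with a moving-average baseline over a categorical prompt policy."""
    rng = random.Random(seed)
    logits = [0.0] * len(prompts)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(prompts)), weights=probs)[0]
        reward = reward_fn(executor(prompts[action]))
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(action) with respect to the logits is one_hot(action) - probs.
        for k in range(len(logits)):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)

PROMPTS = [
    "Write a headline about the weather.",
    "Write a sentence about sports.",
    "Write a sentence about quarterly earnings.",
]
```

Calling `train_selector(PROMPTS, stub_executor, stub_reward)` returns the learned selection probabilities; under this toy reward the policy should concentrate on the third prompt. In the setting the abstract describes, the reward would presumably come from the quality of the generated data for the downstream task.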

Citations: 0
Research on credit card default repayment prediction model
Q1 Mathematics Pub Date : 2024-07-11 DOI: 10.1016/j.jfds.2024.100136

This study compares the predictive ability of various machine learning models for credit card default repayment within different prediction frameworks, using data from a commercial bank in China. First, using different tree models, we explore the impact of different factors on post-default repayment. Second, a split-sample time series prediction is carried out with two neural network algorithms, the back-propagation neural network (BPNN) and the extreme learning machine (ELM). The outcomes indicate that ELM yields significantly better prediction performance than the BPNN model. Third, the predictive performances of ten machine learning models are compared using full-sample data. The findings demonstrate that the XGBoost and ELM models have superior predictive performance in full-sample analyses. Fourth, this study employs the empirical mode decomposition (EMD) technique to examine the predictive ability of the XGBoost and ELM models on data of various frequencies. The results indicate that predictive efficacy may differ depending on the frequency and the repayment period after default. The findings are valuable for commercial banks in developing a framework and selecting a methodology to address the challenge of predicting credit card default payments.
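For readers unfamiliar with ELM, the idea is a single hidden layer whose weights are drawn at random and left untrained, with only the output weights fitted by least squares. The following is a minimal pure-Python sketch under assumptions of my own (tanh activation, uniform random weights, a small ridge term, a toy regression target); it does not reproduce the study's features or data.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

class ELM:
    """Extreme learning machine: random fixed hidden layer, least-squares output layer."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W, self.b)]

    def fit(self, X, y):
        H = [self._hidden(x) for x in X]
        k = len(self.b)
        # Ridge-regularised normal equations: (H^T H + ridge * I) beta = H^T y
        A = [[sum(h[i] * h[j] for h in H) + (self.ridge if i == j else 0.0)
              for j in range(k)] for i in range(k)]
        rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(k)]
        self.beta = solve(A, rhs)
        return self

    def predict(self, X):
        return [sum(bt * h for bt, h in zip(self.beta, self._hidden(x))) for x in X]
```

A call such as `ELM(n_inputs=1, n_hidden=8).fit(X, y).predict(X)` fits and evaluates the model in one pass. Because no iterative training is involved, fitting is typically much faster than back-propagation.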

Citations: 0
CPC-SAX: Data mining of financial chart patterns with symbolic aggregate approXimation and instance-based multilabel classification
Q1 Mathematics Pub Date : 2024-06-03 DOI: 10.1016/j.jfds.2024.100132
Konstantinos Nikolaou

To classify financial chart patterns through machine learning, we introduced and applied a novel classification algorithm to time series data of different financial assets, built on the SAX (Symbolic Aggregate approXimation) transformation algorithm. After applying a linear regression model to the features of a dataset to reduce the number of parameters needed, converting real-valued data to strings of characters through Piecewise Aggregate Approximation (PAA), and labelling each level in increasing order with characters of the Latin alphabet, the new algorithm, called CPC-SAX (Chart Pattern Classification), compares vectors describing the ASCII value changes along the string and classifies them using already labelled SAX-transformed data. The results show satisfactory accuracy scores on data of different time windows and types of assets. We also obtain information on the appearance of said patterns. By reaching our goal of properly classifying chart patterns as they appear, we can have a better indication of the future price trend, allowing the investor/trader to make better informed decisions.
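The transformation steps named in the abstract (PAA, symbol assignment, ASCII-change vectors, comparison against labelled examples) can be sketched as below. This is an illustration under assumptions, not the CPC-SAX algorithm itself: it uses the standard SAX Gaussian breakpoints, omits the linear-regression step, and replaces the paper's multilabel classifier with a plain single-label 1-nearest-neighbour rule. Function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def paa(series, n_segments):
    """Z-normalise, then average over n_segments chunks (needs len(series) >= n_segments)."""
    mu, sigma = mean(series), pstdev(series)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in series]
    n = len(z)
    return [mean(z[i * n // n_segments:(i + 1) * n // n_segments])
            for i in range(n_segments)]

def sax(series, n_segments, alphabet_size):
    """Map each PAA level to a letter, 'a' being the lowest level."""
    cuts = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    word = ""
    for v in paa(series, n_segments):
        level = sum(v > c for c in cuts)  # number of breakpoints below v
        word += chr(ord("a") + level)
    return word

def ascii_changes(word):
    """Vector of ASCII value differences between consecutive symbols."""
    return [ord(b) - ord(a) for a, b in zip(word, word[1:])]

def classify(word, labelled):
    """1-nearest neighbour on ASCII-change vectors; labelled is a list of (word, label)."""
    target = ascii_changes(word)

    def dist(other):
        return sum((x - y) ** 2 for x, y in zip(target, ascii_changes(other)))

    return min(labelled, key=lambda item: dist(item[0]))[1]

# sax([1, 2, 3, 4, 5, 6, 7, 8], 4, 4) gives "abcd"; its change vector is [1, 1, 1]
```

Comparing change vectors rather than raw symbols makes the match depend on the shape of the pattern and not on its absolute level, which is presumably why the abstract describes the comparison in terms of ASCII value changes.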

Citations: 0
Explicit formulae for the valuation of European options with price impacts
Q1 Mathematics Pub Date : 2024-05-24 DOI: 10.1016/j.jfds.2024.100133
Gerardo Hernández-del-Valle , Julio César Rodríguez-Burgos , Héctor Jasso-Fuentes

In this work, we examine the consequences of trading a large position in vanilla European options within a multi-period binomial model framework for the underlying asset price, S. Given the significant size of the transaction, we expect both the derivative's price and the underlying asset's price to be affected by market impacts. Consequently, derivative valuation should incorporate these effects. To address this, we not only utilize a multi-period binomial model to represent the price process S but also incorporate trading impacts in a multiplicative manner.

Moreover, we conduct our analysis in discrete time to better capture the influence of price impacts. Our findings suggest, for instance, that the strike price should be determined by both the trade's magnitude and parameterized market impacts. We present explicit formulas for European option prices under market impacts and offer numerical examples to elucidate our findings. Upon request, we can provide code implemented in the statistical package R.
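As background for the setting, the sketch below prices a European call in the standard multi-period binomial model and adds a single multiplicative factor on the underlying price. The baseline formula is the textbook one; the `impact` parameter is purely a hypothetical placeholder for a market-impact effect and is not the paper's parameterisation or its explicit formula.

```python
from math import comb

def binomial_call(s0, strike, u, d, r, n, impact=1.0):
    """European call in an n-period binomial model.

    s0: initial price; u, d: up and down factors; r: per-period interest rate.
    impact: hypothetical multiplicative factor applied to the terminal price
            (1.0 means no price impact). This is an illustrative assumption.
    """
    q = (1 + r - d) / (u - d)  # risk-neutral up probability
    if not 0 < q < 1:
        raise ValueError("no-arbitrage condition d < 1 + r < u is violated")
    total = 0.0
    for k in range(n + 1):
        terminal = impact * s0 * u ** k * d ** (n - k)
        total += comb(n, k) * q ** k * (1 - q) ** (n - k) * max(terminal - strike, 0.0)
    return total / (1 + r) ** n
```

Under this toy assumption, max(m·S − K, 0) = m·max(S − K/m, 0), so a multiplicative impact m is equivalent to pricing m calls with the adjusted strike K/m. That gives some intuition for why an effective strike could depend on trade size and impact parameters, though the paper's actual result may take a different form.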

Citations: 0
Revising data collection methodology - evidence from the Australian financial sector
Q1 Mathematics Pub Date : 2024-05-21 DOI: 10.1016/j.jfds.2024.100131
Ben Neilson , Tom Marty , Nat Daley

Time requirements of data collection account for a significant portion of the total time required to provide financial advice. This research applies data collection software to the financial planning process, seeking to identify benefits that may help reduce rising barriers to accessing financial advice. The experimental two-phase study first seeks qualitative input on problematic themes, then records quantitative input on the impacts of data collection software use. The research seeks evidence of the beneficial impacts that software use may have on data collection requirements by comparing traditional and software methodologies in Australian professional practice. Respondents were asked to complete data collection inputs using both traditional and digital methods, with metrics recorded throughout the process. Input from 112 consumers and 71 practising advisers was recorded. Results suggest that the use of software may decrease the time taken to complete the task and often results in higher levels of data accuracy. Traditional methods were associated with extended time periods and lower levels of data accuracy. The results aim to evolve methods of traditional practice within the financial sector. The research provides original contributions to the financial planning literature by examining the potential impact data collection methodologies may have on reducing barriers to accessing financial services in Australia.

Citations: 0
Cluster-based regression using variational inference and applications in financial forecasting
Q1 Mathematics Pub Date : 2024-05-08 DOI: 10.1016/j.jfds.2024.100130
Udai Nagpal , Krishan Nagpal

This paper describes an approach to simultaneously identify clusters and estimate cluster-specific regression parameters from the given data. Such an approach can be useful in learning the relationship between input and output when the regression parameters for estimating output are different in different regions of the input space. Variational Inference (VI), a machine learning approach to obtain posterior probability densities using optimization techniques, is used to identify clusters of explanatory variables and regression parameters for each cluster. From these results, one can obtain both the expected value and the full distribution of predicted output. Other advantages of the proposed approach include the elegant theoretical solution and clear interpretability of results. The proposed approach is well-suited for financial forecasting where markets have different regimes (or clusters) with different patterns and correlations of market changes in each regime. In financial applications, knowledge about such clusters can provide useful insights about portfolio performance and identify the relative importance of variables in different market regimes. An illustrative example of predicting one-day S&P change is considered to illustrate the approach and compare the performance of the proposed approach with standard regression without clusters. Due to the broad applicability of the problem, its elegant theoretical solution, and the computational efficiency of the proposed algorithm, the approach may be useful in a number of areas extending beyond the financial domain.
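To make the idea of cluster-specific regression parameters concrete, here is a deliberately simplified sketch. It does not use Variational Inference: it swaps in a hard-assignment alternating scheme (assign each point to the line that fits it best, then refit each line by ordinary least squares), works only on one-dimensional inputs, and returns point estimates rather than posterior distributions. All names are hypothetical.

```python
def ols(points):
    """Least-squares line (intercept, slope) through (x, y) points with >= 2 distinct x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope

def clusterwise_regression(points, k=2, iterations=20):
    """Alternate between assigning points to their best-fitting line and refitting lines."""
    pts = sorted(points)
    size = len(pts) // k
    # Initialise clusters as contiguous regions of the input space (an assumption);
    # each initial region must contain at least two distinct x values.
    groups = [pts[i * size:(i + 1) * size] if i < k - 1 else pts[i * size:]
              for i in range(k)]
    lines = [ols(g) for g in groups]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x, y in pts:
            errors = [abs(y - (a + b * x)) for a, b in lines]
            groups[errors.index(min(errors))].append((x, y))
        # Keep the previous line when a cluster has too few distinct x values to refit.
        lines = [ols(g) if len({x for x, _ in g}) >= 2 else lines[i]
                 for i, g in enumerate(groups)]
    return lines, groups
```

Given data that follow y = x in one region and y = 20 - 2x in another, the routine recovers the two slopes. Unlike the approach the abstract describes, this shortcut yields neither a full predictive distribution nor soft cluster memberships.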

Citations: 0
Deep unsupervised anomaly detection in high-frequency markets
Q1 Mathematics Pub Date : 2024-04-19 DOI: 10.1016/j.jfds.2024.100129
Cédric Poutré , Didier Chételat , Manuel Morales

Inspired by recent advances in the deep learning literature, this article introduces a novel hybrid anomaly detection framework specifically designed for limit order book (LOB) data. A modified Transformer autoencoder architecture is proposed to learn rich temporal LOB subsequence representations, which eases the separability of normal and fraudulent time series. A dissimilarity function is then learned in the representation space to characterize normal LOB behavior, enabling the detection of any anomalous subsequences out-of-sample. We also develop a complete trade-based manipulation simulation methodology able to generate a variety of scenarios derived from actual trade-based fraud cases. The complete framework is tested on LOB data of five NASDAQ stocks in which we randomly insert synthetic quote stuffing, layering, and pump-and-dump manipulations. We show that the proposed asset-independent approach achieves new state-of-the-art fraud detection performance, without requiring any prior knowledge of manipulation patterns.

Citations: 0
Investigating the relationship between processes and profit: A work-based assessment of process used in Australian financial planning firms
Q1 Mathematics Pub Date : 2024-03-05 DOI: 10.1016/j.jfds.2024.100128
Ben Neilson

The research explores the relationship dynamics between process and profit in Australian professional practice. We analyse data collected from a sample of 134 financial planning firms located in Southeast Queensland. The research introduces a complete financial planning process framework designed to measure the impact that process may have on the relationship with firm profit. Quantitative profit data was recorded using Dovetail software to capture results and evidence regression between groups. The research found that firms’ processes are positively associated with profit, and that both process and profit contribute to the decreasing influence of firm agency theory. The research suggests that process could be leveraged as an asset to develop commercial advantages. The research may help identify new measures of standard practice, develop the perception of Australian financial firms, and help reduce barriers to accessing financial services.

Citations: 0
Overlooked biases from misidentifications of causal structures
Q1 Mathematics Pub Date : 2024-02-29 DOI: 10.1016/j.jfds.2024.100127
Simone Cenci

Testing theories and explaining phenomena in empirical finance often requires estimating causal effects from observational data. In this note, we argue that some of the standard practices to address endogeneity concerns in regression-based estimation approaches can, when not correctly implemented and their results not appropriately interpreted, generate additional, often overlooked, problems. We identify three main systemic issues in empirical finance, provide theoretical and numerical examples to illustrate and support our arguments, and propose solutions to overcome these limitations. Overall, we suggest that these issues are caused by a systematic underestimation of the importance of robust ex-ante identification, and interpretation, of causal structures in empirical studies in finance.
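
The abstract does not name its three issues, so the following is only a generic illustration, not the author's example. It sketches one well-known way a misidentified causal structure can bias a regression: adding a "control" that is actually caused by both the regressor and the outcome (a collider). The data-generating process, the variable names `x`, `y`, `c`, and the helper functions are all assumptions made for this sketch.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after an OLS fit on x (with intercept)."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def slope_controlling_for(x, y, c):
    """Coefficient on x in an OLS of y on x and c.

    Uses the Frisch-Waugh-Lovell result: partial c out of both x and y,
    then regress the residuals on each other.
    """
    return slope(residuals(c, x), residuals(c, y))


def simulate(n=20000, true_effect=1.0, seed=0):
    """Hypothetical data: x causes y, and c is caused by both x and y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_effect * xi + rng.gauss(0, 1) for xi in x]
    c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]
    naive = slope(x, y)                           # y on x only
    controlled = slope_controlling_for(x, y, c)   # y on x and the collider c
    return naive, controlled


if __name__ == "__main__":
    naive, controlled = simulate()
    print(f"true effect: 1.0, y~x: {naive:.3f}, y~x+c: {controlled:.3f}")
```

Under these assumed parameters, the regression without the extra control recovers a coefficient close to the true effect of 1, while the regression that conditions on `c` drives the coefficient on `x` toward 0. The point of the sketch is that whether a control helps or hurts depends on the causal structure assumed before estimation.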

Citations: 0