
The Journal of Financial Data Science: Latest Articles

An Integrated Framework on Human-in-the-Loop Risk Analytics
Pub Date : 2022-12-15 DOI: 10.3905/jfds.2022.1.116
Peng Liu
Risk analytics is an integral component in the overall assessment of the risk profile for potential and existing obligors. For example, creditworthiness is often assessed via scorecards, which are regulatory credit risk models developed from historical data and domain expertise in banks and financial institutions. A purely statistical model, however, often fails to satisfy regulatory requirements on both predictiveness and interpretability at the same time. Instead, practical risk models are developed by incorporating expert opinions within the development process, such as forcing the direction of travel for certain financial factors. In this article, the author proposes a unified framework, termed the constrained and partially regularized logistic regression (CPR-LR) model, for embedding human inputs in the statistical estimation procedure when developing credit risk models. By expressing such inputs as model constraints at different levels, the proposed approach serves as an effective solution for developing intuitive, easy-to-interpret, and statistically robust credit risk models, as demonstrated in the author’s experiments. This work also contributes to the growing field of human-in-the-loop model development, in which the author shows that domain expertise can be formulated as model constraints, thus biasing the resulting statistical model to be more interpretable and regulation-compliant.
Citations: 0
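The idea of expressing expert input as model constraints can be illustrated with a small sketch. This is not the author's CPR-LR implementation, which the abstract does not specify. It assumes two kinds of human input: a required sign for selected coefficients (the "direction of travel") and a choice of which coefficients receive an L2 penalty. The function name `fit_constrained_logit`, the projected-gradient solver, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def fit_constrained_logit(X, y, signs, penalized, lam=1.0, lr=0.1, n_iter=5000):
    """Logistic regression with sign constraints and a partial L2 penalty.

    signs[j]     : +1 forces beta_j >= 0, -1 forces beta_j <= 0, 0 leaves it free.
    penalized[j] : True if beta_j is included in the L2 penalty.
    Solved by projected gradient descent (an illustrative choice of solver).
    """
    n, p = X.shape
    signs = np.asarray(signs)
    mask = np.asarray(penalized, dtype=float)
    beta = np.zeros(p)
    intercept = 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta + intercept)))
        resid = prob - y
        # Gradient of the mean log-loss plus the penalty on the masked coefficients.
        grad = X.T @ resid / n + lam * mask * beta
        beta = beta - lr * grad
        intercept -= lr * resid.mean()
        # Projection step: clip each constrained coefficient back to its allowed sign.
        beta = np.where(signs > 0, np.maximum(beta, 0.0), beta)
        beta = np.where(signs < 0, np.minimum(beta, 0.0), beta)
    return intercept, beta

# Synthetic data (hypothetical): the second factor truly has a negative effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_beta = np.array([1.0, -0.5, 0.3])
y = (rng.random(500) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

# Expert input: factors 0 and 1 must not lower the score; only factor 2 is penalized.
intercept, beta = fit_constrained_logit(
    X, y, signs=[1, 1, 0], penalized=[False, False, True]
)
```

Because the data disagree with the sign imposed on the second factor, the projection holds that coefficient at the boundary rather than letting it go negative, which is the kind of trade-off between fit and interpretability the abstract describes.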
Ensemble Meta-Labeling
Pub Date : 2022-12-14 DOI: 10.3905/jfds.2022.1.114
Dennis Thumm, P. Barucca, J. Joubert
This study systematically investigates different ensemble methods for meta-labeling in finance and presents a framework to facilitate the selection of ensemble learning models for this purpose. Experiments were conducted on the components of information advantage and modeling for false positives to discover whether ensembles were better at extracting and detecting regimes and whether they increased model efficiency. The authors demonstrate that ensembles are especially beneficial when the underlying data consist of multiple regimes and are nonlinear in nature. The authors’ framework serves as a starting point for further research. They suggest that the use of different fusion strategies may foster model selection. Finally, the authors elaborate on how additional applications, such as position sizing, may benefit from their framework.
Citations: 0
ESG Text Classification: An Application of the Prompt-Based Learning Approach
Pub Date : 2022-12-14 DOI: 10.3905/jfds.2022.1.115
Zhengzheng Yang, Le Zhang, Xiaoyu Wang, Yubo Mai
Over the past decade, there has been a surging trend to integrate environmental, social, and governance (ESG) criteria into financial decision making. ESG information extracted manually from text sources, such as company statements, press releases, and regulatory disclosures, can be expensive and inconsistent due to human interpretation. In this article, the authors introduce the application of prompt-based learning, a cutting-edge natural language processing (NLP) technology, to classify textual data into ESG and non-ESG categories. In particular, the authors establish a prompt-based ESG classifier, using data from Refinitiv, and benchmark it against a traditional pre-train and fine-tune classifier through statistical tests. The authors fine-tune the classifiers on various sizes of training data. The experiment shows that the prompt-based learning approach outperforms the traditional pre-train and fine-tune classifier and can generate promising results when training data are limited.
Citations: 1
Point-in-Time Language Model for Geopolitical Risk Events
Pub Date : 2022-12-14 DOI: 10.3905/jfds.2022.1.113
Matthias Apel, A. Betzer, B. Scherer
In this article, the authors show how to build a real-time geopolitical risk index from news data using textual analysis. The presented method defines a point-in-time dictionary of terms related to political tension. It does not rely on the in-sample definition of a set of n-grams that are likely chosen and updated with hindsight bias. The proposed model can be applied to any topic and is language agnostic. Only a few topic-related words are required to initialize the buildup of a dynamically self-adjusting dictionary. The authors show that their approach yields results resembling those of other, more supervised methods. The findings indicate how topic identification and news index construction may benefit from time-dependent dictionary generation.
Citations: 0
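The point-in-time principle in the abstract above (the dictionary at any date may only use text available up to that date) can be sketched in a few lines. The expansion rule below is a hypothetical stand-in, since the abstract does not describe the authors' actual rule: a word is added once it has appeared in at least `min_count` documents and at least a `min_ratio` share of those documents also contain a current dictionary term. All function names and the toy headlines are assumptions for illustration.

```python
from collections import Counter

def update_dictionary(dictionary, docs, min_ratio=0.5, min_count=2):
    """Expand the dictionary with words that mostly occur in on-topic documents."""
    total = Counter()
    on_topic = Counter()
    for doc in docs:
        words = set(doc.lower().split())
        is_on_topic = bool(words & dictionary)
        for word in words:
            total[word] += 1
            if is_on_topic:
                on_topic[word] += 1
    new_terms = {
        word
        for word, count in total.items()
        if count >= min_count and on_topic[word] / count >= min_ratio
    }
    return dictionary | new_terms

def point_in_time_dictionaries(dated_docs, seeds):
    """Return {period: dictionary}, each built only from documents up to that period."""
    dictionary = set(seeds)
    history = {}
    seen = []
    for period in sorted({p for p, _ in dated_docs}):
        seen.extend(text for p, text in dated_docs if p == period)
        dictionary = update_dictionary(dictionary, seen)
        history[period] = set(dictionary)
    return history

# Toy headlines (hypothetical), tagged with an integer period.
dated_docs = [
    (1, "troops gather near border"),
    (1, "markets rally on earnings"),
    (2, "sanctions follow troops movement"),
    (2, "sanctions hit exports"),
    (3, "earnings season begins"),
]
dictionaries = point_in_time_dictionaries(dated_docs, seeds={"troops"})
```

In this toy run the term "sanctions" enters the dictionary only in period 2, when it first co-occurs with a seed word, so an index computed for period 1 cannot be influenced by vocabulary that emerged later.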
Relevance-Based Prediction: A Transparent and Adaptive Alternative to Machine Learning
Pub Date : 2022-12-01 DOI: 10.3905/jfds.2022.1.110
M. Czasonis, M. Kritzman, D. Turkington
The authors describe a new prediction system based on relevance, which gives a mathematically precise measure of the importance of an observation to forming a prediction, as well as fit, which measures a specific prediction’s reliability. They show how their relevance-based approach to prediction identifies the optimal combination of observations and predictive variables for any given prediction task, thereby presenting a unified alternative to both kernel regression and lasso regression, which they call CKT regression. They argue that their new prediction system addresses complexities that are beyond the capacity of linear regression analysis but in a way that is more transparent, more flexible, and less arbitrary than widely used machine learning algorithms.
Citations: 1
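A short numerical sketch may help make "relevance" concrete. The abstract above does not state the formula, so the definition used here is an assumption: the relevance of observation x_i to the prediction circumstances x_t is taken to be the Mahalanobis-type product (x_i - mean)' inv(cov) (x_t - mean), and the prediction is a relevance-weighted average of past outcomes. Under that assumption, using every observation reproduces the ordinary least squares prediction, which the sketch checks. The partial-sample selection behind the CKT regression mentioned in the abstract is not shown.

```python
import numpy as np

def relevance(X, x_t):
    """Assumed relevance of each row of X to the circumstances x_t."""
    mean = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    return (X - mean) @ cov_inv @ (x_t - mean)

def relevance_prediction(X, y, x_t):
    """Relevance-weighted prediction over the full sample."""
    r = relevance(X, x_t)
    return y.mean() + r @ (y - y.mean()) / (len(y) - 1)

# Synthetic data (hypothetical).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.1, size=200)
x_t = np.array([0.2, -0.4, 1.0])

pred_relevance = relevance_prediction(X, y, x_t)

# Ordinary least squares with an intercept, for comparison.
A = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred_ols = np.concatenate([[1.0], x_t]) @ coef
```

The two predictions agree to floating-point precision, which shows that each observation's weight in a linear regression forecast can be read off directly as its relevance.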
Visualizing Structures in Financial Time-Series Datasets through Affinity-Based Diffusion Transition Embedding
Pub Date : 2022-12-01 DOI: 10.3905/jfds.2022.1.111
Rui Ding
In this work, the author proposes a modified version of PHATE, a diffusion map-based embedding algorithm, that is tuned primarily for financial time-series data. The new algorithm, financial affinity-based diffusion transition embedding (FATE), takes in user-specified distance metrics that make sense for time-series data and uses symmetrized f-divergences applied to the diffusion probabilities as the final embedding distance before passing them into a metric multidimensional scaling step. The proposed visualization method reveals both local and global structures of the input time-series dataset. Performance of this visualization algorithm is first demonstrated through numerical experiments with Dow Jones 30 stock returns and S&P 100 stock returns. The author compares FATE visualization results using correlation-type distances with t-stochastic neighbor embedding and PHATE embeddings, among others, to demonstrate the advantages and new perspectives of FATE both qualitatively and quantitatively. In addition, experiments on synthetic ARMA time series with fine control of the structure of the underlying model parameters are provided. The results demonstrate the ability of transfer function information distance and time-lagged Hellinger distance to identify structures within the generating time-series models from their time-series realizations alone, which cannot be identified by correlation-type distances or Euclidean distances. The author concludes that the choice of distance metrics has an important role in the kind of structure one can uncover from time-series datasets.
Citations: 2
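The pipeline outlined in the abstract above (a time-series distance, diffusion probabilities, a symmetrized f-divergence, then multidimensional scaling) can be sketched roughly as follows. This is not the FATE algorithm itself. It assumes a correlation distance, a Gaussian kernel with a median bandwidth, the Hellinger distance (whose square is a symmetric f-divergence) between rows of the diffused transition matrix, and classical MDS via eigendecomposition in place of the metric MDS step the abstract names. The function name and the synthetic returns are hypothetical.

```python
import numpy as np

def diffusion_hellinger_embedding(returns, t=3, n_components=2):
    """Embed the columns (assets) of a T x N return matrix in n_components dimensions."""
    # 1. Correlation-type distance between assets.
    corr = np.corrcoef(returns, rowvar=False)
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    # 2. Gaussian affinities, row-normalized into a Markov transition matrix.
    eps = np.median(dist[dist > 0])
    kernel = np.exp(-((dist / eps) ** 2))
    P = kernel / kernel.sum(axis=1, keepdims=True)
    # 3. t-step diffusion probabilities.
    P_t = np.linalg.matrix_power(P, t)
    # 4. Hellinger distance between every pair of diffusion rows.
    root = np.sqrt(P_t)
    diff = root[:, None, :] - root[None, :, :]
    hellinger = np.sqrt(0.5 * (diff ** 2).sum(axis=2))
    # 5. Classical MDS: double-center the squared distances and eigendecompose.
    n = hellinger.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (hellinger ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    coords = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
    return hellinger, coords

# Synthetic returns (hypothetical): the first four assets share a common factor.
rng = np.random.default_rng(2)
returns = rng.normal(size=(250, 8))
returns[:, :4] += rng.normal(size=(250, 1))

hellinger, coords = diffusion_hellinger_embedding(returns)
```

Swapping the correlation distance in step 1 for another time-series distance changes which structure the embedding can reveal, which is the point the abstract's conclusion makes.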
Managing Editor’s Letter
Pub Date : 2022-10-31 DOI: 10.3905/jfds.2022.4.4.001
F. Fabozzi
Citations: 0
Using Machine Learning to Model Advised-Investor Behavior
Pub Date : 2022-10-31 DOI: 10.3905/jfds.2022.4.4.025
Han-Tai Shiao, Cynthia A. Pagliaro, D. Mehta
During periods of extreme market volatility, such as that experienced during the COVID-19 pandemic, advised investors may consider impulsive and inappropriate investment decisions like moving all assets to cash. Financial advisors, through proactive behavioral coaching, can help their clients avoid such decisions. But which clients need the most help? A predictive model that better identifies the clients most likely to react to market volatility can be an invaluable tool for financial advisors. Such a model requires insight into the investors’ mindset. In previous work, the authors focused on the perspective of the financial advisor and used natural language processing to explore advisors’ summary notes to extract such investor insights. They then used this novel data source as input for a machine-learning model to predict the investors most in need of intervention during volatile market periods. In this article, the authors further expand the model to include a unique dataset of investors’ digital activity, including investor-initiated contacts (via web, email, and phone) and web activity (page view and browsing history), to better reveal investor intention. Using machine-learning techniques, the authors build a model using this novel dataset as well as advisor notes, transaction activity, and a market volatility index to identify advised investors most in need of proactive intervention. The authors further describe the implications such work has for both traditional and robo-advisory service models.
Citations: 0
Tackling the Exponential Scaling of Signature-Based Generative Adversarial Networks for High-Dimensional Financial Time-Series Generation
Pub Date : 2022-09-24 DOI: 10.3905/jfds.2022.1.109
Fernando de Meer Pardo, Peter Schwendner, Marcus Wunsch
Generative adversarial networks (GANs) have been shown to be able to generate samples of complex financial time series, particularly by employing the concept of path signatures, a universal description of the geometric properties of a data stream whose expected value uniquely characterizes the time series. Specifically, the SigCWGAN model (Ni et al. 2020) can generate time series of arbitrary length; however, the parameters of the neural network employed grow exponentially with the dimension of the underlying time series, which makes the model intractable when seeking to generate large financial market scenarios. To overcome this problem of dimensionality, the authors propose an iterative generation procedure relying on the concept of hierarchies in financial markets. The authors construct an ensemble of GANs that they call the Hierarchical-SigCWGAN, which is based on hierarchical clustering that approximates signatures in the spirit of the original model. The Hierarchical-SigCWGAN can scale to higher dimensions and generate large-dimensional scenarios in which the joint behavior of all the assets in the market is replicated. The model is validated by comparing its performance on a series of similarity metrics with respect to the original SigCWGAN on a dataset in which it is still tractable and by showing its scalability on a larger dataset.
Citations: 0
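To show why signature-based models become expensive as the number of assets grows, the sketch below counts the terms of a truncated path signature and computes the first two signature levels of a piecewise-linear path using Chen's identity. It illustrates path signatures in general, not the SigCWGAN or Hierarchical-SigCWGAN models; the function names and the random path are assumptions for illustration.

```python
import numpy as np

def signature_size(dim, depth):
    """Number of terms in a signature truncated at `depth` for a `dim`-dimensional path."""
    return sum(dim ** k for k in range(1, depth + 1))

def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path given as a (T, d) array."""
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for step in np.diff(path, axis=0):
        # Chen's identity: append one linear segment to the path processed so far.
        level2 += np.outer(level1, step) + 0.5 * np.outer(step, step)
        level1 += step
    return level1, level2

# A random three-dimensional path (hypothetical data).
rng = np.random.default_rng(3)
path = np.cumsum(rng.normal(size=(50, 3)), axis=0)
level1, level2 = signature_depth2(path)

# Term counts at depth 4 for paths of increasing dimension.
sizes = {d: signature_size(d, 4) for d in (2, 10, 50)}
```

Level 1 is simply the total increment of the path, and the symmetric part of level 2 is determined by level 1, so the new information at level 2 lies in its antisymmetric part. The term counts in `sizes` grow from 30 for two assets to more than six million for fifty, which gives a sense of the scaling problem the abstract describes.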
Meta-Labeling Architecture
Pub Date : 2022-09-16 DOI: 10.3905/jfds.2022.1.108
M. Meyer, J. Joubert, Mesias Alfeus
Separating the side and size of a position allows for sophisticated strategy structures to be developed. Modeling the size component can be done through a meta-labeling approach. This article establishes several heterogeneous architectures to account for key aspects of meta-labeling. They serve as a guide for practitioners in the model development process, as well as for researchers to further build on these ideas. An architecture can be developed through the lens of feature- and/or strategy-driven approaches. The feature-driven approach exploits the way the information in the data is structured and how the selected models use that information, whereas a strategy-driven approach specifically aims to incorporate unique characteristics of the underlying trading strategy. Furthermore, the concept of inverse meta-labeling is introduced as a technique to improve the quantity and quality of the side forecasts.
Citations: 0
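The separation of side and size described in the abstract above can be sketched with two small helpers. This is a minimal illustration of the basic meta-labeling idea, not one of the article's architectures. It assumes a primary model that outputs a side of +1 or -1 and a secondary model whose predicted probability that the primary call is correct is used as the position size. The function names, the 0.5 threshold, and the sample numbers are hypothetical, and the probabilities stand in for the output of a fitted classifier.

```python
import numpy as np

def meta_labels(side, returns):
    """Meta-label is 1 where the primary model's side was profitable, else 0."""
    return (np.asarray(side) * np.asarray(returns) > 0).astype(int)

def position(side, prob_correct, threshold=0.5):
    """Trade only when the secondary model is confident, sizing by its probability."""
    side = np.asarray(side, dtype=float)
    prob = np.asarray(prob_correct, dtype=float)
    return np.where(prob > threshold, side * prob, 0.0)

# Primary model sides and the returns that followed (hypothetical values).
side = np.array([1, -1, 1, -1])
returns = np.array([0.02, 0.01, -0.03, -0.01])

# Targets on which a secondary classifier would be trained.
labels = meta_labels(side, returns)

# Placeholder probabilities standing in for the secondary model's predictions.
prob_correct = np.array([0.8, 0.4, 0.55, 0.9])
positions = position(side, prob_correct)
```

The secondary model never changes the direction of a trade. It only filters out low-confidence calls and scales the remaining ones, which is what lets the side and size components be developed separately.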