
The Journal of Financial Data Science: Latest Articles

Managing Editor’s Letter
Pub Date : 2021-10-31 DOI: 10.3905/jfds.2021.3.4.001
F. Fabozzi
Cathy Scott, General Manager and Publisher

Several articles, two of which were published in this journal, have shown how reinforcement learning can be used to take trading costs into account in hedging decisions. In the lead article of this issue, “Deep Hedging of Derivatives Using Reinforcement Learning,” Jay Cao, Jacky Chen, John Hull, and Zissis Poulos extend the standard reinforcement learning approach in two ways: they use multiple Q-functions to widen the range of objective functions that can be used, and they use algorithms that allow the state space and action space to be continuous. The authors suggest an approach in which a relatively simple valuation model is used in conjunction with more complex models for the evolution of the asset price. This allows good hedges to be developed for asset price processes that are not associated with analytic pricing models.

Deep sequence models have been applied to predicting asset returns. These models are flexible enough to capture the high-dimensional, nonlinear, interactive, low signal-to-noise, and dynamic nature of financial data. More specifically, these models can outperform conventionally used models because of their ability to detect path-dependence patterns. Lin William Cong, Ke Tang, Jingyuan Wang, and Yang Zhang, in their article “Deep Sequence Modeling: Development and Applications in Asset Pricing,” show how to predict asset returns and measure risk premiums by applying deep sequence modeling. They begin with an overview of the development of deep sequence models, introduce their applications in asset pricing, and discuss their advantages and limitations. The second part of the article provides a comparative analysis of these methods using data on US equities, in which the authors demonstrate how sequence modeling benefits investors in general by incorporating complex historical path dependence. They report that long short-term memory has the best performance in terms of out-of-sample predictive R-squared, and that long short-term memory with an attention mechanism has the best portfolio performance when microcap stocks are excluded.

In the formulation of an investment process, it is critical to build a view of causal relations among economic entities. Because of the complex and opaque nature of many market interactions, this can be challenging. Various models of economic causality, such as causal networks, have been proposed both to explain the past and to aid investors in the investment process. Such networks provide an efficient framework for assisting with investment decisions that are supported by both quantitative and qualitative evidence. When building causal networks, adding more causes increases computational complexity because the combined impact of larger and larger sets of causes must be calculated. In “Causal Uncertainty in Capital Markets: A Robust Noisy-Or Framework for Portfolio Management,” Joseph
Simonian argues that, among the various approaches to causal networks, the noisy-or model provides a way to compute the total effect of causes in a linear fashion, on the assumption that the causal probability values used by the model builder are fully reliable. To address this uncertainty, Simonian provides a robust, uncertainty-adjusted noisy-or framework that draws on evidence-based subjective logic (that is, a many-valued logic explicitly designed to assess reliability).
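The noisy-or model named in the article title above can be illustrated with a short sketch. It assumes that each active cause independently triggers the effect with its own probability, so the effect fails only if every active cause fails; this needs one parameter per cause instead of a full conditional table. The function name, the `leak` parameter, and the example probabilities are illustrative assumptions and are not taken from the article, and the robust, uncertainty-adjusted extension is not shown.

```python
from math import prod


def noisy_or(cause_probs, active, leak=0.0):
    """Probability of the effect under a noisy-or model.

    cause_probs: probability that each cause, acting alone, triggers the effect.
    active: booleans marking which causes are present.
    leak: probability that the effect occurs with no modeled cause (assumed parameter).
    """
    if len(cause_probs) != len(active):
        raise ValueError("cause_probs and active must have the same length")
    # The effect fails only if the leak and every active cause all fail to trigger it.
    fail = (1.0 - leak) * prod(1.0 - p for p, on in zip(cause_probs, active) if on)
    return 1.0 - fail


# Hypothetical example: three causes, of which the first two are present.
p_effect = noisy_or([0.5, 0.2, 0.9], [True, True, False])
print(round(p_effect, 4))  # 1 - 0.5 * 0.8 = 0.6
```

Adding a further cause adds one factor to the product, which is the linear growth in parameters that makes the model attractive for larger causal networks.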
Citations: 0
On Robustness of Mutual Funds Categorization and Distance Metric Learning
Pub Date : 2021-10-31 DOI: 10.3905/jfds.2021.3.4.130
Dhruv Desai, D. Mehta
Identifying similar mutual funds among a given universe of funds has many applications, including competitor analysis, marketing and sales, tax loss harvesting, and so on. For a contemporary analyst, the most popular approach to finding similar funds is to look up a categorization system such as Morningstar categorization. Morningstar categorization has been heavily investigated by academic researchers from various angles, including with unsupervised clustering techniques, in which clusters were found to be inconsistent with categorization. Recently, however, categorization has been studied using supervised classification techniques, with the categories being the target labels. Categorization was indeed learnable with very high accuracy using a purely data-driven approach, causing a paradox: clustering was inconsistent with categorization, whereas supervised classification was able to reproduce (near) complete categorization. Here, the authors resolve this apparent paradox by pointing out incorrect uses and interpretations of machine learning techniques in the previous academic literature. The authors demonstrate that by using an appropriate list of variables and metrics to identify the optimal number of clusters and preprocessing the data using distance metric learning, one can indeed reproduce the Morningstar categorization using a data-driven approach. The present work puts an end to the debate on this issue and establishes that the Morningstar categorization is indeed intrinsically rigorous, consistent, rule-based, and reproducible using data-driven approaches, if machine learning techniques are correctly implemented.

Key Findings
▪ Academic literature has time and again questioned the consistency and robustness of mutual fund categorization systems, such as Morningstar categorization, by contrasting them with unsupervised clustering of funds.
▪ Here, the authors settle the debate in favor of Morningstar categorization by pointing out the use of incorrect lists of variables and incorrect interpretation of machine learning algorithms in the previous literature, while emphasizing that the main missing piece from the machine learning side in previous research was the appropriate distance metric.
▪ The authors employ a machine learning technique called distance metric learning and reproduce the Morningstar categorization completely using a data-driven approach.
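The role of the distance metric can be illustrated with a deliberately simplified sketch. It learns only a per-feature weight (the inverse of the pooled within-category variance) from labeled funds and then compares nearest neighbors under the plain and the learned distance. This diagonal weighting is an assumption chosen for brevity; it is not the algorithm used by the authors, and the fund features and category labels are invented toy data.

```python
from collections import defaultdict


def learn_diagonal_metric(features, labels):
    """Per-feature weights equal to 1 / pooled within-category variance."""
    n_features = len(features[0])
    groups = defaultdict(list)
    for row, label in zip(features, labels):
        groups[label].append(row)
    pooled = [0.0] * n_features
    for rows in groups.values():
        for j in range(n_features):
            mean_j = sum(r[j] for r in rows) / len(rows)
            pooled[j] += sum((r[j] - mean_j) ** 2 for r in rows)
    dof = len(features) - len(groups)  # degrees of freedom of the pooled estimate
    # Guard against a zero within-category variance with a tiny floor.
    return [dof / max(s, 1e-12) for s in pooled]


def nearest_neighbor(features, query_idx, weights):
    """Index of the closest other fund under a weighted squared Euclidean distance."""
    query = features[query_idx]

    def dist(i):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, query, features[i]))

    return min((i for i in range(len(features)) if i != query_idx), key=dist)


# Toy data: feature 0 varies widely inside each category, feature 1 separates them.
funds = [(0.0, 1.0), (10.0, 1.1), (5.0, 0.9), (0.5, 2.0), (9.0, 2.1), (4.0, 1.9)]
categories = ["A", "A", "A", "B", "B", "B"]

plain = nearest_neighbor(funds, 0, [1.0, 1.0])
learned = nearest_neighbor(funds, 0, learn_diagonal_metric(funds, categories))
print(categories[plain], categories[learned])  # B A
```

Under the unweighted distance the nearest neighbor of the first fund belongs to the other category, while the learned weights down-weight the noisy feature and recover a same-category neighbor. A clustering step run on the reweighted features would inherit the same benefit.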
Citations: 3
Adaptive Seriational Risk Parity and Other Extensions for Heuristic Portfolio Construction Using Machine Learning and Graph Theory
Pub Date : 2021-10-06 DOI: 10.3905/jfds.2021.1.078
Peter Schwendner, Jochen Papenbrock, Markus Jaeger, Stephan Krügel
In this article, the authors present a conceptual framework named adaptive seriational risk parity (ASRP) to extend hierarchical risk parity (HRP) as an asset allocation heuristic. The first step of HRP (quasi-diagonalization), which determines the hierarchy of assets, is required for the actual allocation done in the second step (recursive bisection). In the original HRP scheme, this hierarchy is found using single-linkage hierarchical clustering of the correlation matrix, which is a static tree-based method. The authors compare the performance of the standard HRP with other static and adaptive tree-based methods, as well as seriation-based methods that do not rely on trees. Seriation is a broader concept that allows reordering of the rows or columns of a matrix to best express similarities between the elements. Each discussed variation leads to a different time series reflecting portfolio performance in a 20-year backtest of a multi-asset futures universe. Unsupervised learning based on these time series creates a taxonomy that groups the strategies in high correspondence to the construction hierarchy of the various types of ASRP. Performance analysis of the variations shows that most of the static tree-based alternatives to HRP outperform the single-linkage clustering used in HRP on a risk-adjusted basis. Adaptive tree methods show mixed results, and most generic seriation-based approaches underperform.

Key Findings
▪ The authors introduce the adaptive seriational risk parity (ASRP) framework as a hierarchy of decisions to implement the quasi-diagonalization step of hierarchical risk parity (HRP) with seriation-based and tree-based variations as alternatives to single linkage. Tree-based variations are further separated into static and adaptive versions. Altogether, 57 variations are discussed and connected to the literature.
▪ Backtests of the 57 different HRP-type asset allocation variations applied to a multi-asset futures universe lead to a correlation matrix of the resulting 57 portfolio return time series. This portfolio return correlation matrix can be visualized as a dendrogram using single-linkage clustering. The correlation hierarchy reflected by the dendrogram is similar to the construction hierarchy of the quasi-diagonalization step. Most seriation-based strategies seem to underperform HRP on a risk-adjusted basis. Most static tree-based variations outperform HRP, whereas adaptive tree-based methods show mixed results.
▪ The presented variations fit into a triple artificial intelligence approach to connect synthetic data generation with explainable machine learning. This approach generates synthetic market data in the first step. The second step applies an HRP-type portfolio allocation approach as discussed in this article. The third step uses a model-agnostic explanation such as the SHAP framework to explain the resulting performance with features of the synthetic market data and with model selection in the second step.
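The two-step structure described above can be made concrete with a minimal sketch of the second step, recursive bisection, as it is commonly presented for HRP. The asset ordering is taken as given: it stands in for the output of the first step, which is exactly the part that single linkage, other tree-based methods, or seriation would supply. The covariance matrix and the ordering below are invented for illustration, and the code is a sketch under those assumptions rather than the authors' implementation.

```python
def cluster_variance(cov, items):
    """Variance of the inverse-variance-weighted portfolio of the given assets."""
    inv = [1.0 / cov[i][i] for i in items]
    total = sum(inv)
    w = [x / total for x in inv]
    return sum(w[a] * w[b] * cov[i][j]
               for a, i in enumerate(items)
               for b, j in enumerate(items))


def recursive_bisection(cov, order):
    """Split the ordered asset list in halves and allocate by inverse cluster variance."""
    weights = {i: 1.0 for i in order}
    clusters = [list(order)]
    while clusters:
        next_clusters = []
        for cluster in clusters:
            if len(cluster) < 2:
                continue  # single assets need no further splitting
            mid = len(cluster) // 2
            left, right = cluster[:mid], cluster[mid:]
            var_l = cluster_variance(cov, left)
            var_r = cluster_variance(cov, right)
            alpha = 1.0 - var_l / (var_l + var_r)  # share given to the left half
            for i in left:
                weights[i] *= alpha
            for i in right:
                weights[i] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights


# Hypothetical covariance matrix and ordering for four assets.
cov = [
    [0.0400, 0.0180, 0.0020, 0.0010],
    [0.0180, 0.0900, 0.0030, 0.0020],
    [0.0020, 0.0030, 0.0100, 0.0040],
    [0.0010, 0.0020, 0.0040, 0.0225],
]
order = [0, 1, 2, 3]  # assumed output of the quasi-diagonalization step
weights = recursive_bisection(cov, order)
print({i: round(w, 3) for i, w in weights.items()})
```

Because the split always follows the supplied ordering, swapping in a different ordering method changes the allocation without touching this function, which is the degree of freedom the ASRP variations explore.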
Citations: 1
Interpretable, Transparent, and Auditable Machine Learning: An Alternative to Factor Investing
Pub Date : 2021-09-22 DOI: 10.3905/jfds.2021.1.077
Daniel Philps, D. Tilles, Timothy P. Law
Interpretability, transparency, and auditability of machine learning (ML)-driven investment have become key issues for investment managers as many look to enhance or replace traditional factor-based investing. The authors show that symbolic artificial intelligence (SAI) provides a solution to this conundrum, with superior return characteristics compared to traditional factor-based stock selection, while producing interpretable outcomes. Their SAI approach is a form of satisficing that systematically learns investment decision rules (symbols) for stock selection, using an a priori algorithm, avoiding the need for error-prone approaches for secondary explanations (known as XAI). The authors compare the empirical performance of an SAI approach with a traditional factor-based stock selection approach in an emerging market equities universe. They show that SAI generates superior return characteristics and would provide a viable and interpretable alternative to factor-based stock selection. Their approach has significant implications for investment managers, providing an ML alternative to factor investing but with interpretable outcomes that could satisfy internal and external stakeholders.

Key Findings
▪ Symbolic artificial intelligence (SAI) for stock selection, a form of satisficing, provides an alternative to factor investing and overcomes the interpretability issues of many machine learning (ML) approaches.
▪ An SAI that could be applied at scale is shown to produce superior return characteristics to traditional factor-based stock selection.
▪ SAI’s superior stock selection is examined using notional visualizations of its decision boundaries.
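To give a feel for what learning human-readable decision rules can look like, here is a small sketch of level-wise rule mining. It assumes that the "a priori algorithm" mentioned above refers to Apriori-style association rule mining, which the abstract does not spell out. The item names, thresholds, and records are invented, `support` here means the share of stocks that meet the rule's conditions, and none of this is the authors' implementation.

```python
from itertools import combinations


def mine_rules(records, target, min_support=0.3, min_confidence=0.7, max_len=2):
    """Level-wise (Apriori-style) search for rules of the form 'conditions -> target'."""
    n = len(records)
    items = sorted({item for rec in records for item in rec if item != target})
    candidates = [frozenset([i]) for i in items]
    rules = []
    for size in range(1, max_len + 1):
        survivors = []
        for cond in candidates:
            covered = [rec for rec in records if cond <= rec]
            support = len(covered) / n
            if support < min_support:
                continue  # prune: no superset of an infrequent set can be frequent
            survivors.append(cond)
            confidence = sum(target in rec for rec in covered) / len(covered)
            if confidence >= min_confidence:
                rules.append((sorted(cond), round(support, 2), round(confidence, 2)))
        # Candidates for the next level are unions of surviving condition sets.
        candidates = sorted(
            {a | b for a, b in combinations(survivors, 2) if len(a | b) == size + 1},
            key=sorted,
        )
    return rules


# Hypothetical stock records: each set lists the attributes observed for one stock.
records = [
    {"low_pe", "high_roe", "outperform"},
    {"low_pe", "high_roe", "outperform"},
    {"low_pe", "outperform"},
    {"low_pe"},
    {"high_roe"},
    {"high_momentum"},
]
rules = mine_rules(records, target="outperform")
print(rules)
# [(['low_pe'], 0.67, 0.75), (['high_roe', 'low_pe'], 0.33, 1.0)]
```

Each mined rule can be read and audited directly, which is the property the abstract contrasts with secondary explanation methods.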
Citations: 1
Classification Methods for Market Making in Auction Markets
Pub Date : 2021-09-17 DOI: 10.3905/jfds.2021.1.076
Nikolaj Normann Holm, Mansoor Hussain, M. Kulahci
Can machines learn to reliably predict auction outcomes in financial markets? The authors study this question using classification methods from machine learning and auction data from the request-for-quote protocol used in many multi-dealer-to-client markets. Their answer is affirmative. The highest performance is achieved using gradient-boosted decision trees coupled with preprocessing tools to handle class imbalance. Competition level, client identity, and bid–ask quotes are shown to be the most important features. To illustrate the usefulness of these findings, the authors create a profit-maximizing agent to suggest price quotes. Results show more aggressive behavior compared to human dealers.

Key Findings
▪ We propose a machine learning–based approach for determining auction outcomes by exploring the use of classification algorithms for outcome predictions and show that gradient-boosted decision trees obtain the best performance on an industrial data set.
▪ We uncover bid–ask normalized spread levels and competition level as the most important features and evaluate their influence on predictions through Shapley value estimation.
▪ We demonstrate the usefulness of our approach by creating a profit-maximizing agent using a classifier for win probability predictions. Our agent’s behavior is aggressive compared to human dealers.
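The idea of a profit-maximizing agent built on win probability predictions can be sketched as follows. The agent scans candidate quotes and picks the one with the highest expected profit, taken here as the predicted probability of winning the auction times the spread earned if it wins. The logistic `win_probability` function and its coefficients are a hypothetical stand-in for a trained classifier such as the gradient-boosted trees mentioned above; the spread grid and units are also assumptions.

```python
import math


def win_probability(spread, n_competitors, a=2.0, b=4.0, c=0.5):
    """Hypothetical stand-in for a trained classifier's predicted win probability.

    Wider spreads and more competing dealers both lower the chance of winning.
    """
    return 1.0 / (1.0 + math.exp(-(a - b * spread - c * n_competitors)))


def expected_profit(spread, n_competitors):
    """Expected profit of quoting `spread`: P(win) times the spread earned on a win."""
    return win_probability(spread, n_competitors) * spread


def best_quote(n_competitors, candidate_spreads):
    """Candidate spread with the highest expected profit."""
    return max(candidate_spreads, key=lambda s: expected_profit(s, n_competitors))


grid = [i / 100 for i in range(1, 201)]  # spreads from 0.01 to 2.00, arbitrary units
quote_low_competition = best_quote(1, grid)
quote_high_competition = best_quote(5, grid)
print(quote_low_competition, quote_high_competition)
```

With these assumed coefficients the suggested spread tightens as the number of competitors grows, which is the kind of behavior one would compare against human dealers' quotes.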
Citations: 0
Fairness Measures for Machine Learning in Finance
Pub Date : 2021-09-14 DOI: 10.3905/jfds.2021.1.075
Sanjiv Ranjan Das, Michele Donini, J. Gelman, Kevin Haas, Mila Hardt, Jared Katzman, K. Kenthapadi, Pedro Larroy, Pinar Yilmaz, Bilal Zafar
The authors present a machine learning pipeline for fairness-aware machine learning (FAML) in finance that encompasses metrics for fairness (and accuracy). Whereas accuracy metrics are well understood and the principal ones are used frequently, there is no consensus as to which of several available measures for fairness should be used in a generic manner in the financial services industry. The authors explore these measures and discuss which ones to focus on at various stages in the ML pipeline, pre-training and post-training, and they examine simple bias mitigation approaches. Using a standard dataset, they show that the sequencing in their FAML pipeline offers a cogent approach to arriving at a fair and accurate ML model. The authors discuss the intersection of bias metrics with legal considerations in the United States, and the entanglement of explainability and fairness is exemplified in the case study. They discuss possible approaches for training ML models while satisfying constraints imposed from various fairness metrics and the role of causality in assessing fairness.

Key Findings
▪ Sources of bias are presented and a range of metrics is considered for machine learning applications in finance, both pre-training and post-training of models.
▪ A process of using the metrics to arrive at fair models is discussed.
▪ Various considerations for the choice of specific metrics are also analyzed.
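The distinction between pre-training and post-training measures can be sketched with two simple group-level metrics. The same rate comparison is applied first to the observed labels (before any model is trained) and then to a model's predictions. The metric choices, function names, group codes, and data below are illustrative assumptions; the abstract does not state which measures the authors recommend.

```python
def positive_rate(outcomes, groups, group):
    """Share of positive outcomes (1s) among members of one group."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)


def parity_gap(outcomes, groups, advantaged, disadvantaged):
    """Difference in positive-outcome rates between two groups."""
    return (positive_rate(outcomes, groups, advantaged)
            - positive_rate(outcomes, groups, disadvantaged))


def disparate_impact(predictions, groups, advantaged, disadvantaged):
    """Ratio of predicted positive rates, disadvantaged group over advantaged group."""
    return (positive_rate(predictions, groups, disadvantaged)
            / positive_rate(predictions, groups, advantaged))


# Hypothetical loan data: 1 = approved, groups "a" (advantaged) and "d" (disadvantaged).
groups = ["a"] * 5 + ["d"] * 5
labels = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]       # observed outcomes in the training data
predictions = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]  # output of some trained model

pre_gap = parity_gap(labels, groups, "a", "d")        # pre-training: bias in the labels
post_gap = parity_gap(predictions, groups, "a", "d")  # post-training: bias in predictions
di = disparate_impact(predictions, groups, "a", "d")
print(round(pre_gap, 2), round(post_gap, 2), round(di, 2))  # 0.4 0.6 0.25
```

In this toy case the model's predictions show a larger gap than the labels it was trained on, the kind of finding that would motivate a bias mitigation step before deployment.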
Citations: 20
Benchmark Dataset for Short-Term Market Prediction of Limit Order Book in China Markets
Pub Date : 2021-09-05 DOI: 10.3905/jfds.2021.1.074
Charles Huang, Weifeng Ge, Hongsong Chou, Xin Du
Limit order books (LOBs) have generated big financial data for analysis and prediction by both the academic community and industry practitioners. This article presents a benchmark LOB dataset from the Chinese stock market, covering a few thousand stocks for the period of June to September 2020. Experiment protocols are designed for model performance evaluation: at the end of every second, the task is to forecast the upcoming volume-weighted average price change and volume over 12 horizons ranging from 1 second to 300 seconds. Results based on a linear regression model and deep learning models are compared. A practical short-term trading strategy framework based on the alpha signal generated is presented. The data and code are available on GitHub (github.com/HKGSAS).
Key Findings
▪ There is a gap between benchmarking a high-frequency LOB dataset and model for researchers to objectively assess prediction performances, which this article serves to bridge.
▪ A more practically effective set of features is proposed to capture both LOB snapshots and periodic data. The prediction target in the published literature is similarly too simplistic: mid-price direction change for the next few events, which is not suitable for a practical trading strategy. The authors propose to predict the price change and volume magnitude over 12 short-term horizons.
▪ This article proposes comparing the performance of baseline linear regression and state-of-the-art deep learning models, based on both accuracy statistics and trading profits.
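To make the prediction target in the abstract concrete, here is a minimal sketch of how a forward volume-weighted average price (VWAP) change could be computed from per-second bars. The abstract does not give the exact target definition or list the 12 horizons, so the reference price (the bar price at time `t`), the function names, and the horizons used in the demo are assumptions for illustration, not the authors' specification.

```python
from typing import Dict, Optional, Sequence


def future_vwap_change(
    prices: Sequence[float],
    volumes: Sequence[float],
    t: int,
    horizon: int,
    reference: float,
) -> Optional[float]:
    """VWAP over seconds t+1 .. t+horizon minus `reference`.

    `prices[i]` is the traded price for second i and `volumes[i]` the traded
    volume in that second. Returns None when the window runs past the end of
    the data or when no volume traded in the window.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1 second")
    window_prices = prices[t + 1 : t + 1 + horizon]
    window_volumes = volumes[t + 1 : t + 1 + horizon]
    if len(window_prices) < horizon:
        return None  # not enough future data for this horizon
    total_volume = sum(window_volumes)
    if total_volume == 0:
        return None  # VWAP is undefined without trades
    vwap = sum(p * v for p, v in zip(window_prices, window_volumes)) / total_volume
    return vwap - reference


def targets_by_horizon(
    prices: Sequence[float],
    volumes: Sequence[float],
    t: int,
    horizons: Sequence[int],
) -> Dict[int, Optional[float]]:
    """Forward VWAP change at time t for each horizon, relative to prices[t]."""
    return {h: future_vwap_change(prices, volumes, t, h, prices[t]) for h in horizons}


if __name__ == "__main__":
    # Hypothetical per-second bars; the horizons are illustrative only.
    prices = [10.0, 10.2, 10.4, 10.1]
    volumes = [100, 50, 150, 0]
    print(targets_by_horizon(prices, volumes, t=0, horizons=[1, 2, 5]))
```

Returning `None` for undefined targets, rather than zero, keeps missing labels distinguishable from a genuine zero price change when a model is trained on them.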
Citations: 4
Factor Momentum and Regime-Switching Overlay Strategy
Pub Date : 2021-08-31 DOI: 10.3905/jfds.2021.1.072
Junhan Gu, J. Mulvey
Investors are faced with challenges in diversifying risks and protecting capital during crash periods. In this article, the authors incorporate regime information in the portfolio optimization context by identifying regimes for historical time periods using an ℓ1-trend filtering algorithm and exploring different machine learning techniques to forecast the probability of an upcoming stock market crash. They then apply a regime-based asset allocation to a nominal risk parity strategy. Investors can further improve their investment performance by implementing a dollar-neutral factor momentum strategy as an overlay in conjunction with the core portfolio. The authors demonstrate that the time-series factor momentum strategy generates high risk-adjusted returns and exhibits pronounced defensive characteristics during market crashes. A volatility scaling approach is employed to manage the risk and further magnify the benefits of factor momentum. Empirical results suggest that the approach improves risk-adjusted returns by a substantial amount over the benchmark from both the standalone perspective and the contributory perspective.
Key Findings
▪ The authors identify historical regimes with ℓ1-trend filtering and implement a regime-switching risk parity strategy with supervised learning methods to optimize the core portfolio allocation.
▪ By adding a long–short factor momentum strategy on top of the core diversified portfolios, the authors are able to further enhance the portfolio’s risk-adjusted return.
▪ The factor momentum strategy exhibits defensive characteristics during crashes, and its risks can be further managed by scaling the leverage based on the realized volatility.
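The volatility scaling idea mentioned in the abstract, adjusting leverage according to realized volatility, can be sketched generically as follows. This is not the authors' implementation: the trailing-window estimator, the target volatility, the leverage cap, and all names are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def realized_volatility(returns: Sequence[float]) -> float:
    """Sample standard deviation of a window of returns."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns to estimate volatility")
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(variance)


def volatility_scaled_returns(
    returns: Sequence[float],
    window: int,
    target_vol: float,
    max_leverage: float,
) -> List[float]:
    """Scale each return by target_vol / trailing realized vol, capped at max_leverage.

    Only returns strictly before time t are used to set the leverage applied
    at time t, so the scaling uses no look-ahead information.
    """
    scaled = []
    for t in range(window, len(returns)):
        vol = realized_volatility(returns[t - window : t])
        # With zero measured volatility the ratio is undefined; fall back to the cap.
        leverage = max_leverage if vol == 0 else min(target_vol / vol, max_leverage)
        scaled.append(leverage * returns[t])
    return scaled


if __name__ == "__main__":
    # Hypothetical strategy returns per period.
    strategy_returns = [0.01, -0.01, 0.02, -0.02, 0.01]
    print(volatility_scaled_returns(strategy_returns, window=2,
                                    target_vol=0.01, max_leverage=2.0))
```

The cap on leverage is a practical safeguard: without it, a quiet window would translate into an arbitrarily large position.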
Citations: 2
Managing Editor’s Letter
Pub Date : 2021-07-31 DOI: 10.3905/jfds.2021.3.3.001
F. Fabozzi
In asset management, alternative data are diverse nontraditional datasets utilized by quantitative and fundamental institutional investors that are expected to enhance portfolio returns. In the opening article, “Alternative Data in Investment Management: Usage, Challenges, and Valuation,” Gene Ekster and Petter N. Kolm elaborate on what alternative data are, how they are used in asset management, key challenges that arise when working with alternative data, and how to assess the value of alternative databases. The key challenges include entity mapping, ticker-tagging, panel stabilization, and debiasing with modern statistical and machine learning approaches. Several methodologies are described for assessing the value of alternative datasets, including an event study methodology (which Ekster and Kolm refer to as the “golden triangle”), the application of report cards, and the relationship between a dataset’s structure of information content and its potential to enhance investment returns. The effectiveness of these methods is illustrated using a case study. In “Fairness Measures for Machine Learning in Finance,” the team of Sanjiv Das, Michele Donini, Jason Gelman, Kevin Haas, Mila Hardt, Jared Katzman, Krishnaram Kenthapadi, Pedro Larroy, Pinar Yilmaz, and Muhammad Bilal Zafar proposes a machine learning (ML) pipeline for fairness-aware machine learning (FAML) in finance that encompasses metrics for fairness (and accuracy). Various considerations for the choice of specific metrics are also analyzed. The authors discuss which of these measures to focus on at various stages in the ML pipeline, pre-training and post-training, and examine simple bias mitigation approaches.
Using a standard dataset, they show that the sequencing in […] of satisficing that systematically learns investment decision rules (symbols) for stock selection, provides a solution for dealing with these important issues while providing superior return characteristics compared to traditional factor-based stock selection and allowing for interpretable outcomes. Empirically comparing the performance of the proposed SAI approach with a traditional factor-based stock selection approach for an emerging market equities universe, the authors show that SAI generates superior return characteristics while providing a viable and interpretable alternative to factor-based stock selection. Their approach has significant implications for investment managers, providing an ML alternative to factor investing but with interpretable outcomes that could satisfy internal and external stakeholders.
Citations: 0
Machine Learning for Active Portfolio Management
Pub Date : 2021-07-31 DOI: 10.3905/jfds.2021.1.071
Söhnke M. Bartram, J. Branke, Giuliano De Rossi, Mehrshad Motahari
Machine learning (ML) methods are attracting considerable attention among academics in the field of finance. However, it is commonly believed that ML has not transformed the asset management industry to the same extent as other sectors. This survey focuses on the ML methods and empirical results available in the literature that matter most for active portfolio management. ML has asset management applications for signal generation, portfolio construction, and trade execution, and promising findings have been reported. Reinforcement learning (RL), in particular, is expected to play a more significant role in the industry. Nevertheless, the performance of a sample of active exchange-traded funds (ETFs) that use ML in their investments tends to be mixed. Overall, ML techniques show great promise for active portfolio management, but investors should be cautioned against their main potential pitfalls.
TOPICS: Big data/machine learning, portfolio construction, exchange-traded funds and applications, performance measurement
Key Findings
▪ Machine learning (ML) methods have several advantages that can lead to successful applications in active portfolio management, including the ability to capture nonlinear patterns and a focus on prediction through ensemble learning.
▪ ML methods can be applied to different steps of the investment process, including signal generation, portfolio construction, and trade execution, with reinforcement learning expected to play a more significant role in the industry.
▪ Empirically, the investment performance of ML-based active exchange-traded funds is mixed.
Citations: 8