
The Journal of Financial Data Science: Latest Articles

The Benefit of Narratives for Prediction of the S&P 500 Index
Pub Date : 2022-09-16 DOI: 10.3905/jfds.2022.1.107
Pascal Blanqué, M. Slimane, Amina Cherief, Théo Le Guenedal, Takaya Sekine, Lauren Stagnol
In this article, the authors show that variables from the Global Database of Events, Language, and Tone convey significant information that can improve on a purely macroeconomic approach when modeling the US equity market. Based on these metrics, the authors construct time series that represent and measure how some narratives that appear to be battling each other are changing in the current market environment. Specifically, the authors appraise the strength of the roaring 20s, back to the 70s, secular stagnation, and monetary economic narratives, but they also add topical societal narratives related to environmental or social aspects and a geopolitical risk narrative. The authors formalize an information content framework and show that including quantitative signals that translate into qualitative stories brings added value when determining the stock market’s movement. Indeed, in addition to having higher explanatory power from their underlying variables, narratives can improve the diversification of standard macroeconomic models. As such, the authors’ results advocate a close monitoring of narratives in financial markets.
Citations: 1
Deep Learning Meets Statistical Arbitrage: An Application of Long Short-Term Memory Networks to Algorithmic Trading
Pub Date : 2022-09-06 DOI: 10.3905/jfds.2022.1.103
Yijun Zhao, Sheng Xu, Jacek Ossowski
In this article, the authors study the utility of deep-learning approaches in statistical arbitrage under the generalized pairs-trading paradigm. Stock returns are regressed on a set of risk factors derived using principal component analysis, and the long short-term memory (LSTM) structure is employed to forecast directions of idiosyncratic residuals. Daily market-neutral trades are constructed based on the predicted signals. The authors compare their results with the influential relative value (RV) model by Avellaneda and Lee (2010) on the universe of S&P 500 Index (S&P 500) stocks. Model evaluations are performed on two distinct periods (2001–2007 and 2015–2021) to alleviate the survivorship bias resulting from the S&P 500 composition changes over time and to study the robustness of these two models in two distinct eras. Their findings suggest that the LSTM model consistently and significantly outperforms the RV model across the two periods when transaction costs are accounted for. However, in the transaction cost–free world, the outperformance is modest even though it is still consistent.
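The first step described in the abstract above, regressing stock returns on PCA-derived risk factors to obtain idiosyncratic residuals, can be sketched roughly as follows. This is only an illustration on simulated returns: the function name `pca_residuals`, the choice of three factors, and the use of plain OLS with an intercept are assumptions, not details taken from the article, and the LSTM forecasting step is left out entirely.

```python
import numpy as np

def pca_residuals(returns, n_factors):
    """Regress each stock's returns on PCA-derived factors and return the residuals.

    returns: array of shape (n_days, n_stocks). Hypothetical helper for illustration.
    """
    # Demean each stock's return series before extracting principal directions.
    centered = returns - returns.mean(axis=0)
    # Rows of vt are the principal directions in stock space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_factors].T                # (n_stocks, n_factors)
    factors = centered @ loadings              # (n_days, n_factors) factor returns
    # OLS of every stock on an intercept plus the factor returns.
    design = np.column_stack([np.ones(len(factors)), factors])
    betas, *_ = np.linalg.lstsq(design, returns, rcond=None)
    residuals = returns - design @ betas       # idiosyncratic component
    return factors, residuals

# Simulated daily returns stand in for real data (an assumption).
rng = np.random.default_rng(0)
simulated_returns = rng.normal(0.0, 0.01, size=(250, 20))
factors, residuals = pca_residuals(simulated_returns, n_factors=3)
```

In a setup like the one the abstract describes, a sequence model would then be trained to predict the direction of each residual series; that part depends on modeling choices the abstract does not specify.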
Citations: 1
Machine Learning–Based Systematic Investing in Agency Mortgage-Backed Securities
Pub Date : 2022-09-02 DOI: 10.3905/jfds.2022.1.102
Nikhil Arvind Jagannathan, Qiulei (Leo) Bao
With a total outstanding balance of more than $8 trillion as of this writing, agency mortgage-backed securities (MBS) represent the second largest segment of the US bond market and the second most liquid fixed-income market after US Treasuries. Institutional investors have long participated in this market to take advantage of its attractive spread over US Treasuries, low credit risk, low transaction cost, and the ability to transact large quantities with ease. MBS are made of individual mortgages extended to US homeowners. The ability for a homeowner to refinance at any point introduces complexity in prepayment analysis and investing in the MBS sector. Traditional prepayment modeling has been able to capture many of the relationships between prepayments and related factors such as the level of interest rates and the value of the embedded prepayment option, yet the manual nature of variable construction and sheer amount of available data make it difficult to capture the dynamics of extremely complex systems. The long history and large amount of data available in MBS make it a prime candidate to leverage machine learning (ML) algorithms to better explain complex relationships between various macro- and microeconomic factors and MBS prepayments. The authors propose a systematic investment strategy using an ML-based mortgage prepayment model approach combined with a coupon allocation optimization model to create an optimal portfolio to capture alpha vs. a benchmark.
Citations: 0
Managing Editor’s Letter
Pub Date : 2022-07-31 DOI: 10.3905/jfds.2022.4.3.001
F. Fabozzi
Citations: 0
Machine Learning for Econometricians: The Readme Manual
Pub Date : 2022-07-09 DOI: 10.3905/jfds.2022.1.101
Marcos Lopez de Prado
One of the most exciting recent developments in financial research is the availability of new administrative, private sector, and micro-level datasets that did not exist a few years ago. The unstructured nature of many of these observations, along with the complexity of the phenomena they measure, means that many of these datasets are beyond the grasp of econometric analysis. Machine learning (ML) techniques offer the numerical power and functional flexibility needed to identify complex patterns in a high-dimensional space. ML is often perceived as a black box, however, in contrast to the transparency of econometric approaches. In this article, the author demonstrates that each analytical step of the econometric process has a homologous step in ML analyses. By clearly stating this correspondence, the author’s goal is to facilitate and reconcile the adoption of ML techniques among econometricians.
Citations: 1
Machine Learning in Behavioral Finance: A Systematic Literature Review
Pub Date : 2022-07-06 DOI: 10.3905/jfds.2022.1.100
S. N. Hojaji, M. Yahyazadehfar, B. Abedin
This article endeavors to investigate the application of machine learning in behavioral economics and behavioral finance to represent a profile of studies conducted in this field. To accomplish this task, 90 scientific studies were systematically extracted between 2000 and June 1, 2020. Utilizing the text analysis techniques and related statistical methods, the abstracts of the extracted studies were reviewed and analyzed. First, it was found that attention to this field has developed in recent years with an accelerating trend. Second, it was demonstrated that specialized journals have also bestowed more curiosity in these studies than in the past by publishing more relevant studies. Third, results revealed that machine learning has been applied in areas such as investor sentiment, decision making, consumer behavior, trading strategies, game theory, and other areas in the field of behavioral economics and behavioral finance. In this regard, the application of machine learning has included techniques such as support vector machine, regression, neural networks, random forest, and so on. Despite the expanding consideration adjusted to this field by researchers and specialized journals, there are still many research gaps in this field. Accordingly, there is a relatively significant distance until fully unleashing the superior powers of machine learning, like prediction and classification in behavioral economics and behavioral finance. Finally, this research completed its mission by suggesting implications for the future of this field based on the acquired outcomes.
Citations: 0
Meta-Labeling: Theory and Framework
Pub Date : 2022-06-23 DOI: 10.3905/jfds.2022.1.098
J. Joubert
Meta-labeling is a machine learning (ML) layer that sits on top of a base primary strategy to help size positions, filter out false-positive signals, and improve metrics such as the Sharpe ratio and maximum drawdown. This article consolidates the knowledge of several publications into a single work, providing practitioners with a clear framework to support the application of meta-labeling to investment strategies. The relationships between binary classification metrics and strategy performance are explained, alongside answers to many frequently asked questions regarding the technique. The author also deconstructs meta-labeling into three components, using a controlled experiment to show how each component helps to improve strategy metrics and what types of features should be considered in the model specification phase.
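The idea of a secondary ML layer that filters and sizes the bets of a primary strategy can be illustrated with a small sketch. Everything here is assumed for illustration: the simulated data, the single feature, the hand-written logistic regression standing in for the secondary model, and the 0.5 probability cutoff. It is not the experimental setup from the article.

```python
import numpy as np

def make_meta_labels(primary_side, forward_returns):
    """Label each primary bet 1 if it turned out profitable, 0 otherwise."""
    return ((primary_side * forward_returns) > 0).astype(float)

def fit_logistic(features, labels, learning_rate=0.1, n_steps=2000):
    """Plain gradient-descent logistic regression (stand-in secondary model)."""
    design = np.column_stack([np.ones(len(features)), features])
    weights = np.zeros(design.shape[1])
    for _ in range(n_steps):
        probabilities = 1.0 / (1.0 + np.exp(-(design @ weights)))
        weights -= learning_rate * design.T @ (probabilities - labels) / len(labels)
    return weights

def predict_probability(weights, features):
    """Estimated probability that the primary bet is correct."""
    design = np.column_stack([np.ones(len(features)), features])
    return 1.0 / (1.0 + np.exp(-(design @ weights)))

# Simulated data (an assumption): the primary signal tends to be right
# when the feature is high and wrong when it is low.
rng = np.random.default_rng(42)
n_obs = 1000
feature = rng.normal(size=n_obs)
primary_side = rng.choice([-1, 0, 1], size=n_obs)   # short, no bet, long
forward_returns = primary_side * 0.01 * feature + rng.normal(scale=0.005, size=n_obs)

# Train the secondary model only on observations where the primary model bet.
active = primary_side != 0
labels = make_meta_labels(primary_side[active], forward_returns[active])
weights = fit_logistic(feature[active].reshape(-1, 1), labels)

# Filter out likely false positives and size the remaining bets by confidence.
probability = predict_probability(weights, feature.reshape(-1, 1))
positions = np.where(probability > 0.5, primary_side * probability, 0.0)
```

The primary model still decides the side of each trade; the secondary model only decides whether to act and how large the position should be. A real application would fit and evaluate the secondary model on separate samples rather than in-sample as done here.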
Citations: 0
On the Black–Litterman Model: Learning to Do Better
Pub Date : 2022-06-18 DOI: 10.3905/jfds.2022.1.096
Ren‐Raw Chen, S. Yeh, Xiaohu Zhang
In this article, the authors study the performance of the Black–Litterman model (BLM) and compare it to the traditional mean–variance theory (MVT) of Markowitz (1952) and Sharpe (1964). They begin with the standard Bayesian learning on which the BLM is based (but the existing literature does not follow). Then, they perform a series of tests of the BLM using machine learning tools and view specifications consistent with the existing literature. Their empirical evidence (which uses 30 years of monthly data from January 1991 till December 2020) suggests that the BLM is highly sensitive to the specification of the view. Given that the view is arbitrary (even though in our article, they are rule based), it is quite a challenge to use the BLM in an actual situation. A great amount of caution must be exercised in specifying the view and its corresponding required return. This validates the previous result that BLM specification of views is very important and there is no consistent manner how one can specify a winning portfolio.
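To make the sensitivity to view specification concrete, the sketch below computes the textbook Black–Litterman posterior mean, $\mu_{BL} = [(\tau\Sigma)^{-1} + P^{\top}\Omega^{-1}P]^{-1}[(\tau\Sigma)^{-1}\pi + P^{\top}\Omega^{-1}q]$, for a few different view returns. This is the commonly cited form and may differ from the Bayesian formulation the authors actually use; the three-asset covariance, prior returns, the value of `tau`, and the single relative view are all made-up numbers.

```python
import numpy as np

def black_litterman_mean(prior_mean, covariance, pick_matrix, view_returns,
                         view_cov, tau=0.05):
    """Textbook Black-Litterman posterior mean of expected returns."""
    prior_precision = np.linalg.inv(tau * covariance)
    view_precision = np.linalg.inv(view_cov)
    posterior_cov = np.linalg.inv(
        prior_precision + pick_matrix.T @ view_precision @ pick_matrix
    )
    return posterior_cov @ (
        prior_precision @ prior_mean + pick_matrix.T @ view_precision @ view_returns
    )

# Hypothetical three-asset example.
covariance = np.array([[0.04, 0.01, 0.00],
                       [0.01, 0.09, 0.02],
                       [0.00, 0.02, 0.16]])
prior_mean = np.array([0.05, 0.07, 0.09])
# One relative view: asset 0 outperforms asset 1 by view_returns[0].
pick_matrix = np.array([[1.0, -1.0, 0.0]])
view_cov = np.array([[1e-4]])

# Vary the required return of the view and watch the posterior move.
for required_return in (0.00, 0.02, 0.04):
    posterior = black_litterman_mean(
        prior_mean, covariance, pick_matrix,
        np.array([required_return]), view_cov,
    )
    print(required_return, np.round(posterior, 4))
```

Two limiting cases are a useful sanity check: as the view variance grows very large the posterior returns to the prior, and as it shrinks toward zero the posterior satisfies the view almost exactly.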
Citations: 1
Health State Risk Categorization: A Machine Learning Clustering Approach Using Health and Retirement Study Data
Pub Date : 2022-04-30 DOI: 10.3905/jfds.2022.4.2.139
F. Tan, D. Mehta
For countries such as the United States, which lacks a universal health care system, future health care costs can create significant uncertainty that a retirement investment strategy must be built to manage. One of the most important factors determining health care costs is the individual’s health status. Hence, categorizing individuals into meaningful health risk types is an essential task. The conventional approach is to use individuals’ self-rated health state categorization. In this work, the authors provide an objective and data-driven machine learning (ML)–based approach to categorize health state risk by using the most widely used US household surveys on older Americans, the Health and Retirement Study (HRS). The authors propose an approach of employing the K-modes clustering method to algorithmically cluster on an exhaustive list of categorical health-related variables in the HRS. The resulting clusters are shown to provide an objective, interpretable, and practical health state risk categorization. The authors then compare and contrast the ML-based and self-rated health state categorizations and discuss the implications of the differences. They also illustrate the difficulty in predicting out-of-pocket costs based on self-rated health status and how ML-based categorizations can generate more-accurate health care cost estimates for personalized retirement planning. The results in this article open different avenues of research, including behavioral science analysis for health and retirement study.
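K-modes is the categorical counterpart of K-means: distance is the number of mismatched attributes, and each cluster center is the per-column mode of its members. A minimal sketch follows, under stated assumptions: the eight rows of integer-coded answers are invented rather than HRS data, and the deterministic farthest-first initialization is a simplification chosen so the example is reproducible, not the initialization used in the article.

```python
import numpy as np

def column_modes(rows):
    """Most frequent category in each column of a 2-D array."""
    modes = []
    for column in rows.T:
        values, counts = np.unique(column, return_counts=True)
        modes.append(values[np.argmax(counts)])
    return np.array(modes)

def hamming_to_centroids(data, centroids):
    """Number of mismatched attributes between every row and every centroid."""
    return (data[:, None, :] != centroids[None, :, :]).sum(axis=2)

def k_modes(data, n_clusters, n_iter=20):
    """Simple K-modes with farthest-first initialization (an assumption)."""
    centroids = [data[0]]
    while len(centroids) < n_clusters:
        nearest = np.min([(data != c).sum(axis=1) for c in centroids], axis=0)
        centroids.append(data[np.argmax(nearest)])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        assignments = hamming_to_centroids(data, centroids).argmin(axis=1)
        new_centroids = centroids.copy()
        for cluster in range(n_clusters):
            members = data[assignments == cluster]
            if len(members) > 0:               # keep the old mode if a cluster empties
                new_centroids[cluster] = column_modes(members)
        if np.array_equal(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment against the converged (or last) centroids.
    assignments = hamming_to_centroids(data, centroids).argmin(axis=1)
    return assignments, centroids

# Hypothetical integer-coded answers to four yes/no health questions.
survey = np.array([
    [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 0],
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 1, 1],
])
assignments, centroids = k_modes(survey, n_clusters=2)
```

Because each centroid is itself a vector of categories, a cluster can be read directly as a typical respondent profile, which is one way to understand the interpretability the abstract mentions.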
Citations: 2
Managing Editor’s Letter
Pub Date : 2022-04-30 DOI: 10.3905/jfds.2022.4.2.001
F. Fabozzi
Cathy Scott, General Manager and Publisher
The lead article in this issue is by the co-editor of this journal, Marcos López de Prado, “Machine Learning for Econometricians: The Readme Manual.” As he notes, econometric tools are typically applied in investment research despite the fact that they are poorly suited for uncovering statistical patterns in financial data. This is because of the unstructured nature of financial datasets, as well as the complex relationships involved in financial markets. Researchers and analysts working for asset managers overlook these limitations as they take the view that econometric approaches are more appropriate than machine learning methods. One of their objections to using machine learning is that their tools are not transparent (i.e., it is a black box approach to problem solving). López de Prado demonstrates why it is not the case that machine learning is a black box. For each analytical step of the econometric process, he identifies a corresponding step in machine learning analysis. By clearly stating this correspondence, López de Prado has facilitated and reconciled the adoption of machine techniques among econometricians, offering a bridge from classical statistics to machine learning. The process of meta-labeling, introduced by López de Prado, is used as the machine learning layer of an investment strategy that can determine the size of positions, filter out false-positive signals from backtests, and improve performance metrics. In “Meta-Labeling: Theory and Framework,” Jacques Francois Joubert provides an overview of meta-labeling’s theoretical framework (including its architecture and applications). Then the author describes the methodology for three controlled experiments designed to break meta-labeling down into three components: information advantage, modeling for false positives, and position sizing. 
The three experiments validated that meta-labeling not only improves classification metrics but also significantly improves the performance of various types of primary investment strategies. Because of this attribute of meta-labeling, this article provides a good case study of how machine learning can be applied in financial markets. Studies have shown that security prices are driven by information beyond the financial information reported by companies in their filings with the Securities and Exchange Commission. This information includes news and investor-based sentiment. In “FinEAS: Financial Embedding Analysis of Sentiment,” a new language representation model for sentiment analysis of financial text called “financial embedding analysis of sentiment” (FinEAS) is introduced by Asier Gutiérrez-Fandiño, Petter N. Kolm, Miquel Noguer i Alonso, and Jordi Armengol-Estapé. Their approach is based on transformer language models that are explicitly developed for sentence-level analysis which builds on Sentence-BERT, a sentence-level extension of vanilla BERT. The authors argue that the new approach generates se
这期的主要文章是由本刊的联合编辑Marcos López de Prado撰写的“计量经济学家的机器学习:自述手册”。正如他所指出的,计量经济学工具通常应用于投资研究,尽管它们不太适合揭示金融数据中的统计模式。这是因为金融数据集的非结构化性质,以及金融市场中涉及的复杂关系。为资产管理公司工作的研究人员和分析师忽视了这些限制,因为他们认为计量经济学方法比机器学习方法更合适。他们反对使用机器学习的原因之一是他们的工具不透明(即,它是解决问题的黑箱方法)。López de Prado展示了为什么机器学习不是一个黑盒子。对于计量经济学过程的每个分析步骤,他确定了机器学习分析中的相应步骤。通过清楚地说明这种对应关系,López de Prado促进和协调了计量经济学家对机器技术的采用,提供了从经典统计学到机器学习的桥梁。由López de Prado引入的元标记过程被用作投资策略的机器学习层,可以确定头寸的大小,过滤回测中的假阳性信号,并提高绩效指标。在“元标签:理论和框架”一文中,Jacques Francois Joubert概述了元标签的理论框架(包括其架构和应用)。然后,作者描述了三个对照实验的方法,旨在将元标签分解为三个组成部分:信息优势、假阳性建模和位置大小。三个实验验证了元标记不仅提高了分类指标,而且显著提高了各类主要投资策略的绩效。由于元标签的这种属性,本文提供了一个很好的案例研究,说明机器学习如何应用于金融市场。研究表明,证券价格是由公司在提交给美国证券交易委员会(Securities and Exchange Commission)的文件中报告的财务信息以外的信息驱动的。这些信息包括新闻和投资者情绪。在“FinEAS:情感的金融嵌入分析”一文中,Asier Gutiérrez-Fandiño、Petter N. Kolm、Miquel Noguer i Alonso和Jordi armengol - estapeer提出了一种新的金融文本情感分析的语言表示模型“情感的金融嵌入分析”(FinEAS)。他们的方法基于为句子级分析而明确开发的转换语言模型,该模型建立在Sentence-BERT (vanilla BERT的句子级扩展)之上。作者认为,新方法生成的句子嵌入质量更高,可以显著提高句子/文档级别的任务,如金融情绪分析。使用来自RavenPack的大规模金融新闻数据集,作者证明,对于金融情绪分析,新模型优于几个最先进的模型。作者公开了模型代码。深度强化学习(DRL)已经引起了实践者的极大兴趣。然而,它的应用一直受到从业者需要的限制
Citations: 0