Pooling and winsorizing machine learning forecasts to predict stock returns with high-dimensional data

IF 2.1; Zone 2, Economics; Q2 BUSINESS, FINANCE; Journal of Empirical Finance, Volume 79, Article 101538; Pub Date: 2024-09-02; DOI: 10.1016/j.jempfin.2024.101538
Erik Mekelburg, Jack Strauss
Citations: 0

Abstract

Pooling and winsorizing machine learning forecasts to predict stock returns with high-dimensional data

We evaluate US market return predictability using a novel data set of several hundred aggregated firm-level characteristics. We apply LASSO, Elastic Net, Random Forest, Neural Net, Extreme Gradient Boosting, and Light Gradient Boosting Machine methods and find these models experience large prediction errors that lead to forecast failures. However, winsorizing and pooling machine learning model forecasts provides consistent out-of-sample predictability. To assess robustness, we apply machine learning methods to high-dimensional data for Canada, China, Germany and the UK as well as the Goyal–Welch data. All machine learning models we consider, except for the ensemble pooled methods, fail to significantly predict returns across our samples, highlighting the importance of pooling, evaluating additional economies, and the fragility of individual machine learning methods. Our results shed light on the sparsity versus density debate as the degree of sparsity and variable importance evolves over time.
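
The abstract's central idea, winsorizing individual model forecasts and then pooling them, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the fixed clipping bounds of ±5%, the equal-weight average used for pooling, the constant benchmark forecast, the toy numbers, and the helper names `winsorize`, `pool`, and `oos_r2`. The abstract does not state how the authors choose thresholds or pooling weights.

```python
import numpy as np

def winsorize(forecasts, lower, upper):
    """Clip every forecast to [lower, upper] (bounds are an assumed choice)."""
    return np.clip(forecasts, lower, upper)

def pool(forecasts):
    """Equal-weight average across models (rows = periods, columns = models)."""
    return forecasts.mean(axis=1)

def oos_r2(actual, forecast, benchmark):
    """Out-of-sample R^2 of a forecast relative to a benchmark forecast."""
    sse_model = np.sum((actual - forecast) ** 2)
    sse_bench = np.sum((actual - benchmark) ** 2)
    return 1.0 - sse_model / sse_bench

# Toy data (hypothetical): 4 periods, 3 models; the third model
# occasionally produces extreme forecasts.
actual = np.array([0.01, -0.02, 0.03, 0.00])
model_forecasts = np.array([
    [0.02, 0.01, 0.50],
    [-0.01, -0.03, -0.40],
    [0.02, 0.04, 0.03],
    [0.00, 0.01, -0.01],
])
benchmark = np.full(4, 0.005)  # stand-in for a historical-mean forecast

pooled_raw = pool(model_forecasts)
pooled_wins = pool(winsorize(model_forecasts, -0.05, 0.05))

print("OOS R^2, pooled only:          ", oos_r2(actual, pooled_raw, benchmark))
print("OOS R^2, winsorized then pooled:", oos_r2(actual, pooled_wins, benchmark))
```

In this toy setup the two extreme forecasts dominate the unclipped average, so the pooled forecast without winsorizing does worse than the benchmark, while the winsorized pool does better. In a real out-of-sample exercise the clipping bounds would need to be set using only information available at the time of each forecast.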

Source journal
CiteScore: 3.40
Self-citation rate: 3.80%
Articles published: 59
About the journal: The Journal of Empirical Finance is a financial economics journal whose aim is to publish high quality articles in empirical finance. Empirical finance is interpreted broadly to include any type of empirical work in financial economics, financial econometrics, and also theoretical work with clear empirical implications, even when there is no empirical analysis. The Journal welcomes articles in all fields of finance, such as asset pricing, corporate finance, financial econometrics, banking, international finance, microstructure, behavioural finance, etc. The Editorial Team is willing to take risks on innovative research, controversial papers, and unusual approaches. We are also particularly interested in work produced by young scholars. The composition of the editorial board reflects such goals.