Forecasting Performance of Machine Learning, Time Series and Hybrid Methods for Low and High Frequency Time Series

Statistica Neerlandica (IF 1.4; Zone 3, Mathematics; JCR Q2, Statistics & Probability) | Pub Date: 2023-11-03 | DOI: 10.1111/stan.12326

Ozancan Ozdemir, Ceylan Yozgatlıgil


Citations: 0

Abstract


Forecasting is one of the main objectives of time series analysis, and both machine learning and statistical methods have been proposed for it in the literature. In this study, we compare the forecasting performance of several of these approaches. Alongside traditional forecasting methods (the Naive and Seasonal Naive methods, S/ARIMA, Exponential Smoothing, TBATS, Bayesian Exponential Smoothing Models with Trend Modifications, and STL Decomposition), forecasts are obtained using seven machine learning methods (Random Forest, Support Vector Regression, XGBoost, BNN, RNN, LSTM, and FFNN) and hybrids that combine statistical time series and machine learning methods. The series are sampled proportionally from the different time domains of the M4 Competition data set. We thereby aim to build a forecasting guide that accounts for different preprocessing approaches, methods, and data sets with various time domains. After the experiment, the performance and impact of all methods are discussed. Most of the best-performing models turn out to be machine learning methods. Moreover, forecasting performance depends on both the time frequency of the series and the forecast horizon. Lastly, the study suggests that the hybrid approach is not always the best model for forecasting. The study thus offers guidelines on which method performs better at different time series frequencies.
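To make the two simplest benchmarks named in the abstract concrete, here is a minimal Python sketch of the Naive and Seasonal Naive forecasts. It is an illustration only, not the authors' code. The toy series, the seasonal period of 4, and the function names are assumptions. The accuracy measure, sMAPE, is one of the measures used in the M4 Competition; the abstract does not say which measures the study itself reports.

```python
from typing import List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Naive method: repeat the last observed value over the whole horizon."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(
    history: Sequence[float], horizon: int, period: int
) -> List[float]:
    """Seasonal Naive method: repeat the last full seasonal cycle."""
    last_cycle = list(history[-period:])
    return [last_cycle[h % period] for h in range(horizon)]


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumes no pair is zero in both series)."""
    terms = [
        2.0 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical toy series with a clear period-4 pattern and a slight trend.
series = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
train, test = series[:8], series[8:]

naive_fc = naive_forecast(train, horizon=len(test))
seasonal_fc = seasonal_naive_forecast(train, horizon=len(test), period=4)

print("Naive sMAPE:         ", round(smape(test, naive_fc), 2))
print("Seasonal Naive sMAPE:", round(smape(test, seasonal_fc), 2))
```

On a strongly seasonal series like this one, the Seasonal Naive forecast tracks the test values more closely than the Naive forecast. A difference of this kind is one reason to evaluate methods separately by series frequency.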


Source journal

Statistica Neerlandica (Mathematics: Statistics & Probability)

CiteScore: 2.60

Self-citation rate: 6.70%

Articles published: (not listed)

Review time: >12 weeks

About the journal: Statistica Neerlandica has been the journal of the Netherlands Society for Statistics and Operations Research since 1946. It covers all areas of statistics, from theoretical to applied, with a special emphasis on mathematical statistics, statistics for the behavioural sciences and biostatistics. This wide scope is reflected by the expertise of the journal’s editors representing these areas. The diverse editorial board is committed to a fast and fair reviewing process, and will judge submissions on quality, correctness, relevance and originality. Statistica Neerlandica encourages transparency and reproducibility, and offers online resources to make data, code, simulation results and other additional materials publicly available.