Statistical analysis of extremes can be used to predict the probability of future extreme events, such as large rainfalls or devastating windstorms. The quality of these forecasts can be measured through scoring rules. Locally scale invariant scoring rules give equal importance to the forecasts at different locations regardless of differences in the prediction uncertainty. This is a useful feature when computing average scores but can be an unnecessarily strict requirement when one is mostly concerned with extremes. We propose the concept of local weight-scale invariance, describing scoring rules that fulfil local scale invariance in a certain region of interest, and as a special case, local tail-scale invariance for large events. Moreover, a new version of the weighted continuous ranked probability score (wCRPS), called the scaled wCRPS (swCRPS), that possesses this property is developed and studied. The score is a suitable alternative for scoring extreme value models over areas with a varying scale of extreme events, and we derive explicit formulas of the score for the generalised extreme value distribution. The scoring rules are compared through simulations, and their usage is illustrated by modelling extreme water levels and annual maximum rainfall, and in an application to non-extreme forecasts for the prediction of air pollution.
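The threshold-weighted CRPS underlying this family of scores can be sketched numerically. The snippet below is a minimal illustration, not the paper's swCRPS: it evaluates twCRPS(F, y) = ∫ w(z)(F(z) − 1{y ≤ z})² dz on a grid for a standard normal forecast, with the grid bounds and the indicator tail weight chosen here purely for demonstration.

```python
import numpy as np
from scipy.stats import norm

def weighted_crps(cdf, y, weight=None, lo=-10.0, hi=10.0, n=20001):
    """Numerical threshold-weighted CRPS:
    twCRPS(F, y) = integral of w(z) * (F(z) - 1{y <= z})^2 dz.
    A sketch for illustration; grid bounds are assumptions."""
    z = np.linspace(lo, hi, n)
    w = np.ones_like(z) if weight is None else weight(z)
    integrand = w * (cdf(z) - (z >= y).astype(float)) ** 2
    return float(np.sum(integrand) * (z[1] - z[0]))

# Unweighted CRPS of a standard normal forecast observed at y = 0
crps = weighted_crps(norm.cdf, 0.0)

# Tail-weighted version emphasising large events beyond a threshold r = 1
tw = weighted_crps(norm.cdf, 0.0, weight=lambda z: (z >= 1.0).astype(float))
```

With a constant weight this reduces to the ordinary CRPS, while an indicator weight restricts the score to the upper tail, which is the region-of-interest idea the abstract describes.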
In forecasting, there is a tradeoff between in-sample fit and out-of-sample forecast accuracy. Parsimonious model specifications typically outperform richer specifications out of sample. Consequently, information is often withheld from a forecast to prevent over-fitting the data. We show that one way to exploit this information is through forecast combination. Optimal combination weights in this environment minimize a conditional mean squared error that balances the conditional bias and the conditional variance of the combination. The bias-adjusted conditionally optimal forecast weights are time varying and forward looking. Real-time tests of conditionally optimal combinations of model-based forecasts and surveys of professional forecasters show significant gains in forecast accuracy relative to standard benchmarks for inflation and other macroeconomic variables.
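The bias–variance tradeoff in combination weights can be sketched in a simplified static form. The snippet below is not the paper's time-varying conditional scheme: it computes the classic MSE-optimal weights w = M⁻¹1 / (1′M⁻¹1) with M = Σ + bb′, where the error covariance Σ and bias vector b are assumed toy values.

```python
import numpy as np

def mse_optimal_weights(Sigma, bias):
    """Weights minimizing w' (Sigma + b b') w subject to sum(w) = 1,
    i.e. MSE-optimal combination with a bias penalty.
    A static sketch; the paper's weights are conditional and time varying."""
    M = Sigma + np.outer(bias, bias)
    ones = np.ones(len(bias))
    w = np.linalg.solve(M, ones)
    return w / w.sum()

Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])  # assumed forecast error covariance
bias = np.array([0.1, -0.2])                # assumed conditional biases
w = mse_optimal_weights(Sigma, bias)
```

The lower-variance, lower-bias forecast receives the larger weight, which is the balancing act the abstract refers to.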
We introduce a loss discounting framework for model and forecast combination, which generalises and combines Bayesian model synthesis and generalized Bayes methodologies. We use a loss function to score the performance of different models and introduce a multilevel discounting scheme that allows for a flexible specification of the dynamics of the model weights. This novel and simple model combination approach can be easily applied to large-scale model averaging/selection, handle unusual features such as sudden regime changes and be tailored to different forecasting problems. We compare our method to established and state-of-the-art methods for several macroeconomic forecasting examples. The proposed method offers an attractive, computationally efficient alternative to the benchmark methodologies and often outperforms more complex techniques.
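A one-level version of loss discounting can be sketched as follows. This is an assumption-laden simplification of the multilevel scheme: model weights are proportional to exp of the negative exponentially discounted cumulative loss, with a single hand-picked discount factor and simulated losses.

```python
import numpy as np

def discounted_weights(losses, delta=0.95):
    """Model weights w_k proportional to exp(-sum_s delta^(t-s) * L_{s,k}).
    A single-level sketch; the paper's multilevel scheme lets the
    discounting dynamics themselves be flexibly specified."""
    T, K = losses.shape
    disc = delta ** np.arange(T - 1, -1, -1)  # delta^(t-s), recent losses count most
    score = disc @ losses                     # discounted cumulative loss per model
    w = np.exp(-(score - score.min()))        # stabilised exponential weighting
    return w / w.sum()

rng = np.random.default_rng(0)
losses = rng.random((50, 3))  # simulated per-period losses for 3 models
losses[:, 0] *= 0.5           # model 0 forecasts best on average
w = discounted_weights(losses)
```

A discount factor near one yields slowly moving average-performance weights, while a small factor reacts quickly to recent losses, which is how sudden regime changes can be accommodated.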
Hierarchical forecasting techniques allow for the creation of forecasts that are coherent with respect to a pre-specified hierarchy of the underlying time series. This targets a key problem in e-commerce, where we often find millions of products across many product hierarchies, and forecasts must be made for individual products and product aggregations. However, existing hierarchical forecasting techniques scale poorly when the number of time series increases, which limits their applicability at a scale of millions of products.
In this paper, we propose to learn a coherent forecast for millions of products with a single bottom-level forecast model by using a loss function that directly optimizes the hierarchical product structure. We implement our loss function using sparse linear algebra, such that the number of operations in our loss function scales quadratically rather than cubically with the number of products and levels in the hierarchical structure. The benefit of our sparse hierarchical loss function is that it provides practitioners with a method of producing bottom-level forecasts that are coherent with any chosen cross-sectional or temporal hierarchy. In addition, because our approach removes the post-processing step required by traditional hierarchical forecasting techniques, it reduces both the computational cost of the prediction phase in the forecasting pipeline and its deployment complexity.
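The core mechanism can be sketched with a sparse summing matrix that maps bottom-level series to every node of the hierarchy. The toy hierarchy and plain squared-error aggregation below are assumptions for illustration; the paper's loss and scaling arguments differ in detail.

```python
import numpy as np
from scipy import sparse

# Toy hierarchy (assumed): total -> 2 groups -> 4 products.
# S maps the 4 bottom-level series to all 7 nodes of the hierarchy.
S = sparse.csr_matrix(np.array([
    [1, 1, 1, 1],                                             # total
    [1, 1, 0, 0],                                             # group A
    [0, 0, 1, 1],                                             # group B
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],  # products
], dtype=float))

def hierarchical_sq_loss(y_bottom, yhat_bottom):
    """Squared error summed over every level of the hierarchy, computed
    from bottom-level forecasts alone via the sparse summing matrix.
    A sketch of the idea, not the paper's exact loss."""
    err = S @ (y_bottom - yhat_bottom)  # coherent errors at all 7 nodes
    return float(err @ err)

y = np.array([1.0, 2.0, 3.0, 4.0])
yhat = np.array([1.0, 2.0, 3.0, 5.0])  # only the last product mispredicted
loss = hierarchical_sq_loss(y, yhat)
```

Because the loss is defined on bottom-level forecasts pushed through S, any bottom-level model trained on it produces forecasts that are coherent by construction, with no reconciliation post-processing step.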
In our tests on the public M5 dataset, our sparse hierarchical loss function performs up to 10% better as measured by RMSE and MAE than the baseline loss function. Next, we implement our sparse hierarchical loss function within a gradient boosting-based forecasting model at bol.com, a large European e-commerce platform. At bol.com, each day, a forecast for the weekly demand of every product for the next twelve weeks is required. In this setting, our sparse hierarchical loss improved forecasting performance at the product level by about 2% as measured by RMSE and by about 10% as measured by MAE, compared to the baseline model. Finally, we found an increase in forecasting performance of about 5%–10% (both RMSE and MAE) when evaluating the forecasting performance across the cross-sectional hierarchies we defined. These results demonstrate the usefulness of our sparse hierarchical loss applied to a production forecasting system at a major e-commerce platform.
This paper evaluates the predictive performance of various factor estimation methods in big data. Extensive forecasting experiments are conducted using seven factor estimation methods combined with 13 decision rules for determining the number of factors. The out-of-sample forecasting results show that the first Partial Least Squares factor (1-PLS) tends to be the best-performing method among all the alternatives. This finding holds across many target variables, forecasting horizons, and models. The significant improvement can be explained by the PLS estimation strategy, which extracts factors using their covariance with the target variable. Second, using a consistently estimated number of factors may not necessarily improve forecasting performance. The greatest predictive gains often derive from decision rules that do not consistently estimate the true number of factors.
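The distinguishing feature of 1-PLS can be sketched directly: the first PLS weight vector is proportional to X′y, so the extracted factor maximises covariance with the target rather than the variance of X, as a principal component would. The simulated one-factor panel below is an assumption for illustration only.

```python
import numpy as np

# Simulated panel (assumed): 200 periods, 50 predictors driven by one factor f.
rng = np.random.default_rng(1)
f = rng.standard_normal(200)
X = np.outer(f, rng.standard_normal(50)) + 0.5 * rng.standard_normal((200, 50))
y = 2.0 * f + 0.1 * rng.standard_normal(200)

# Centre the data
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# First PLS weight vector is proportional to X'y, i.e. it targets
# covariance with y (the first PC would instead ignore y entirely).
w = Xc.T @ yc
w /= np.linalg.norm(w)
factor = Xc @ w  # the 1-PLS factor

# Forecast y by OLS on the single extracted factor
beta = (factor @ yc) / (factor @ factor)
yhat = y.mean() + beta * factor
```

In this toy setting the single supervised factor recovers the target-relevant variation, which mirrors the abstract's explanation for why 1-PLS performs well without needing a consistent estimate of the number of factors.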