This paper details the approach of the team Kohrrelation in the 2021 Extreme Value Analysis data challenge, dealing with the prediction of wildfire counts and sizes over the contiguous US. Our approach uses ideas from extreme-value theory in a machine learning context with theoretically justified loss functions for gradient boosting. We devise a spatial cross-validation scheme and show that in our setting it provides a better proxy for test set performance than naive cross-validation. The predictions are benchmarked against boosting approaches with different loss functions, and perform competitively in terms of the score criterion, finally placing second in the competition ranking.
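The abstract does not describe how the spatial cross-validation scheme is built, so the following is only a minimal sketch of one common variant, block cross-validation, in which whole spatial blocks are held out together so that neighbouring grid cells do not end up on both sides of a split. The function name `spatial_block_folds`, the square-block layout, the block size, and the toy coordinates are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign each observation to a fold so that whole spatial blocks stay together.

    coords: list of (lon, lat) pairs, one per observation (hypothetical input).
    block_size: side length of the square blocks, in the same units as coords.
    Returns a list of fold indices, one per observation.
    """
    # Group observation indices by the square block their coordinates fall in.
    blocks = defaultdict(list)
    for i, (lon, lat) in enumerate(coords):
        key = (int(lon // block_size), int(lat // block_size))
        blocks[key].append(i)

    # Shuffle the blocks reproducibly, then deal them out to folds round-robin.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    fold_of = [None] * len(coords)
    for j, key in enumerate(keys):
        for i in blocks[key]:
            fold_of[i] = j % n_folds
    return fold_of


# Toy grid of cell centres at 0.5-degree spacing (illustrative values only).
coords = [(-100 + 0.5 * a, 35 + 0.5 * b) for a in range(12) for b in range(8)]
folds = spatial_block_folds(coords, block_size=2.0, n_folds=4)
```

Under naive cross-validation, adjacent cells with nearly identical covariates can be split between training and validation sets, which tends to make validation scores optimistic; holding out whole blocks is one way to reduce that leakage.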
Confounding variables are a recurrent challenge for causal discovery and inference. In many situations, complex causal mechanisms only manifest themselves in extreme events, or take simpler forms in the extremes. Stimulated by data on extreme river flows and precipitation, we introduce a new causal discovery methodology for heavy-tailed variables that allows the effect of a known potential confounder to be almost entirely removed when the variables have comparable tails, and also decreases it sufficiently to enable correct causal inference when the confounder has a heavier tail. We also introduce a new parametric estimator for the existing causal tail coefficient and a permutation test. Simulations show that the methods work well and the ideas are applied to the motivating dataset.
Supplementary information: The online version contains supplementary material available at 10.1007/s10687-022-00456-4.
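For orientation on the causal discovery abstract above, here is a hedged sketch of a rank-based estimate of a causal tail coefficient. It assumes the "existing causal tail coefficient" is the quantity Γ₁₂ = lim_{u→1} E[F₂(X₂) | F₁(X₁) > u] from the heavy-tailed causal discovery literature; the abstract itself does not give the definition. The code is the simple nonparametric version, not the parametric estimator or the permutation test introduced in the paper, and the simulated data and the choice of k are illustrative assumptions.

```python
import random


def empirical_cdf_values(x):
    """Return rank(x_i) / n for each entry (empirical CDF at the sample points)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    values = [0.0] * n
    for rank, i in enumerate(order, start=1):
        values[i] = rank / n
    return values


def causal_tail_coefficient(x1, x2, k):
    """Average of the empirical CDF of x2 over the k observations with largest x1."""
    f2 = empirical_cdf_values(x2)
    top = sorted(range(len(x1)), key=lambda i: x1[i])[-k:]
    return sum(f2[i] for i in top) / k


# Toy heavy-tailed example in which x1 causes x2 (assumed model, not the paper's data).
rng = random.Random(1)
n, k = 5000, 50
x1 = [rng.paretovariate(2.0) for _ in range(n)]
x2 = [a + rng.paretovariate(2.0) for a in x1]

gamma_12 = causal_tail_coefficient(x1, x2, k)  # extremes of x1 carried into x2
gamma_21 = causal_tail_coefficient(x2, x1, k)  # extremes of x2 only partly due to x1
```

In this toy model an extreme value of x1 always produces a large x2, while an extreme x2 may instead come from the noise term, so `gamma_12` is expected to be close to 1 and `gamma_21` noticeably smaller. That asymmetry is what suggests a causal direction.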
A bivariate extreme-value copula is characterized by its Pickands dependence function, i.e., a convex function defined on the unit interval satisfying boundary conditions. This paper investigates the large-sample behavior of a nonparametric estimator of this function due to Cormier et al. (Extremes 17:633-659, 2014). These authors showed how to construct this estimator through constrained quadratic median B-spline smoothing of pairs of pseudo-observations derived from a random sample. Their estimator is shown here to exist whatever the order of the B-spline basis, and its consistency is established under minimal conditions. The large-sample distribution of this estimator is also determined under the additional assumption that the underlying Pickands dependence function is a B-spline of given order with a known set of knots.
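For reference, the standard representation behind the first sentence of this abstract can be written out as follows. The symbols C, A, u, v and t are notation chosen here. Some authors use the argument ln u / ln(uv) instead, which amounts to replacing A(t) by A(1 - t).

```latex
% Bivariate extreme-value copula in terms of its Pickands dependence function A
C(u,v) = \exp\left\{ \ln(uv)\, A\!\left( \frac{\ln v}{\ln(uv)} \right) \right\},
\qquad (u,v) \in (0,1)^2 ,

% Constraints on A: convexity on [0,1] together with the boundary conditions
A : [0,1] \to [1/2, 1] \ \text{convex}, \qquad
\max(t, 1-t) \le A(t) \le 1 \quad \text{for all } t \in [0,1],

% which in particular force the endpoint values
A(0) = A(1) = 1 .
```

The upper bound A ≡ 1 corresponds to independence, and the lower bound A(t) = max(t, 1 - t) to perfect positive dependence. Any estimator of A, including a B-spline one, has to respect convexity and these bounds in order to define a valid extreme-value copula.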

