In statistical learning, identifying underlying structures of true target functions based on observed data plays a crucial role to facilitate subsequent modeling and analysis. Unlike most of those existing methods that focus on some specific settings under certain model assumptions, a general and novel framework is proposed for recovering the true structures of target functions by using unstructured M-estimation in a reproducing kernel Hilbert space (RKHS) in this paper. This framework is inspired by the fact that gradient functions can be employed as a valid tool to learn underlying structures, including sparse learning, interaction selection and model identification, and it is easy to implement by taking advantage of some nice properties of the RKHS. More importantly, it admits a wide range of loss functions, and thus includes many commonly used methods as special cases, such as mean regression, quantile regression, likelihood-based classification, and margin-based classification, which is also computationally efficient by solving convex optimization tasks. The asymptotic results of the proposed framework are established within a rich family of loss functions without any explicit model specifications. The superior performance of the proposed framework is also demonstrated by a variety of simulated examples and a real case study.
{"title":"Structure learning via unstructured kernel-based M-estimation","authors":"Xin He, Yeheng Ge, Xingdong Feng","doi":"10.1214/23-ejs2153","DOIUrl":"https://doi.org/10.1214/23-ejs2153","url":null,"abstract":"In statistical learning, identifying underlying structures of true target functions based on observed data plays a crucial role to facilitate subsequent modeling and analysis. Unlike most of those existing methods that focus on some specific settings under certain model assumptions, a general and novel framework is proposed for recovering the true structures of target functions by using unstructured M-estimation in a reproducing kernel Hilbert space (RKHS) in this paper. This framework is inspired by the fact that gradient functions can be employed as a valid tool to learn underlying structures, including sparse learning, interaction selection and model identification, and it is easy to implement by taking advantage of some nice properties of the RKHS. More importantly, it admits a wide range of loss functions, and thus includes many commonly used methods as special cases, such as mean regression, quantile regression, likelihood-based classification, and margin-based classification, which is also computationally efficient by solving convex optimization tasks. The asymptotic results of the proposed framework are established within a rich family of loss functions without any explicit model specifications. The superior performance of the proposed framework is also demonstrated by a variety of simulated examples and a real case study.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135958438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01Epub Date: 2023-12-07DOI: 10.1214/23-ejs2188
Tianyu Zhang, Noah Simon
Estimation of a conditional mean (linking a set of features to an outcome of interest) is a fundamental statistical task. While there is an appeal to flexible nonparametric procedures, effective estimation in many classical nonparametric function spaces, e.g., multivariate Sobolev spaces, can be prohibitively difficult - both statistically and computationally - especially when the number of features is large. In this paper, we present some sieve estimators for regression in multivariate product spaces. We take Sobolev-type smoothness spaces as an example, though our general framework can be applied to many reproducing kernel Hilbert spaces. These spaces are more amenable to multivariate regression, and allow us to, inpart, avoid the curse of dimensionality. Our estimator can be easily applied to multivariate nonparametric problems and has appealing statistical and computational properties. Moreover, it can effectively leverage additional structure such as feature sparsity.
{"title":"Regression in tensor product spaces by the method of sieves.","authors":"Tianyu Zhang, Noah Simon","doi":"10.1214/23-ejs2188","DOIUrl":"10.1214/23-ejs2188","url":null,"abstract":"<p><p>Estimation of a conditional mean (linking a set of features to an outcome of interest) is a fundamental statistical task. While there is an appeal to flexible nonparametric procedures, effective estimation in many classical nonparametric function spaces, e.g., multivariate Sobolev spaces, can be prohibitively difficult - both statistically and computationally - especially when the number of features is large. In this paper, we present some sieve estimators for regression in multivariate product spaces. We take Sobolev-type smoothness spaces as an example, though our general framework can be applied to many reproducing kernel Hilbert spaces. These spaces are more amenable to multivariate regression, and allow us to, inpart, avoid the curse of dimensionality. Our estimator can be easily applied to multivariate nonparametric problems and has appealing statistical and computational properties. Moreover, it can effectively leverage additional structure such as feature sparsity.</p>","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":"17 2","pages":"3660-3727"},"PeriodicalIF":1.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11784939/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper addresses the problem of estimating the Hurst exponent of the fractional Brownian motion from continuous time noisy sample. When the Hurst parameter is greater than 3∕4, consistent estimation is possible only if either the length of the observation interval increases to infinity or intensity of the noise decreases to zero. The main result is a proof of the Local Asymptotic Normality (LAN) of the model in these two regimes which reveals the optimal minimax estimation rates.
{"title":"Estimation of the Hurst parameter from continuous noisy data","authors":"Pavel Chigansky, Marina Kleptsyna","doi":"10.1214/23-ejs2156","DOIUrl":"https://doi.org/10.1214/23-ejs2156","url":null,"abstract":"This paper addresses the problem of estimating the Hurst exponent of the fractional Brownian motion from continuous time noisy sample. When the Hurst parameter is greater than 3∕4, consistent estimation is possible only if either the length of the observation interval increases to infinity or intensity of the noise decreases to zero. The main result is a proof of the Local Asymptotic Normality (LAN) of the model in these two regimes which reveals the optimal minimax estimation rates.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135954992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two-phase outcome dependent sampling (ODS) is widely used in many fields, especially when certain covariates are expensive and/or difficult to measure. For two-phase ODS, the conditional maximum likelihood (CML) method is very attractive because it can handle zero Phase 2 selection probabilities and avoids modeling the covariate distribution. However, most existing CML-based methods use only the Phase 2 sample and thus may be less efficient than other methods. We propose a general empirical likelihood method that uses CML augmented with additional information in the whole Phase 1 sample to improve estimation efficiency. The proposed method maintains the ability to handle zero selection probabilities and avoids modeling the covariate distribution, but can lead to substantial efficiency gains over CML in the inexpensive covariates, or in the influential covariate when a surrogate is available, because of an effective use of the Phase 1 data. Simulations and a real data illustration using NHANES data are presented.
{"title":"Improving estimation efficiency for two-phase, outcome-dependent sampling studies","authors":"Menglu Che, Peisong Han, J. Lawless","doi":"10.1214/23-ejs2124","DOIUrl":"https://doi.org/10.1214/23-ejs2124","url":null,"abstract":"Two-phase outcome dependent sampling (ODS) is widely used in many fields, especially when certain covariates are expensive and/or difficult to measure. For two-phase ODS, the conditional maximum likelihood (CML) method is very attractive because it can handle zero Phase 2 selection probabilities and avoids modeling the covariate distribution. However, most existing CML-based methods use only the Phase 2 sample and thus may be less efficient than other methods. We propose a general empirical likelihood method that uses CML augmented with additional information in the whole Phase 1 sample to improve estimation efficiency. The proposed method maintains the ability to handle zero selection probabilities and avoids modeling the covariate distribution, but can lead to substantial efficiency gains over CML in the inexpensive covariates, or in the influential covariate when a surrogate is available, because of an effective use of the Phase 1 data. Simulations and a real data illustration using NHANES data are presented.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45404637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the nonparametric regression problem with multiple predictors and an additive error, where the regression function is assumed to be coordinatewise nondecreasing. We propose a Bayesian approach to make an inference on the multivariate monotone regression function, obtain the posterior contraction rate, and construct a universally consistent Bayesian testing procedure for multivariate monotonicity. To facilitate posterior analysis, we set aside the shape restrictions temporarily, and endow a prior on blockwise constant regression functions with heights independently normally distributed. The unknown variance of the error term is either estimated by the marginal maximum likelihood estimate or is equipped with an inverse-gamma prior. Then the unrestricted block heights are a posteriori also independently normally distributed given the error variance, by conjugacy. To comply with the shape restrictions, we project samples from the unrestricted posterior onto the class of multivariate monotone functions, inducing the"projection-posterior distribution", to be used for making an inference. Under an $mathbb{L}_1$-metric, we show that the projection-posterior based on $n$ independent samples contracts around the true monotone regression function at the optimal rate $n^{-1/(2+d)}$. Then we construct a Bayesian test for multivariate monotonicity based on the posterior probability of a shrinking neighborhood of the class of multivariate monotone functions. We show that the test is universally consistent, that is, the level of the Bayesian test goes to zero, and the power at any fixed alternative goes to one. Moreover, we show that for a smooth alternative function, power goes to one as long as its distance to the class of multivariate monotone functions is at least of the order of the estimation error for a smooth function.
{"title":"Posterior contraction and testing for multivariate isotonic regression","authors":"Kang-Kang Wang, S. Ghosal","doi":"10.1214/23-ejs2115","DOIUrl":"https://doi.org/10.1214/23-ejs2115","url":null,"abstract":"We consider the nonparametric regression problem with multiple predictors and an additive error, where the regression function is assumed to be coordinatewise nondecreasing. We propose a Bayesian approach to make an inference on the multivariate monotone regression function, obtain the posterior contraction rate, and construct a universally consistent Bayesian testing procedure for multivariate monotonicity. To facilitate posterior analysis, we set aside the shape restrictions temporarily, and endow a prior on blockwise constant regression functions with heights independently normally distributed. The unknown variance of the error term is either estimated by the marginal maximum likelihood estimate or is equipped with an inverse-gamma prior. Then the unrestricted block heights are a posteriori also independently normally distributed given the error variance, by conjugacy. To comply with the shape restrictions, we project samples from the unrestricted posterior onto the class of multivariate monotone functions, inducing the\"projection-posterior distribution\", to be used for making an inference. Under an $mathbb{L}_1$-metric, we show that the projection-posterior based on $n$ independent samples contracts around the true monotone regression function at the optimal rate $n^{-1/(2+d)}$. Then we construct a Bayesian test for multivariate monotonicity based on the posterior probability of a shrinking neighborhood of the class of multivariate monotone functions. We show that the test is universally consistent, that is, the level of the Bayesian test goes to zero, and the power at any fixed alternative goes to one. Moreover, we show that for a smooth alternative function, power goes to one as long as its distance to the class of multivariate monotone functions is at least of the order of the estimation error for a smooth function.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48939928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although there is an extensive literature on the eigenvalues of high-dimensional sample covariance matrices, much of it is specialized to independent components (IC) models -- in which observations are represented as linear transformations of random vectors with independent entries. By contrast, less is known in the context of elliptical models, which violate the independence structure of IC models and exhibit quite different statistical phenomena. In particular, very little is known about the scope of bootstrap methods for doing inference with spectral statistics in high-dimensional elliptical models. To fill this gap, we show how a bootstrap approach developed previously for IC models can be extended to handle the different properties of elliptical models. Within this setting, our main theoretical result guarantees that the proposed method consistently approximates the distributions of linear spectral statistics, which play a fundamental role in multivariate analysis. We also provide empirical results showing that the proposed method performs well for a variety of nonlinear spectral statistics.
{"title":"A bootstrap method for spectral statistics in high-dimensional elliptical models","authors":"Si-Ying Wang, Miles E. Lopes","doi":"10.1214/23-ejs2140","DOIUrl":"https://doi.org/10.1214/23-ejs2140","url":null,"abstract":"Although there is an extensive literature on the eigenvalues of high-dimensional sample covariance matrices, much of it is specialized to independent components (IC) models -- in which observations are represented as linear transformations of random vectors with independent entries. By contrast, less is known in the context of elliptical models, which violate the independence structure of IC models and exhibit quite different statistical phenomena. In particular, very little is known about the scope of bootstrap methods for doing inference with spectral statistics in high-dimensional elliptical models. To fill this gap, we show how a bootstrap approach developed previously for IC models can be extended to handle the different properties of elliptical models. Within this setting, our main theoretical result guarantees that the proposed method consistently approximates the distributions of linear spectral statistics, which play a fundamental role in multivariate analysis. We also provide empirical results showing that the proposed method performs well for a variety of nonlinear spectral statistics.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42952281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The training of high-dimensional regression models on comparably sparse data is an important yet complicated topic, especially when there are many more model parameters than observations in the data. From a Bayesian perspective, inference in such cases can be achieved with the help of shrinkage prior distributions, at least for generalized linear models. However, real-world data usually possess multilevel structures, such as repeated measurements or natural groupings of individuals, which existing shrinkage priors are not built to deal with. We generalize and extend one of these priors, the R2D2 prior by Zhang et al. (2020), to linear multilevel models leading to what we call the R2D2M2 prior. The proposed prior enables both local and global shrinkage of the model parameters. It comes with interpretable hyperparameters, which we show to be intrinsically related to vital properties of the prior, such as rates of concentration around the origin, tail behavior, and amount of shrinkage the prior exerts. We offer guidelines on how to select the prior's hyperparameters by deriving shrinkage factors and measuring the effective number of non-zero model coefficients. Hence, the user can readily evaluate and interpret the amount of shrinkage implied by a specific choice of hyperparameters. Finally, we perform extensive experiments on simulated and real data, showing that our inference procedure for the prior is well calibrated, has desirable global and local regularization properties and enables the reliable and interpretable estimation of much more complex Bayesian multilevel models than was previously possible.
{"title":"Intuitive joint priors for Bayesian linear multilevel models: The R2D2M2 prior","authors":"Javier Enrique Aguilar, Paul-Christian Burkner","doi":"10.1214/23-ejs2136","DOIUrl":"https://doi.org/10.1214/23-ejs2136","url":null,"abstract":"The training of high-dimensional regression models on comparably sparse data is an important yet complicated topic, especially when there are many more model parameters than observations in the data. From a Bayesian perspective, inference in such cases can be achieved with the help of shrinkage prior distributions, at least for generalized linear models. However, real-world data usually possess multilevel structures, such as repeated measurements or natural groupings of individuals, which existing shrinkage priors are not built to deal with. We generalize and extend one of these priors, the R2D2 prior by Zhang et al. (2020), to linear multilevel models leading to what we call the R2D2M2 prior. The proposed prior enables both local and global shrinkage of the model parameters. It comes with interpretable hyperparameters, which we show to be intrinsically related to vital properties of the prior, such as rates of concentration around the origin, tail behavior, and amount of shrinkage the prior exerts. We offer guidelines on how to select the prior's hyperparameters by deriving shrinkage factors and measuring the effective number of non-zero model coefficients. Hence, the user can readily evaluate and interpret the amount of shrinkage implied by a specific choice of hyperparameters. Finally, we perform extensive experiments on simulated and real data, showing that our inference procedure for the prior is well calibrated, has desirable global and local regularization properties and enables the reliable and interpretable estimation of much more complex Bayesian multilevel models than was previously possible.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44330955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a new autocorrelation measure for functional time series that we term spherical autocorrelation. It is based on measuring the average angle between lagged pairs of series after having been projected onto the unit sphere. This new measure enjoys several complimentary advantages compared to existing autocorrelation measures for functional data, since it both 1) describes a notion of sign or direction of serial dependence in the series, and 2) is more robust to outliers. The asymptotic properties of estimators of the spherical autocorrelation are established, and are used to construct confidence intervals and portmanteau white noise tests. These confidence intervals and tests are shown to be effective in simulation experiments, and demonstrated in applications to model selection for daily electricity price curves, and measuring the volatility in densely observed asset price data.
{"title":"Functional spherical autocorrelation: A robust estimate of the autocorrelation of a functional time series","authors":"Chi-Kuang Yeh, Gregory Rice, J. Dubin","doi":"10.1214/23-ejs2112","DOIUrl":"https://doi.org/10.1214/23-ejs2112","url":null,"abstract":"We propose a new autocorrelation measure for functional time series that we term spherical autocorrelation. It is based on measuring the average angle between lagged pairs of series after having been projected onto the unit sphere. This new measure enjoys several complimentary advantages compared to existing autocorrelation measures for functional data, since it both 1) describes a notion of sign or direction of serial dependence in the series, and 2) is more robust to outliers. The asymptotic properties of estimators of the spherical autocorrelation are established, and are used to construct confidence intervals and portmanteau white noise tests. These confidence intervals and tests are shown to be effective in simulation experiments, and demonstrated in applications to model selection for daily electricity price curves, and measuring the volatility in densely observed asset price data.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42080559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The field of distribution-free predictive inference provides tools for provably valid prediction without any assumptions on the distribution of the data, which can be paired with any regression algorithm to provide accurate and reliable predictive intervals. The guarantees provided by these methods are typically marginal, meaning that predictive accuracy holds on average over both the training data set and the test point that is queried. However, it may be preferable to obtain a stronger guarantee of training-conditional coverage, which would ensure that most draws of the training data set result in accurate predictive accuracy on future test points. This property is known to hold for the split conformal prediction method. In this work, we examine the training-conditional coverage properties of several other distribution-free predictive inference methods, and find that training-conditional coverage is achieved by some methods but is impossible to guarantee without further assumptions for others.
{"title":"Training-conditional coverage for distribution-free predictive inference","authors":"Michael Bian, R. Barber","doi":"10.1214/23-ejs2145","DOIUrl":"https://doi.org/10.1214/23-ejs2145","url":null,"abstract":"The field of distribution-free predictive inference provides tools for provably valid prediction without any assumptions on the distribution of the data, which can be paired with any regression algorithm to provide accurate and reliable predictive intervals. The guarantees provided by these methods are typically marginal, meaning that predictive accuracy holds on average over both the training data set and the test point that is queried. However, it may be preferable to obtain a stronger guarantee of training-conditional coverage, which would ensure that most draws of the training data set result in accurate predictive accuracy on future test points. This property is known to hold for the split conformal prediction method. In this work, we examine the training-conditional coverage properties of several other distribution-free predictive inference methods, and find that training-conditional coverage is achieved by some methods but is impossible to guarantee without further assumptions for others.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44104591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extreme U-statistics arise when the kernel of a U-statistic has a high degree but depends only on its arguments through a small number of top order statistics. As the kernel degree of the U-statistic grows to infinity with the sample size, estimators built out of such statistics form an intermediate family in between those constructed in the block maxima and peaks-over-threshold frameworks in extreme value analysis. The asymptotic normality of extreme U-statistics based on location-scale invariant kernels is established. Although the asymptotic variance coincides with the one of the H'ajek projection, the proof goes beyond considering the first term in Hoeffding's variance decomposition. We propose a kernel depending on the three highest order statistics leading to a location-scale invariant estimator of the extreme value index resembling the Pickands estimator. This extreme Pickands U-estimator is asymptotically normal and its finite-sample performance is competitive with that of the pseudo-maximum likelihood estimator.
{"title":"Tail inference using extreme U-statistics","authors":"Jochem Oorschot, J. Segers, Chen Zhou","doi":"10.1214/23-ejs2129","DOIUrl":"https://doi.org/10.1214/23-ejs2129","url":null,"abstract":"Extreme U-statistics arise when the kernel of a U-statistic has a high degree but depends only on its arguments through a small number of top order statistics. As the kernel degree of the U-statistic grows to infinity with the sample size, estimators built out of such statistics form an intermediate family in between those constructed in the block maxima and peaks-over-threshold frameworks in extreme value analysis. The asymptotic normality of extreme U-statistics based on location-scale invariant kernels is established. Although the asymptotic variance coincides with the one of the H'ajek projection, the proof goes beyond considering the first term in Hoeffding's variance decomposition. We propose a kernel depending on the three highest order statistics leading to a location-scale invariant estimator of the extreme value index resembling the Pickands estimator. This extreme Pickands U-estimator is asymptotically normal and its finite-sample performance is competitive with that of the pseudo-maximum likelihood estimator.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47501723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}