A statistical learning view of simple Kriging
Emilia Siviero, Emilie Chautru, Stephan Clémençon
Test, published 2023-11-21
Impact factor: 1.2 (JCR Q2, Statistics & Probability)
DOI: https://doi.org/10.1007/s11749-023-00891-w
Citations: 0
Abstract
In the Big Data era, and with the ubiquity of geolocation sensors in particular, massive datasets exhibiting a possibly complex spatial dependence structure are becoming increasingly available. In this context, the standard probabilistic theory of statistical learning does not apply directly, and guarantees on the generalization capacity of predictive rules learned from such data remain to be established. We analyze here the simple Kriging task, the flagship problem in Geostatistics, from a statistical learning perspective, i.e., by carrying out a nonparametric finite-sample predictive analysis. Given \(d\ge 1\) values taken by a realization of a square integrable random field \(X=\{X_s\}_{s\in S}\), \(S\subset \mathbb{R}^2\), with unknown covariance structure, at sites \(s_1,\; \ldots ,\; s_d\) in \(S\), the goal is to predict the unknown values it takes at any other location \(s\in S\) with minimum quadratic risk. The prediction rule is derived from a training spatial dataset: a single realization \(X'\) of \(X\), independent of those to be predicted and observed at \(n\ge 1\) locations \(\sigma_1,\; \ldots ,\; \sigma_n\) in \(S\). Despite the connection of this minimization problem with kernel ridge regression, establishing the generalization capacity of empirical risk minimizers is far from straightforward, because the training data \(X'_{\sigma_1},\; \ldots ,\; X'_{\sigma_n}\) involved in the learning procedure are not independent and identically distributed. In this article, non-asymptotic bounds of order \(O_{\mathbb{P}}(1/\sqrt{n})\) are proved for the excess risk of a plug-in predictive rule mimicking the true minimizer in the case of isotropic stationary Gaussian processes observed at locations forming a regular grid in the learning stage. These theoretical results, as well as the role played by the technical conditions required to establish them, are illustrated by various numerical experiments on simulated data and on real-world datasets, and hopefully pave the way for further developments in statistical learning based on spatial data.
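To make the prediction task in the abstract concrete, the following is a minimal sketch of the simple Kriging predictor for a zero-mean field whose covariance is assumed known. It is not the authors' code: the exponential covariance model, its parameters, the site coordinates and the helper names (`exp_cov`, `pairwise_dist`, `simple_kriging`) are all illustrative assumptions. The plug-in rule studied in the article would replace the known covariance with an estimate computed from the training realization \(X'\).

```python
import numpy as np

def exp_cov(h, variance=1.0, length_scale=0.5):
    """Isotropic stationary covariance as a function of distance h.
    The exponential model and its parameters are assumptions for this sketch."""
    return variance * np.exp(-np.asarray(h) / length_scale)

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a (m, 2) and b (k, 2)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def simple_kriging(sites, values, targets, cov=exp_cov, nugget=1e-10):
    """Simple Kriging for a zero-mean field with known covariance `cov`.
    Returns the predictions and their quadratic risk at each target."""
    d = len(sites)
    # Covariance matrix of the observed values (small nugget for stability)
    K = cov(pairwise_dist(sites, sites)) + nugget * np.eye(d)
    # Covariances between observed sites and target locations, shape (d, m)
    k = cov(pairwise_dist(sites, targets))
    # Kriging weights solve K w = k, one column per target
    weights = np.linalg.solve(K, k)
    pred = weights.T @ values
    # Minimum quadratic risk: Var(X_s) - k^T K^{-1} k
    mse = cov(0.0) - np.sum(k * weights, axis=0)
    return pred, mse

# Hypothetical observation sites s_1, ..., s_d in the unit square (d = 5)
sites = np.array([[0.1, 0.2], [0.4, 0.9], [0.8, 0.3], [0.6, 0.6], [0.2, 0.7]])

# Draw one zero-mean Gaussian realization at the sites via a Cholesky factor
rng = np.random.default_rng(0)
K_sites = exp_cov(pairwise_dist(sites, sites)) + 1e-10 * np.eye(len(sites))
values = np.linalg.cholesky(K_sites) @ rng.standard_normal(len(sites))

# Predict at a new location and, as a sanity check, at an observed site
targets = np.array([[0.5, 0.5], sites[0]])
pred, mse = simple_kriging(sites, values, targets)
```

At an observed site the predictor reproduces the observed value and the quadratic risk is essentially zero, while at a new location the risk lies strictly between zero and the field variance. The excess risk bounded in the article measures how much this risk increases when the covariance is estimated rather than known.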
About the journal:
TEST is an international journal of Statistics and Probability, sponsored by the Spanish Society of Statistics and Operations Research. English is the official language of the journal.
The emphasis of TEST is placed on papers containing original theoretical contributions of direct or potential value in applications. In this respect, the methodological contents are considered to be crucial for the papers published in TEST, but the practical implications of the methodological aspects are also relevant. Original sound manuscripts on either well-established or emerging areas in the scope of the journal are welcome.
One volume is published annually in four issues. In addition to the regular contributions, each issue of TEST contains an invited paper by an internationally recognized statistician on a current, challenging topic, together with discussions.