Robust estimation of a regression function in exponential families
Authors: Yannick Baraud, Juntong Chen
Journal: Journal of Statistical Planning and Inference, volume 233, Article 106167
Publication date: 2024-03-24
Publication type: Journal Article
DOI: 10.1016/j.jspi.2024.106167
URL: https://www.sciencedirect.com/science/article/pii/S0378375824000247
Impact factor: 0.8 (JCR Q3, Statistics & Probability)
Open access: no
Citations: 0
Abstract
We observe $n$ pairs of independent (but not necessarily i.i.d.) random variables $X_1=(W_1,Y_1),\ldots,X_n=(W_n,Y_n)$ and tackle the problem of estimating the conditional distributions $Q_i^{\star}(w_i)$ of $Y_i$ given $W_i=w_i$ for all $i\in\{1,\ldots,n\}$. Even though these might not be true, we base our estimator on the assumptions that the data are i.i.d. and the conditional distributions of $Y_i$ given $W_i=w_i$ belong to a one-parameter exponential family $\overline{Q}$ with parameter space given by an interval $I$. More precisely, we pretend that these conditional distributions take the form $Q_{\theta(w_i)}\in\overline{Q}$ for some $\theta$ that belongs to a VC-class $\overline{\Theta}$ of functions with values in $I$. For each $i\in\{1,\ldots,n\}$, we estimate $Q_i^{\star}(w_i)$ by a distribution of the same form, i.e. $Q_{\hat{\theta}(w_i)}\in\overline{Q}$, where $\hat{\theta}=\hat{\theta}(X_1,\ldots,X_n)$ is a well-chosen estimator with values in $\overline{\Theta}$. We establish non-asymptotic exponential inequalities for the upper deviations of a Hellinger-type distance between the true conditional distributions of the data and the estimated one based on the exponential family $\overline{Q}$ and the class of functions $\overline{\Theta}$ we chose. We show that our estimation strategy is robust to model misspecification, contamination and the presence of outliers. Besides, when the data are truly i.i.d., the exponential family $\overline{Q}$ is suitably parametrized and the conditional distributions $Q_i^{\star}(w_i)$ are of the form $Q_{\theta^{\star}(w_i)}\in\overline{Q}$ for some unknown Hölderian function $\theta^{\star}$ with values in $I$, we prove that the estimator $\hat{\theta}$ of $\theta^{\star}$ is minimax (up to a logarithmic factor). Finally, we provide an algorithm for calculating $\hat{\theta}$ when $\overline{\Theta}$ is a VC-class of functions of low or moderate dimension and we carry out a simulation study to compare its performance to that of the MLE and median-based estimators. The proof of our main result relies on an upper bound, with explicit numerical constants, on the expectation of the supremum of an empirical process over a VC-subgraph class. This bound can be of independent interest.
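To make the kind of loss mentioned in the abstract concrete, here is a minimal sketch under stated assumptions; it is not the authors' estimator or their algorithm. It assumes the Poisson family (parametrized by its mean) as the one-parameter exponential family, the convention $h^2(P,Q)=1-\sum_k\sqrt{p_k q_k}$ for the squared Hellinger distance, and an average of $h^2$ over the design points as one plausible reading of "Hellinger-type distance". The names `hellinger_sq_poisson`, `hellinger_sq_numeric`, `mean_hellinger_sq`, `theta_star` and `theta_hat` are hypothetical.

```python
import math


def hellinger_sq_poisson(lam, mu):
    """Squared Hellinger distance between Poisson(lam) and Poisson(mu).

    Assumed convention: h^2(P, Q) = 1 - sum_k sqrt(p_k * q_k). For two Poisson
    laws this has the closed form 1 - exp(-(sqrt(lam) - sqrt(mu))^2 / 2).
    """
    return 1.0 - math.exp(-0.5 * (math.sqrt(lam) - math.sqrt(mu)) ** 2)


def hellinger_sq_numeric(lam, mu, kmax=200):
    """Same quantity by direct summation over k = 0..kmax (sanity check)."""
    bc = 0.0  # Bhattacharyya coefficient: sum_k sqrt(p_k * q_k)
    for k in range(kmax + 1):
        # log of sqrt(p_k * q_k), computed in log space for stability
        log_term = (
            0.5 * (-(lam + mu) + k * (math.log(lam) + math.log(mu)))
            - math.lgamma(k + 1)
        )
        bc += math.exp(log_term)
    return 1.0 - bc


def mean_hellinger_sq(theta_true, theta_est, ws):
    """Average squared Hellinger distance between Poisson(theta_true(w)) and
    Poisson(theta_est(w)) over the design points ws (an assumed loss)."""
    return sum(
        hellinger_sq_poisson(theta_true(w), theta_est(w)) for w in ws
    ) / len(ws)


# Hypothetical mean functions, for illustration only.
def theta_star(w):
    return 1.0 + 2.0 * w


def theta_hat(w):
    return 1.2 + 1.8 * w


ws = [0.1 * i for i in range(1, 11)]
loss = mean_hellinger_sq(theta_star, theta_hat, ws)
print(f"average squared Hellinger loss: {loss:.6f}")
```

Because this loss is bounded by 1 for any pair of distributions, a few wildly mispredicted points can only contribute a limited amount to the average, which is one intuition for why Hellinger-type losses pair naturally with robustness claims.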
About the journal:
The Journal of Statistical Planning and Inference offers itself as a multifaceted and all-inclusive bridge between classical aspects of statistics and probability, and the emerging interdisciplinary aspects that have the potential to revolutionize the subject. While we maintain our traditional strength in statistical inference, design, classical probability, and large sample methods, we also have a far more inclusive and broadened scope to keep up with the new problems that confront us as statisticians, mathematicians, and scientists.
We publish high-quality articles in all branches of statistics, probability, discrete mathematics, machine learning, and bioinformatics. We also especially welcome well-written and up-to-date review articles on fundamental themes of statistics, probability, machine learning, and general biostatistics. Thoughtful letters to the editors, interesting problems in need of a solution, and short notes carrying an element of elegance or beauty are equally welcome.