On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means
Jules Depersin; Guillaume Lecué
Information and Inference: A Journal of the IMA, vol. 12, no. 2, pp. 814-850. Published 2022-08-01.
DOI: 10.1093/imaiai/iaac026
URL: https://ieeexplore.ieee.org/document/10058610/
Citation count: 0
Abstract
We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the $L_2$ case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median, complementing the $\sqrt{n}$-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix assuming only that a second moment exists, or an estimator of a scatter matrix when a second moment does not exist.
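To make the objects named in the abstract concrete, here is a minimal Python sketch of one plausible reading of a MOM version of the Stahel–Donoho outlyingness: the data are split into blocks, block means are projected onto directions, and the distance between the median of the projected block means and the projected candidate point is normalized by the MAD of those projected block means. The function names `block_means` and `mom_outlyingness`, the use of random unit directions to approximate the supremum over all directions, and the block count are assumptions made for illustration; the exact definitions and constants in the paper may differ.

```python
import numpy as np


def block_means(X, n_blocks):
    """Split the rows of X into n_blocks equal-size blocks and return the block means.

    Rows that do not fit into an equal-size block are dropped (an illustrative choice).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    usable = (n // n_blocks) * n_blocks
    blocks = np.arange(usable).reshape(n_blocks, -1)  # (K, m) row indices
    return X[blocks].mean(axis=1)                     # (K, d) block means


def mom_outlyingness(mu, X, n_blocks, directions):
    """Approximate MOM outlyingness of the point mu.

    For each direction v: |median_k <mean_k, v> - <mu, v>| / MAD_k(<mean_k, v>),
    maximized over the supplied directions (a finite stand-in for the supremum).
    """
    means = block_means(X, n_blocks)                      # (K, d)
    proj = means @ directions.T                           # (K, n_dir)
    center = np.median(proj, axis=0)                      # median of projected block means
    scale = np.median(np.abs(proj - center), axis=0)      # MAD of projected block means
    return np.max(np.abs(center - directions @ np.asarray(mu, dtype=float)) / scale)


# Illustrative data: heavy-tailed sample (Student t, 3 degrees of freedom, mean 0)
# with the first 20 rows replaced by gross outliers.
rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(600, 3))
X[:20] = 1e6

# Random unit directions used to approximate the supremum over the sphere.
V = rng.standard_normal((200, 3))
V /= np.linalg.norm(V, axis=1, keepdims=True)

out_true = mom_outlyingness(np.zeros(3), X, n_blocks=60, directions=V)
out_far = mom_outlyingness(np.full(3, 5.0), X, n_blocks=60, directions=V)
print(out_true, out_far)
```

In this sketch the 20 corrupted rows fall into only 2 of the 60 blocks, so the medians over blocks are unaffected by them, whereas the empirical mean is pulled far from zero. An estimator in the spirit of the abstract would then pick a point with small outlyingness; how that minimization is carried out is not specified here.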