Fast robust location and scatter estimation: a depth-based method
Authors: Maoyu Zhang, Yan Song, Wenlin Dai
Journal: Technometrics (JCR Q1, Statistics & Probability; impact factor 2.3)
Published: 2023-05-13
DOI: 10.1080/00401706.2023.2216246 (https://doi.org/10.1080/00401706.2023.2216246)
Citation count: 0
Abstract
The minimum covariance determinant (MCD) estimator is ubiquitous in multivariate analysis; its critical step is to select a subset of a given size with the lowest sample covariance determinant. The concentration step (C-step) is a common tool for subset-seeking; however, it becomes computationally demanding for high-dimensional data. To alleviate this challenge, we propose a depth-based algorithm, termed FDB, which replaces the optimal subset with the trimmed region induced by statistical depth. We show that the depth-based region is consistent with the MCD-based subset under a specific class of depth notions, for instance, the projection depth. With the two suggested depths, the FDB estimator is not only computationally more efficient but also reaches the same level of robustness as the MCD estimator. Extensive simulation studies are conducted to assess the empirical performance of our estimators. We also validate the computational efficiency and robustness of our estimators on several typical tasks, such as principal component analysis, linear discriminant analysis, image denoising, and outlier detection on real-life datasets. An R package, FDB, and potential extensions are available in the Supplementary Materials.
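The core idea of replacing the MCD subset with a depth-trimmed region can be illustrated with a minimal Python sketch. This is not the authors' FDB implementation or the API of their R package: the function names, the random-direction approximation of projection depth, and the parameters `alpha`, `n_dirs`, and `seed` are all assumptions made for illustration. The sketch also omits the consistency factor and any reweighting step that a full estimator would normally apply to the trimmed covariance.

```python
import numpy as np


def projection_outlyingness(X, n_dirs=500, seed=0):
    """Approximate projection outlyingness using random unit directions.

    Projection depth is 1 / (1 + outlyingness), so a low outlyingness
    corresponds to a high depth. (Hypothetical helper, not from the paper.)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Draw random directions and normalise them to unit length.
    U = rng.standard_normal((n_dirs, p))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    proj = X @ U.T                                # shape (n, n_dirs)
    med = np.median(proj, axis=0)                 # median per direction
    mad = np.median(np.abs(proj - med), axis=0)   # MAD per direction
    mad = np.where(mad > 0, mad, np.finfo(float).eps)
    # Worst-case standardised deviation of each point over all directions.
    return np.max(np.abs(proj - med) / mad, axis=1)


def depth_trimmed_estimate(X, alpha=0.5, n_dirs=500, seed=0):
    """Location and scatter from the h = ceil(alpha * n) deepest points."""
    n = X.shape[0]
    h = int(np.ceil(alpha * n))
    out = projection_outlyingness(X, n_dirs=n_dirs, seed=seed)
    keep = np.argsort(out)[:h]                    # indices of the deepest points
    subset = X[keep]
    location = subset.mean(axis=0)
    scatter = np.cov(subset, rowvar=False)        # raw, uncorrected covariance
    return location, scatter, keep
```

Unlike the C-step, which iterates toward a low-determinant subset, this sketch selects the subset in a single pass by ranking points by depth, which is where the computational saving described above would come from.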
About the journal:
Technometrics is a Journal of Statistics for the Physical, Chemical, and Engineering Sciences, published quarterly by the American Society for Quality and the American Statistical Association. Since its inception in 1959, the mission of Technometrics has been to contribute to the development and use of statistical methods in the physical, chemical, and engineering sciences.