Robust sparse principal component analysis: situation of full sparseness
B. Alkan, I. Ünaldi
Journal of Applied Mathematics Statistics and Informatics, 18(1), pp. 5-20. Published 2022-05-01.
DOI: 10.2478/jamsi-2022-0001 (https://doi.org/10.2478/jamsi-2022-0001)
Impact factor: 0.3. JCR: Q4 (Mathematics, Applied).
Citations: 0
Abstract
Principal Component Analysis (PCA) is the main method of dimension reduction and data processing for high-dimensional datasets, and it is therefore widely used in almost all scientific fields. Because each principal component is a linear combination of all the original variables, interpreting the results is often difficult. The approaches proposed to solve this problem are referred to as Sparse Principal Component Analysis (SPCA). Sparse approaches, however, are not robust in the presence of outliers in the dataset. In this study, the performance of the approach proposed by Croux et al. (2013), which combines the advantageous properties of SPCA and Robust Principal Component Analysis (RPCA), is examined on one real and three artificial datasets in the situation of full sparseness. In light of the findings, robust sparse PCA based on projection pursuit is recommended for analyzing the data. Another important finding is that neither the BIC nor the TPO criterion used to determine lambda is clearly superior to the other. We suggest using whichever of these two criteria gives an optimal result.
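To make the idea of robust sparse PCA by projection pursuit concrete, the following toy sketch searches a grid of unit directions in two dimensions for the one that maximises a scale of the projected data minus lambda times the L1 norm of the loading vector. It is an illustration only, not the algorithm of Croux et al. (2013): the function names `mad` and `pp_direction`, the use of the unscaled median absolute deviation as the robust scale, the exact form of the penalised objective, the one-degree grid, and the symmetric toy data are all assumptions made here.

```python
import math
import statistics


def mad(values):
    """Unscaled median absolute deviation, used here as a simple robust scale."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def pp_direction(points, scale, lam=0.0, steps=180):
    """Grid-search projection pursuit in 2D (illustrative, hypothetical helper).

    Tries unit vectors a = (cos t, sin t) for t on a grid in [0, pi) and
    returns the one maximising  scale(projections) - lam * ||a||_1.
    """
    best_a, best_val = None, -math.inf
    for k in range(steps):
        t = math.pi * k / steps
        a = (math.cos(t), math.sin(t))
        z = [a[0] * x + a[1] * y for x, y in points]
        val = scale(z) - lam * (abs(a[0]) + abs(a[1]))
        if val > best_val:
            best_a, best_val = a, val
    return best_a


# Toy data (assumed): 21 clean points along y = 0.2 * x, plus 4 outliers
# that lie far away in a steeper direction.
clean = [(float(t), 0.2 * t) for t in range(-10, 11)]
outliers = [(8.0, 40.0), (-8.0, -40.0), (7.0, 45.0), (-7.0, -45.0)]
data = clean + outliers

# Classical scale (standard deviation): the direction is pulled toward the outliers.
classical = pp_direction(data, statistics.pstdev)
# Robust scale, no penalty: follows the clean points, but both loadings are nonzero.
robust = pp_direction(data, mad)
# Robust scale with an L1 penalty: the small second loading is shrunk to zero.
robust_sparse = pp_direction(data, mad, lam=2.0)

print("classical PP     :", tuple(round(v, 3) for v in classical))
print("robust PP        :", tuple(round(v, 3) for v in robust))
print("robust sparse PP :", robust_sparse)  # (1.0, 0.0)
```

On this toy data the standard-deviation direction is dominated by the second variable because of the four outliers, the robust direction stays close to the line through the clean points, and the penalised robust direction loads on the first variable only. In practice lambda would be chosen by a criterion such as BIC or TPO, which this sketch does not implement.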