{"title":"Comparison of Correction Factors and Sample Size Required to Test the Equality of the Smallest Eigenvalues in Principal Component Analysis","authors":"Eduard Gañan-Cardenas, J. C. Correa-Morales","doi":"10.15446/RCE.V44N1.83987","DOIUrl":null,"url":null,"abstract":"In the inferential process of Principal Component Analysis (PCA), one of the main challenges for researchers is establishing the correct number of components to represent the sample. For that purpose, heuristic and statistical strategies have been proposed. One statistical approach consists in testing the hypothesis of the equality of the smallest eigenvalues in the covariance or correlation matrix using a Likelihood-Ratio Test (LRT) that follows a χ2 limit distribution. Different correction factors have been proposed to improve the approximation of the sampling distribution of the statistic. We use simulation to study the significance level and power of the test under the use of these different factors and analyze the sample size required for an dequate approximation. The results indicate that for covariance matrix, the factor proposed by Bartlett offers the best balance between the objectives of low probability of Type I Error and high Power.\n \n \n \n \n \n \nIf the correlation matrix is used, the factors W ∗\n \n \n \n \n \n \nand cχ2\n \n \n \n \n \n \nare the most\n \n \n \n \n \n \n \nrecommended. Empirically, we can observe that most factors require sample sizes 10 or 20 times the number of variables if covariance or correlationmatrices, respectively, are implemented.","PeriodicalId":54477,"journal":{"name":"Revista Colombiana De Estadistica","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Revista Colombiana De Estadistica","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15446/RCE.V44N1.83987","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 3
Abstract
In the inferential process of Principal Component Analysis (PCA), one of the main challenges for researchers is establishing the correct number of components to represent the sample. For that purpose, heuristic and statistical strategies have been proposed. One statistical approach consists in testing the hypothesis of the equality of the smallest eigenvalues in the covariance or correlation matrix using a Likelihood-Ratio Test (LRT) that follows a χ2 limit distribution. Different correction factors have been proposed to improve the approximation of the sampling distribution of the statistic. We use simulation to study the significance level and power of the test under the use of these different factors and analyze the sample size required for an dequate approximation. The results indicate that for covariance matrix, the factor proposed by Bartlett offers the best balance between the objectives of low probability of Type I Error and high Power.
If the correlation matrix is used, the factors W ∗
and cχ2
are the most
recommended. Empirically, we can observe that most factors require sample sizes 10 or 20 times the number of variables if covariance or correlationmatrices, respectively, are implemented.
期刊介绍:
The Colombian Journal of Statistics publishes original articles of theoretical, methodological and educational kind in any branch of Statistics. Purely theoretical papers should include illustration of the techniques presented with real data or at least simulation experiments in order to verify the usefulness of the contents presented. Informative articles of high quality methodologies or statistical techniques applied in different fields of knowledge are also considered. Only articles in English language are considered for publication.
The Editorial Committee assumes that the works submitted for evaluation
have not been previously published and are not being given simultaneously for publication elsewhere, and will not be without prior consent of the Committee, unless, as a result of the assessment, decides not publish in the journal. It is further assumed that when the authors deliver a document for publication in the Colombian Journal of Statistics, they know the above conditions and agree with them.