Neyman’s truncation test for two-sample means under high dimensional setting
Authors: Ping Dong, Lu Lin
Journal: Brazilian Journal of Probability and Statistics
Publication date: 2022-03-01
DOI: 10.1214/21-bjps519 (https://doi.org/10.1214/21-bjps519)
Abstract. Multivariate two-sample testing problems often arise in the statistical analysis of scientific data, especially bioinformatics data. To detect components that differ between two mean vectors, the well-known approach is to apply Sum-of-Squares type tests, such as Hotelling’s T²-test. However, such a test is not suitable for high dimensional settings because of a singular covariance matrix and accumulated errors. Many test methods for high dimensional data have since been developed, mainly of two types: Sum-of-Squares type and Max type. Sum-of-Squares type test statistics perform poorly against sparse alternatives, and the Max type test statistic is not powerful enough for non-sparse datasets. In this paper, we propose a Max-Partial-Sum type statistic named Neyman’s Truncation test, which is constructed from maximum partial sums of marginal test statistics. Neyman’s Truncation test has high power against both dense and sparse alternatives. The asymptotic distribution of the test statistic under the null hypothesis is obtained and the power of the test is analyzed. To avoid the slow convergence rate of the asymptotic distribution, we implement our method with Bootstrap procedures. Simulation studies and an analysis of a leukemia dataset are carried out to verify the numerical performance.
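The idea of a maximum over partial sums of marginal statistics, calibrated by a bootstrap, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything specific here is an assumption rather than the authors' procedure: the Welch-type marginal statistics, the standardization of the partial sums of (t_j² − 1) by √(2m), the use of the coordinates in their given order, and the centered resampling bootstrap. The function names are hypothetical.

```python
import numpy as np


def marginal_t_squared(x, y):
    """Squared Welch-type t statistic for each coordinate (column)."""
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    var = x.var(axis=0, ddof=1) / n1 + y.var(axis=0, ddof=1) / n2
    return diff ** 2 / var


def max_partial_sum(t2):
    """Maximum over m of sum_{j<=m} (t_j^2 - 1) / sqrt(2m).

    Assumption: coordinates are taken in the order given; the paper
    may order or standardize the marginal statistics differently.
    """
    m = np.arange(1, t2.size + 1)
    partial = np.cumsum(t2 - 1.0) / np.sqrt(2.0 * m)
    return partial.max()


def bootstrap_pvalue(x, y, n_boot=500, seed=0):
    """Bootstrap p-value for the max-partial-sum statistic (sketch)."""
    rng = np.random.default_rng(seed)
    observed = max_partial_sum(marginal_t_squared(x, y))
    # Center each sample so the resampled populations share a zero mean,
    # which mimics the null hypothesis of equal mean vectors.
    xc, yc = x - x.mean(axis=0), y - y.mean(axis=0)
    n1, n2 = x.shape[0], y.shape[0]
    exceed = 0
    for _ in range(n_boot):
        xb = xc[rng.integers(0, n1, size=n1)]  # resample rows with replacement
        yb = yc[rng.integers(0, n2, size=n2)]
        if max_partial_sum(marginal_t_squared(xb, yb)) >= observed:
            exceed += 1
    # The +1 terms keep the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)
```

Calling `bootstrap_pvalue(x, y)` with two arrays of shape (n1, p) and (n2, p) returns the observed statistic and an estimated p-value. Taking the maximum over m is what lets a statistic of this kind adapt between a few large marginal signals (small m) and many moderate ones (large m).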
About the journal:
The Brazilian Journal of Probability and Statistics aims to publish high quality research papers in applied probability, applied statistics, computational statistics, mathematical statistics, probability theory and stochastic processes.
More specifically, the following types of contributions will be considered:
(i) Original articles dealing with methodological developments, comparison of competing techniques or their computational aspects.
(ii) Original articles developing theoretical results.
(iii) Articles that contain novel applications of existing methodologies to practical problems. For these papers the focus is on the importance and originality of the applied problem, as well as on the application of the best available methodologies to solve it.
(iv) Survey articles containing a thorough coverage of topics of broad interest to probability and statistics. The journal will occasionally publish book reviews, invited papers and essays on the teaching of statistics.