Authors: Bin Du, Xiumin Liu, Junlong Zhao
Journal: Test (JCR Q2, Statistics & Probability; impact factor 1.2)
DOI: 10.1007/s11749-024-00939-5 (https://doi.org/10.1007/s11749-024-00939-5)
Publication date: 2024-07-30
Publication type: Journal Article
Citations: 0
Extended Hotelling \(T^2\) test in distributed frameworks
Abstract
Hypothesis testing for a mean vector is a classical problem in data analysis, but it has been underinvestigated in distributed frameworks, where samples of size n are located on k local sites. This paper focuses on the one-sample mean test, proposing synthesized test statistics with a much lower communication cost than the centralized Hotelling \(T^2\) test. For the homogeneous case, where data on different local sites are independent and identically distributed, the efficiency of our proposed test is comparable to that of the centralized one, and much better than that of the test constructed from the divide-and-conquer method. In addition, three heterogeneous cases are considered, in which the distributions of the data on local sites can differ. Heterogeneous cases are much more challenging because the local sample means and covariance matrices may be inconsistent estimators. We construct communication-efficient testing procedures for heterogeneous cases, and the power of the proposed test statistics is comparable to that of the centralized one under some conditions. Simulation results verify the effectiveness of the proposed testing procedures.
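For orientation only, the sketch below shows the classical centralized one-sample statistic that the abstract uses as its baseline, \(T^2 = n(\bar{x}-\mu_0)^\top S^{-1}(\bar{x}-\mu_0)\), and one simple way to recover the same value from per-site summaries in the homogeneous setting. It is not the authors' synthesized statistic, which the abstract does not specify. The function names (`hotelling_t2`, `local_summary`, `pooled_t2`, `f_statistic`) and the simulated data are assumptions made for illustration.

```python
import numpy as np


def hotelling_t2(x, mu0):
    """Centralized one-sample Hotelling T^2 for H0: mean == mu0."""
    n = x.shape[0]
    diff = x.mean(axis=0) - mu0
    cov = np.cov(x, rowvar=False)  # unbiased sample covariance (divides by n - 1)
    return float(n * diff @ np.linalg.solve(cov, diff))


def local_summary(x_site):
    """Hypothetical per-site message: count, sum, and sum of outer products."""
    return x_site.shape[0], x_site.sum(axis=0), x_site.T @ x_site


def pooled_t2(summaries, mu0):
    """Rebuild the centralized T^2 from the per-site summaries alone."""
    n = sum(s[0] for s in summaries)
    mean = sum(s[1] for s in summaries) / n
    outer = sum(s[2] for s in summaries)
    # Sum of centered outer products equals sum(x x^T) - n * mean mean^T.
    cov = (outer - n * np.outer(mean, mean)) / (n - 1)
    diff = mean - mu0
    return float(n * diff @ np.linalg.solve(cov, diff))


def f_statistic(t2, n, p):
    """Under normality and H0, this scaling of T^2 follows F(p, n - p)."""
    return (n - p) / (p * (n - 1)) * t2


# Simulated homogeneous example: 400 observations in 3 dimensions on 4 sites.
rng = np.random.default_rng(0)
x = rng.normal(size=(400, 3))
mu0 = np.zeros(3)
sites = np.array_split(x, 4)
summaries = [local_summary(s) for s in sites]

t2_central = hotelling_t2(x, mu0)
t2_pooled = pooled_t2(summaries, mu0)
```

In this sketch each site transmits on the order of p^2 numbers rather than its raw observations, and `t2_pooled` agrees with `t2_central` up to floating-point error. The heterogeneous cases described in the abstract would need a different construction, since local means and covariance matrices may then be inconsistent estimators.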
About the journal:
TEST is an international journal of Statistics and Probability, sponsored by the Spanish Society of Statistics and Operations Research. English is the official language of the journal.
The emphasis of TEST is placed on papers containing original theoretical contributions of direct or potential value in applications. In this respect, the methodological contents are considered to be crucial for the papers published in TEST, but the practical implications of the methodological aspects are also relevant. Original sound manuscripts on either well-established or emerging areas in the scope of the journal are welcome.
One volume is published annually in four issues. In addition to the regular contributions, each issue of TEST contains an invited paper, with discussions, by an internationally recognized statistician on a current and challenging topic.