Adaptive Distributed Inference for Multi-source Massive Heterogeneous Data
Authors: Xin Yang, Qi Jing Yan, Mi Xia Wu
Journal: Acta Mathematica Sinica-English Series, Volume 40, Issue 11, pp. 2751-2770
Publication date: 2024-11-15
DOI: 10.1007/s10114-024-2524-4
URL: https://link.springer.com/article/10.1007/s10114-024-2524-4
In this paper, we consider distributed inference for heterogeneous linear models with massive datasets. Noting that heterogeneity may exist not only in the expectations of the subpopulations but also in their variances, we propose the heteroscedasticity-adaptive distributed aggregation (HADA) estimation, which is shown to be communication-efficient and asymptotically optimal under either homoscedasticity or heteroscedasticity. Furthermore, a distributed test for parameter heterogeneity across subpopulations is constructed based on the HADA estimator. The finite-sample performance of the proposed methods is evaluated using simulation studies and the NYC flight data.
About the journal:
Acta Mathematica Sinica, established by the Chinese Mathematical Society in 1936, is the first and the best mathematical journal in China. In 1985, Acta Mathematica Sinica was divided into an English Series and a Chinese Series. The English Series is a monthly journal that publishes significant research papers from all branches of pure and applied mathematics and provides authoritative reviews of current developments in mathematical research. Contributions are invited from researchers all over the world.