Adaptive Distributed Inference for Multi-source Massive Heterogeneous Data

Acta Mathematica Sinica-English Series | IF 0.8 | Zone 3 (Mathematics) | JCR Q2 (Mathematics) | Pub Date: 2024-11-15 | DOI: 10.1007/s10114-024-2524-4
Xin Yang, Qi Jing Yan, Mi Xia Wu
Volume 40, Issue 11, pp. 2751-2770. https://link.springer.com/article/10.1007/s10114-024-2524-4
Citations: 0

Adaptive Distributed Inference for Multi-source Massive Heterogeneous Data

In this paper, we consider the distributed inference for heterogeneous linear models with massive datasets. Noting that heterogeneity may exist not only in the expectations of the subpopulations, but also in their variances, we propose the heteroscedasticity-adaptive distributed aggregation (HADA) estimation, which is shown to be communication-efficient and asymptotically optimal, regardless of homoscedasticity or heteroscedasticity. Furthermore, a distributed test for parameter heterogeneity across subpopulations is constructed based on the HADA estimator. The finite-sample performance of the proposed methods is evaluated using simulation studies and the NYC flight data.
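
As a rough illustration of the general idea, and not the paper's HADA estimator (whose definition is not given on this page), the sketch below pools per-site least-squares summaries with inverse-variance weights. It assumes a deliberately simple model: each site has its own intercept and its own noise variance, while a single slope is shared. The function names `local_summary`, `aggregate_slope`, and `simulate_site`, and all simulation settings, are hypothetical.

```python
import random

def local_summary(x, y):
    """Run on one site: centered sums plus a residual-variance estimate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx                     # local OLS slope
    s2 = (syy - slope * sxy) / (n - 2)    # local residual variance
    return sxx, sxy, s2                   # only three numbers leave the site

def aggregate_slope(summaries):
    """Run on the central node: inverse-variance weighted pooling."""
    num = sum(sxy / s2 for sxx, sxy, s2 in summaries)
    den = sum(sxx / s2 for sxx, sxy, s2 in summaries)
    return num / den

def simulate_site(rng, n, intercept, slope, sigma):
    """Assumed toy data: y = intercept + slope * x + Gaussian noise."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, sigma) for xi in x]
    return x, y

rng = random.Random(0)
# Three sites with different intercepts and noise levels, common slope 2.0.
settings = [(0.0, 0.5), (1.0, 1.0), (-3.0, 3.0)]
sites = [simulate_site(rng, 5000, a, 2.0, s) for a, s in settings]
summaries = [local_summary(x, y) for x, y in sites]
beta_hat = aggregate_slope(summaries)
print(beta_hat)
```

Each site transmits three scalars in a single round, which is the sense in which an aggregation scheme of this kind is communication-efficient. Weighting by the estimated variances lets noisier sites contribute less, and with equal variances the weights cancel and the result reduces to the ordinary pooled slope.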

Source journal
CiteScore: 1.00
Self-citation rate: 0.00%
Articles published: 138
Review time: 14.5 months
Journal description: Acta Mathematica Sinica, established by the Chinese Mathematical Society in 1936, is the first and the best mathematical journal in China. In 1985, Acta Mathematica Sinica was divided into an English Series and a Chinese Series. The English Series is a monthly journal that publishes significant research papers from all branches of pure and applied mathematics. It provides authoritative reviews of current developments in mathematical research. Contributions are invited from researchers all over the world.
Latest articles from this journal
Compactness of Extremals for Trudinger-Moser Functionals on the Unit Ball in ℝ²
On the Centralizers of Rescaling Separating Differentiable Vector Fields
Variable Degeneracy of Planar Graphs without Chorded 6-Cycles
Adaptive Distributed Inference for Multi-source Massive Heterogeneous Data
L² Schrödinger Maximal Estimates Associated with Finite Type Phases in ℝ²