Stabilizing and Accelerating Federated Learning on Heterogeneous Data With Partial Client Participation

Hao Zhang;Chenglin Li;Wenrui Dai;Ziyang Zheng;Junni Zou;Hongkai Xiong
{"title":"Stabilizing and Accelerating Federated Learning on Heterogeneous Data With Partial Client Participation","authors":"Hao Zhang;Chenglin Li;Wenrui Dai;Ziyang Zheng;Junni Zou;Hongkai Xiong","doi":"10.1109/TPAMI.2024.3469188","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) commonly encourages the clients to perform multiple local updates before the global aggregation, thus avoiding frequent model exchanges and relieving the communication bottleneck between the server and clients. Though empirically effective, the negative impact of multiple local updates on the stability of FL is not thoroughly studied, which may result in a globally unstable and slow convergence. Based on sensitivity analysis, we define in this paper a local-update stability index for the general FL, as measured by the maximum inter-client model discrepancy after the multiple local updates that mainly stems from the data heterogeneity. It enables to determine how much the variation of client’s models with multiple local updates may influence the global model, and can also be linked with the convergence and generalization. We theoretically derive the proposed local-update stability for current state-of-the-art FL methods, providing possible insight to understanding their motivation and limitation from a new perspective of stability. For example, naively executing the parallel acceleration locally at clients would harm the local-update stability. Motivated by this, we then propose a novel accelerated yet stabilized FL algorithm (named FedANAG) based on the server- and client-level Nesterov accelerated gradient (NAG). In FedANAG, the global and local momenta are elaborately designed and alternatively updated, while the stability of local update is enhanced with help of the global momentum. We prove the convergence of FedANAG for strongly convex, general convex and non-convex settings. We then conduct evaluations on both the synthetic and real-world datasets to first validate our proposed local-update stability. The results further show that across various data heterogeneity and client participation ratios, FedANAG not only accelerates the global convergence by reducing the required number of communication rounds to a target accuracy, but converges to an eventually higher accuracy.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 1","pages":"67-83"},"PeriodicalIF":18.6000,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10696955/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Federated learning (FL) commonly encourages the clients to perform multiple local updates before the global aggregation, thus avoiding frequent model exchanges and relieving the communication bottleneck between the server and clients. Though empirically effective, the negative impact of multiple local updates on the stability of FL has not been thoroughly studied, and may result in globally unstable and slow convergence. Based on sensitivity analysis, we define in this paper a local-update stability index for general FL, measured by the maximum inter-client model discrepancy after the multiple local updates, which mainly stems from data heterogeneity. It enables us to determine how much the variation of clients' models under multiple local updates may influence the global model, and can also be linked to convergence and generalization. We theoretically derive the proposed local-update stability for current state-of-the-art FL methods, providing possible insights into their motivations and limitations from a new perspective of stability. For example, naively executing parallel acceleration locally at the clients would harm the local-update stability. Motivated by this, we then propose a novel accelerated yet stabilized FL algorithm (named FedANAG) based on server- and client-level Nesterov accelerated gradient (NAG). In FedANAG, the global and local momenta are elaborately designed and alternately updated, while the stability of the local update is enhanced with the help of the global momentum. We prove the convergence of FedANAG in strongly convex, general convex, and non-convex settings. We then conduct evaluations on both synthetic and real-world datasets, first validating our proposed local-update stability index. The results further show that across various levels of data heterogeneity and client participation ratios, FedANAG not only accelerates the global convergence by reducing the number of communication rounds required to reach a target accuracy, but also converges to an eventually higher accuracy.
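To make the stability index concrete, below is a minimal sketch of how the maximum inter-client model discrepancy after the multiple local updates might be computed. The choice of the L2 norm over flattened parameters is an assumption on our part; the paper may define the discrepancy with a different metric or normalization.

```python
import torch

def local_update_stability(client_models):
    """Illustrative local-update stability index: the maximum pairwise
    model discrepancy across clients after their multiple local updates.
    Assumption: discrepancy is the L2 distance between flattened
    parameter vectors; the paper's exact norm may differ."""
    flat = [torch.cat([p.detach().reshape(-1) for p in m.parameters()])
            for m in client_models]
    return max(torch.norm(flat[i] - flat[j]).item()
               for i in range(len(flat))
               for j in range(i + 1, len(flat)))
```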
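The abstract names the ingredients of FedANAG (server- and client-level NAG, with alternately updated global and local momenta) but not its exact update rules. The following is therefore only a plausible sketch under stated assumptions, not the authors' algorithm: initializing each client's local momentum from the global momentum (one natural way the global momentum could stabilize the local updates), treating the averaged client delta as a pseudo-gradient for the server-level NAG step, and all names (`lr`, `local_beta`, `global_beta`, `client.loader`) are our own choices.

```python
import copy
import random
import torch
import torch.nn.functional as F

def fedanag_round(global_model, global_momentum, clients, lr=0.01,
                  local_beta=0.9, global_beta=0.9, local_epochs=5,
                  participation=0.1):
    """One hypothetical FedANAG communication round with partial client
    participation. The concrete update rules below are assumptions."""
    selected = random.sample(clients, max(1, int(participation * len(clients))))
    client_deltas = []
    for client in selected:
        model = copy.deepcopy(global_model)
        # Assumption: the local NAG momentum is seeded from the global
        # momentum, so the global momentum stabilizes the local updates.
        momentum = [gm.clone() for gm in global_momentum]
        for _ in range(local_epochs):
            for x, y in client.loader:  # client.loader: that client's data
                # Client-level Nesterov step: take the gradient at the
                # look-ahead point w - local_beta * m.
                for p, m in zip(model.parameters(), momentum):
                    p.data.sub_(local_beta * m)
                loss = F.cross_entropy(model(x), y)
                grads = torch.autograd.grad(loss, tuple(model.parameters()))
                for p, m, g in zip(model.parameters(), momentum, grads):
                    p.data.add_(local_beta * m)      # undo the look-ahead
                    m.mul_(local_beta).add_(lr * g)  # update local momentum
                    p.data.sub_(m)                   # apply the NAG step
        client_deltas.append([gp.data - p.data
                              for gp, p in zip(global_model.parameters(),
                                               model.parameters())])
    # Server-level NAG: treat the averaged client delta as a pseudo-gradient
    # and apply it together with the look-ahead global momentum.
    for i, (p, gm) in enumerate(zip(global_model.parameters(), global_momentum)):
        avg_delta = torch.stack([d[i] for d in client_deltas]).mean(0)
        gm.mul_(global_beta).add_(avg_delta)
        p.data.sub_(global_beta * gm + avg_delta)
    return global_model, global_momentum
```

A training loop would initialize `global_momentum` as zero tensors matching the model's parameters and call `fedanag_round` once per communication round.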