{"title":"Federated Learning Based on Model Discrepancy and Variance Reduction","authors":"Hao Zhang;Chenglin Li;Wenrui Dai;Junni Zou;Hongkai Xiong","doi":"10.1109/TNNLS.2024.3517658","DOIUrl":null,"url":null,"abstract":"In federated learning (FL), the heterogeneity of data and asynchronous participation of clients have been observed to induce the local client’s model discrepancy with high variance, leading to a slow and unstable convergence globally at the server. In this article, motivated by the usefulness of stale client updates, we first propose a general framework, named FedVR, to address this issue. In FedVR, we design an aggregate of both fresh and stale local model updates without additional communication overhead, which is computed at the server as a control variate to reduce the client variance incurred by data heterogeneity and client sampling. In order to further reduce the model discrepancy between local clients, we therefore propose FedMDVR, which broadcasts the designed control variate to all the active clients to help correct their local update directions toward the global optimum, i.e., stationary point of the global objective function. While in the global update at server, the client variance is also decreased as inherited from the variance reduction nature of FedVR. We theoretically prove the convergence of FedVR and FedMDVR in the general nonconvex settings. Through extensive experimental evaluations on several benchmark datasets, we also demonstrate that our proposed FedVR and FedMDVR not only accelerate the convergence by reducing the number of communication rounds required to achieve a certain target accuracy, but more importantly, can converge to a higher accuracy than the baseline algorithms.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 6","pages":"10407-10421"},"PeriodicalIF":8.9000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10819958/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
In federated learning (FL), data heterogeneity and asynchronous client participation have been observed to induce model discrepancy across local clients with high variance, leading to slow and unstable global convergence at the server. In this article, motivated by the usefulness of stale client updates, we first propose a general framework, named FedVR, to address this issue. In FedVR, we design an aggregate of both fresh and stale local model updates, computed at the server without additional communication overhead, that serves as a control variate to reduce the client variance incurred by data heterogeneity and client sampling. To further reduce the model discrepancy between local clients, we then propose FedMDVR, which broadcasts the designed control variate to all active clients to help correct their local update directions toward the global optimum, i.e., a stationary point of the global objective function. Meanwhile, in the global update at the server, the client variance is also reduced, inherited from the variance-reduction nature of FedVR. We theoretically prove the convergence of FedVR and FedMDVR in general nonconvex settings. Through extensive experimental evaluations on several benchmark datasets, we also demonstrate that the proposed FedVR and FedMDVR not only accelerate convergence by reducing the number of communication rounds required to reach a given target accuracy but, more importantly, converge to higher accuracy than the baseline algorithms.
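To make the control-variate mechanism concrete, below is a minimal Python sketch of the idea the abstract describes: the server keeps each client's last-known (possibly stale) update direction, aggregates fresh and stale directions into a control variate at no extra communication cost, and broadcasts a correction to the sampled clients so their local steps drift less from the global direction. This is an illustration under assumptions, not the authors' published algorithm: the SCAFFOLD-style correction form, the plain mean over last-known directions, the toy quadratic objectives, and all names (`local_update`, `last_dirs`, etc.) are hypothetical choices of ours; the exact FedVR/FedMDVR update rules are given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLIENTS, DIM = 10, 5
# Heterogeneous toy objectives: f_i(x) = 0.5 * ||x - b_i||^2 per client.
targets = rng.normal(size=(NUM_CLIENTS, DIM))

def client_grad(i, x):
    # Gradient of client i's toy local objective.
    return x - targets[i]

def local_update(i, x_global, correction, lr=0.1, steps=5):
    # Client i: a few gradient steps corrected by the broadcast control
    # variate (assumed SCAFFOLD-style form, not the paper's exact rule).
    x = x_global.copy()
    for _ in range(steps):
        x -= lr * (client_grad(i, x) + correction)
    # Average corrected direction over the local steps (gradient scale).
    return (x_global - x) / (lr * steps)

x = np.zeros(DIM)
# Server-side table of each client's last-known update direction; entries
# of unsampled clients stay stale, which is exactly the information the
# abstract says the server can still exploit.
last_dirs = np.zeros((NUM_CLIENTS, DIM))

for rnd in range(60):
    sampled = rng.choice(NUM_CLIENTS, size=3, replace=False)
    # Control variate: aggregate of fresh and stale directions (assumed
    # here to be a plain mean; it needs no extra client communication).
    c_global = last_dirs.mean(axis=0)
    dirs = []
    for i in sampled:
        correction = c_global - last_dirs[i]  # broadcast to active clients
        d = local_update(i, x, correction)
        last_dirs[i] = d - correction  # recover the client's own direction
        dirs.append(d)
    x -= 0.5 * np.mean(dirs, axis=0)  # global step with server step size

# The global optimum of the averaged quadratics is the mean of the targets.
print("distance to optimum:", np.linalg.norm(x - targets.mean(axis=0)))
```

Running the sketch drives `x` toward the average of the clients' optima; the `correction` term is what damps client drift when only a subset of clients participates in each round, which is the variance-reduction effect the abstract attributes to the control variate.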
Journal Description:
IEEE Transactions on Neural Networks and Learning Systems presents scholarly articles on the theory, design, and applications of neural networks and other learning systems, with an emphasis on technical and scientific research in this domain.