{"title":"Federated Learning Based on Model Discrepancy and Variance Reduction","authors":"Hao Zhang;Chenglin Li;Wenrui Dai;Junni Zou;Hongkai Xiong","doi":"10.1109/TNNLS.2024.3517658","DOIUrl":null,"url":null,"abstract":"In federated learning (FL), the heterogeneity of data and asynchronous participation of clients have been observed to induce the local client’s model discrepancy with high variance, leading to a slow and unstable convergence globally at the server. In this article, motivated by the usefulness of stale client updates, we first propose a general framework, named FedVR, to address this issue. In FedVR, we design an aggregate of both fresh and stale local model updates without additional communication overhead, which is computed at the server as a control variate to reduce the client variance incurred by data heterogeneity and client sampling. In order to further reduce the model discrepancy between local clients, we therefore propose FedMDVR, which broadcasts the designed control variate to all the active clients to help correct their local update directions toward the global optimum, i.e., stationary point of the global objective function. While in the global update at server, the client variance is also decreased as inherited from the variance reduction nature of FedVR. We theoretically prove the convergence of FedVR and FedMDVR in the general nonconvex settings. Through extensive experimental evaluations on several benchmark datasets, we also demonstrate that our proposed FedVR and FedMDVR not only accelerate the convergence by reducing the number of communication rounds required to achieve a certain target accuracy, but more importantly, can converge to a higher accuracy than the baseline algorithms.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 6","pages":"10407-10421"},"PeriodicalIF":8.9000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10819958/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
In federated learning (FL), data heterogeneity and asynchronous client participation have been observed to induce model discrepancy across local clients with high variance, leading to slow and unstable global convergence at the server. In this article, motivated by the usefulness of stale client updates, we first propose a general framework, named FedVR, to address this issue. In FedVR, we design an aggregate of both fresh and stale local model updates, computed at the server without additional communication overhead, that serves as a control variate to reduce the client variance incurred by data heterogeneity and client sampling. To further reduce the model discrepancy between local clients, we then propose FedMDVR, which broadcasts the designed control variate to all active clients to help correct their local update directions toward the global optimum, i.e., a stationary point of the global objective function. Meanwhile, in the global update at the server, the client variance is also reduced, inherited from the variance-reduction nature of FedVR. We theoretically prove the convergence of FedVR and FedMDVR in general nonconvex settings. Through extensive experimental evaluations on several benchmark datasets, we also demonstrate that the proposed FedVR and FedMDVR not only accelerate convergence by reducing the number of communication rounds required to reach a given target accuracy but, more importantly, converge to higher accuracy than the baseline algorithms.
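To make the control-variate mechanism concrete, below is a minimal Python sketch of the idea the abstract describes: the server keeps each client's last-known (possibly stale) update direction, aggregates fresh and stale directions into a control variate at no extra communication cost, and broadcasts a correction to the sampled clients so their local steps drift less from the global direction. This is an illustration under assumptions, not the authors' published algorithm: the SCAFFOLD-style correction form, the plain mean over last-known directions, the toy quadratic objectives, and all names (`local_update`, `last_dirs`, etc.) are hypothetical choices of ours; the exact FedVR/FedMDVR update rules are given in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLIENTS, DIM = 10, 5
# Heterogeneous toy objectives: f_i(x) = 0.5 * ||x - b_i||^2 per client.
targets = rng.normal(size=(NUM_CLIENTS, DIM))

def client_grad(i, x):
    # Gradient of client i's toy local objective.
    return x - targets[i]

def local_update(i, x_global, correction, lr=0.1, steps=5):
    # Client i: a few gradient steps corrected by the broadcast control
    # variate (assumed SCAFFOLD-style form, not the paper's exact rule).
    x = x_global.copy()
    for _ in range(steps):
        x -= lr * (client_grad(i, x) + correction)
    # Average corrected direction over the local steps (gradient scale).
    return (x_global - x) / (lr * steps)

x = np.zeros(DIM)
# Server-side table of each client's last-known update direction; entries
# of unsampled clients stay stale, which is exactly the information the
# abstract says the server can still exploit.
last_dirs = np.zeros((NUM_CLIENTS, DIM))

for rnd in range(60):
    sampled = rng.choice(NUM_CLIENTS, size=3, replace=False)
    # Control variate: aggregate of fresh and stale directions (assumed
    # here to be a plain mean; it needs no extra client communication).
    c_global = last_dirs.mean(axis=0)
    dirs = []
    for i in sampled:
        correction = c_global - last_dirs[i]  # broadcast to active clients
        d = local_update(i, x, correction)
        last_dirs[i] = d - correction  # recover the client's own direction
        dirs.append(d)
    x -= 0.5 * np.mean(dirs, axis=0)  # global step with server step size

# The global optimum of the averaged quadratics is the mean of the targets.
print("distance to optimum:", np.linalg.norm(x - targets.mean(axis=0)))
```

Running the sketch drives `x` toward the average of the clients' optima; the `correction` term is what damps client drift when only a subset of clients participates in each round, which is the variance-reduction effect the abstract attributes to the control variate.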
Journal Description:
IEEE Transactions on Neural Networks and Learning Systems presents scholarly articles on the theory, design, and applications of neural networks and other learning systems, with an emphasis on technical and scientific research in this domain.