FedQClip: Accelerating Federated Learning via Quantized Clipped SGD

IF 3.6 | Tier 2 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | IEEE Transactions on Computers | Pub Date: 2024-10-10 | DOI: 10.1109/TC.2024.3477972
Zhihao Qu;Ninghui Jia;Baoliu Ye;Shihong Hu;Song Guo
{"title":"FedQClip: Accelerating Federated Learning via Quantized Clipped SGD","authors":"Zhihao Qu;Ninghui Jia;Baoliu Ye;Shihong Hu;Song Guo","doi":"10.1109/TC.2024.3477972","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) has emerged as a promising technique for collaboratively training machine learning models among multiple participants while preserving privacy-sensitive data. However, the conventional parameter server architecture presents challenges in terms of communication overhead when employing iterative optimization methods such as Stochastic Gradient Descent (SGD). Although communication compression techniques can reduce the traffic cost of FL during each training round, they often lead to degraded convergence rates, mainly due to compression errors and data heterogeneity. To address these issues, this paper presents FedQClip, an innovative approach that combines quantization and Clipped SGD. FedQClip leverages an adaptive step size inversely proportional to the <inline-formula><tex-math>$\\ell_{2}$</tex-math></inline-formula> norm of the gradient, effectively mitigating the negative impacts of quantized errors. Additionally, clipped operations can be applied locally and globally to further expedite training. Theoretical analyses provide evidence that, even under the settings of Non-IID (non-independent and identically distributed) data, FedQClip achieves a convergence rate of <inline-formula><tex-math>$\\mathcal{O}(\\frac{1}{\\sqrt{T}})$</tex-math></inline-formula>, effectively addressing the convergence degradation caused by compression errors. Furthermore, our theoretical analysis highlights the importance of selecting an appropriate number of local updates to enhance the convergence of FL training. Through extensive experiments, we demonstrate that FedQClip outperforms state-of-the-art methods in terms of communication efficiency and convergence rate.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 2","pages":"717-730"},"PeriodicalIF":3.6000,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10713249/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Federated Learning (FL) has emerged as a promising technique for collaboratively training machine learning models among multiple participants while preserving privacy-sensitive data. However, the conventional parameter server architecture incurs substantial communication overhead when iterative optimization methods such as Stochastic Gradient Descent (SGD) are employed. Although communication compression techniques can reduce the traffic cost of FL in each training round, they often degrade convergence rates, mainly due to compression errors and data heterogeneity. To address these issues, this paper presents FedQClip, an approach that combines quantization with Clipped SGD. FedQClip uses an adaptive step size inversely proportional to the $\ell_{2}$ norm of the gradient, effectively mitigating the negative impact of quantization errors. Additionally, clipping can be applied both locally and globally to further expedite training. Theoretical analyses show that, even under Non-IID (non-independent and identically distributed) data, FedQClip achieves a convergence rate of $\mathcal{O}(\frac{1}{\sqrt{T}})$, effectively addressing the convergence degradation caused by compression errors. Furthermore, the analysis highlights the importance of selecting an appropriate number of local updates to improve the convergence of FL training. Extensive experiments demonstrate that FedQClip outperforms state-of-the-art methods in communication efficiency and convergence rate.
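For intuition, the sketch below illustrates the general idea the abstract describes: each client runs a few clipped local SGD steps whose effective step size is capped at $\gamma / \|g\|_2$ (so it becomes inversely proportional to the gradient's $\ell_2$ norm when the gradient is large), uploads a stochastically quantized model delta, and the server applies a globally clipped update to the averaged deltas. This is a minimal sketch under assumed names and a generic uniform quantizer; it is not the paper's actual FedQClip algorithm, quantizer, or hyperparameter schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_quantize(v, levels=16):
    # Unbiased stochastic quantization onto `levels` uniform levels scaled by the
    # vector's max magnitude -- a common compressor in communication-efficient FL.
    scale = np.max(np.abs(v)) + 1e-12
    normalized = np.abs(v) / scale * (levels - 1)
    lower = np.floor(normalized)
    quantized = lower + (rng.random(v.shape) < (normalized - lower))
    return np.sign(v) * quantized / (levels - 1) * scale

def clipped_step_size(grad, eta, gamma):
    # Effective step size min(eta, gamma / ||grad||_2): for large gradients it is
    # inversely proportional to the l2 norm, which bounds the size of each update.
    return min(eta, gamma / (np.linalg.norm(grad) + 1e-12))

def local_round(w_global, grad_fn, local_steps, eta, gamma):
    # Client side: several locally clipped SGD steps, then a quantized model delta
    # is uploaded instead of the full-precision update.
    w = w_global.copy()
    for _ in range(local_steps):
        g = grad_fn(w)
        w = w - clipped_step_size(g, eta, gamma) * g
    return stochastic_quantize(w - w_global)

def server_round(w_global, client_deltas, eta_g, gamma_g):
    # Server side: average the quantized client deltas and apply a globally
    # clipped step, mirroring the "clipping applied locally and globally" idea.
    avg = np.mean(client_deltas, axis=0)
    return w_global + clipped_step_size(avg, eta_g, gamma_g) * avg
```

The sketch only conveys the interplay of clipping (bounding each local and global step) with quantization (compressing the uploaded deltas); the paper's convergence guarantees rely on its specific choices, which are not reproduced here.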
Source journal: IEEE Transactions on Computers (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 6.60
Self-citation rate: 5.40%
Articles per year: 199
Review time: 6.0 months
Journal description: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.
Latest articles in this journal:
2024 Reviewers List
Shared Recurrence Floating-Point Divide/Sqrt and Integer Divide/Remainder With Early Termination
A System-Level Test Methodology for Communication Peripherals in System-on-Chips
Stream: Design Space Exploration of Layer-Fused DNNs on Heterogeneous Dataflow Accelerators
Balancing Privacy and Accuracy Using Significant Gradient Protection in Federated Learning