{"title":"Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning","authors":"Zichen Tang, Junlin Huang, Rudan Yan, Yuxin Wang, Zhenheng Tang, Shaohuai Shi, Amelie Chi Zhou, Xiaowen Chu","doi":"arxiv-2408.14736","DOIUrl":null,"url":null,"abstract":"Current data compression methods, such as sparsification in Federated\nAveraging (FedAvg), effectively enhance the communication efficiency of\nFederated Learning (FL). However, these methods encounter challenges such as\nthe straggler problem and diminished model performance due to heterogeneous\nbandwidth and non-IID (Independently and Identically Distributed) data. To\naddress these issues, we introduce a bandwidth-aware compression framework for\nFL, aimed at improving communication efficiency while mitigating the problems\nassociated with non-IID data. First, our strategy dynamically adjusts\ncompression ratios according to bandwidth, enabling clients to upload their\nmodels at a close pace, thus exploiting the otherwise wasted time to transmit\nmore data. Second, we identify the non-overlapped pattern of retained\nparameters after compression, which results in diminished client update signals\ndue to uniformly averaged weights. Based on this finding, we propose a\nparameter mask to adjust the client-averaging coefficients at the parameter\nlevel, thereby more closely approximating the original updates, and improving\nthe training convergence under heterogeneous environments. Our evaluations\nreveal that our method significantly boosts model accuracy, with a maximum\nimprovement of 13% over the uncompressed FedAvg. Moreover, it achieves a\n$3.37\\times$ speedup in reaching the target accuracy compared to FedAvg with a\nTop-K compressor, demonstrating its effectiveness in accelerating convergence\nwith compression. The integration of common compression techniques into our\nframework further establishes its potential as a versatile foundation for\nfuture cross-device, communication-efficient FL research, addressing critical\nchallenges in FL and advancing the field of distributed machine learning.","PeriodicalId":501422,"journal":{"name":"arXiv - CS - Distributed, Parallel, and Cluster Computing","volume":"25 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Distributed, Parallel, and Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.14736","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Current data compression methods, such as sparsification in Federated
Averaging (FedAvg), effectively enhance the communication efficiency of
Federated Learning (FL). However, these methods encounter challenges such as
the straggler problem and diminished model performance due to heterogeneous
bandwidth and non-IID (not independently and identically distributed) data. To
address these issues, we introduce a bandwidth-aware compression framework for
FL, aimed at improving communication efficiency while mitigating the problems
associated with non-IID data. First, our strategy dynamically adjusts
compression ratios according to each client's bandwidth, enabling clients to
upload their models at a similar pace and exploiting otherwise idle waiting
time to transmit more data. Second, we observe that the parameters retained
after compression rarely overlap across clients, so uniform averaging dilutes
each client's update signal. Based on this finding, we propose a parameter
mask that adjusts the client-averaging coefficients at the parameter level,
more closely approximating the original updates and improving training
convergence in heterogeneous environments (both mechanisms are sketched
below). Our evaluations
reveal that our method significantly boosts model accuracy, improving it by
up to 13% over uncompressed FedAvg. Moreover, it achieves a
$3.37\times$ speedup in reaching the target accuracy compared to FedAvg with a
Top-K compressor, demonstrating its effectiveness in accelerating convergence
with compression. The integration of common compression techniques into our
framework further establishes its potential as a versatile foundation for
future cross-device, communication-efficient FL research, addressing critical
challenges in FL and advancing the field of distributed machine learning.
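To make the first mechanism concrete, here is a minimal sketch of bandwidth-aware ratio selection, assuming clients apply top-k sparsification and the goal is to equalize upload time across clients. The function name `bandwidth_aware_ratios` and the `base_ratio` floor for the slowest client are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bandwidth_aware_ratios(bandwidths_mbps, model_bits, base_ratio=0.01):
    """Illustrative sketch (not the paper's code): scale each client's top-k
    keep-ratio with its uplink bandwidth so that all compressed uploads finish
    in roughly the same wall-clock time as the slowest client's upload."""
    bandwidths = np.asarray(bandwidths_mbps, dtype=float)
    slowest = bandwidths.min()
    # Time the slowest client needs to upload its compressed model (seconds).
    t_slowest = base_ratio * model_bits / (slowest * 1e6)
    # Faster clients fill that same time window with more parameters,
    # capped at the full (uncompressed) model.
    return np.minimum(1.0, bandwidths * 1e6 * t_slowest / model_bits)

# Example: four clients with heterogeneous uplinks and a 10M-parameter model
# stored in 32-bit floats.
print(bandwidth_aware_ratios([5, 10, 20, 80], model_bits=10e6 * 32))
# -> [0.01 0.02 0.04 0.16]
```

Under a rule like this, the time a fast client would otherwise spend waiting for stragglers is spent transmitting additional parameters instead.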
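The overlap-weighted aggregation behind the parameter mask can be sketched in the same spirit. This is a hypothetical illustration assuming clients send sparsified updates together with boolean retention masks; the paper's actual coefficient adjustment may additionally account for factors such as client data size.

```python
import numpy as np

def overlap_weighted_average(sparse_updates, masks):
    """Illustrative sketch (not the paper's code): average each parameter only
    over the clients that actually retained it after compression, instead of
    dividing by the total number of clients, so sparse updates are not diluted
    by zeros from clients that dropped that coordinate."""
    updates = np.stack(sparse_updates)       # shape: (num_clients, num_params)
    overlap = np.stack(masks).sum(axis=0)    # how many clients kept each param
    summed = updates.sum(axis=0)
    # Guard against division by zero where no client retained the parameter.
    return np.where(overlap > 0, summed / np.maximum(overlap, 1), 0.0)

# Example: two clients whose retained coordinates barely overlap.
u1 = np.array([0.5, 0.0, 0.2, 0.0])
u2 = np.array([0.0, 0.4, 0.6, 0.0])
print(overlap_weighted_average([u1, u2], [u1 != 0, u2 != 0]))
# -> [0.5 0.4 0.4 0. ]   (uniform averaging would halve the first two entries)
```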