{"title":"带宽感知和重叠加权压缩,实现高效通信的联合学习","authors":"Zichen Tang, Junlin Huang, Rudan Yan, Yuxin Wang, Zhenheng Tang, Shaohuai Shi, Amelie Chi Zhou, Xiaowen Chu","doi":"arxiv-2408.14736","DOIUrl":null,"url":null,"abstract":"Current data compression methods, such as sparsification in Federated\nAveraging (FedAvg), effectively enhance the communication efficiency of\nFederated Learning (FL). However, these methods encounter challenges such as\nthe straggler problem and diminished model performance due to heterogeneous\nbandwidth and non-IID (Independently and Identically Distributed) data. To\naddress these issues, we introduce a bandwidth-aware compression framework for\nFL, aimed at improving communication efficiency while mitigating the problems\nassociated with non-IID data. First, our strategy dynamically adjusts\ncompression ratios according to bandwidth, enabling clients to upload their\nmodels at a close pace, thus exploiting the otherwise wasted time to transmit\nmore data. Second, we identify the non-overlapped pattern of retained\nparameters after compression, which results in diminished client update signals\ndue to uniformly averaged weights. Based on this finding, we propose a\nparameter mask to adjust the client-averaging coefficients at the parameter\nlevel, thereby more closely approximating the original updates, and improving\nthe training convergence under heterogeneous environments. Our evaluations\nreveal that our method significantly boosts model accuracy, with a maximum\nimprovement of 13% over the uncompressed FedAvg. Moreover, it achieves a\n$3.37\\times$ speedup in reaching the target accuracy compared to FedAvg with a\nTop-K compressor, demonstrating its effectiveness in accelerating convergence\nwith compression. The integration of common compression techniques into our\nframework further establishes its potential as a versatile foundation for\nfuture cross-device, communication-efficient FL research, addressing critical\nchallenges in FL and advancing the field of distributed machine learning.","PeriodicalId":501422,"journal":{"name":"arXiv - CS - Distributed, Parallel, and Cluster Computing","volume":"25 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning\",\"authors\":\"Zichen Tang, Junlin Huang, Rudan Yan, Yuxin Wang, Zhenheng Tang, Shaohuai Shi, Amelie Chi Zhou, Xiaowen Chu\",\"doi\":\"arxiv-2408.14736\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Current data compression methods, such as sparsification in Federated\\nAveraging (FedAvg), effectively enhance the communication efficiency of\\nFederated Learning (FL). However, these methods encounter challenges such as\\nthe straggler problem and diminished model performance due to heterogeneous\\nbandwidth and non-IID (Independently and Identically Distributed) data. To\\naddress these issues, we introduce a bandwidth-aware compression framework for\\nFL, aimed at improving communication efficiency while mitigating the problems\\nassociated with non-IID data. First, our strategy dynamically adjusts\\ncompression ratios according to bandwidth, enabling clients to upload their\\nmodels at a close pace, thus exploiting the otherwise wasted time to transmit\\nmore data. 
Second, we identify the non-overlapped pattern of retained\\nparameters after compression, which results in diminished client update signals\\ndue to uniformly averaged weights. Based on this finding, we propose a\\nparameter mask to adjust the client-averaging coefficients at the parameter\\nlevel, thereby more closely approximating the original updates, and improving\\nthe training convergence under heterogeneous environments. Our evaluations\\nreveal that our method significantly boosts model accuracy, with a maximum\\nimprovement of 13% over the uncompressed FedAvg. Moreover, it achieves a\\n$3.37\\\\times$ speedup in reaching the target accuracy compared to FedAvg with a\\nTop-K compressor, demonstrating its effectiveness in accelerating convergence\\nwith compression. The integration of common compression techniques into our\\nframework further establishes its potential as a versatile foundation for\\nfuture cross-device, communication-efficient FL research, addressing critical\\nchallenges in FL and advancing the field of distributed machine learning.\",\"PeriodicalId\":501422,\"journal\":{\"name\":\"arXiv - CS - Distributed, Parallel, and Cluster Computing\",\"volume\":\"25 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Distributed, Parallel, and Cluster Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.14736\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Distributed, Parallel, and Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.14736","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning
Current data compression methods, such as sparsification in Federated
Averaging (FedAvg), effectively enhance the communication efficiency of
Federated Learning (FL). However, these methods encounter challenges such as
the straggler problem and diminished model performance due to heterogeneous
bandwidth and non-IID (not Independent and Identically Distributed) data. To
address these issues, we introduce a bandwidth-aware compression framework for
FL, aimed at improving communication efficiency while mitigating the problems
associated with non-IID data. First, our strategy dynamically adjusts each
client's compression ratio according to its bandwidth, enabling clients to
upload their models at a similar pace and exploiting the otherwise wasted
waiting time to transmit more data.
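As a rough illustration of this bandwidth-aware step (a minimal sketch under our own assumptions, not the authors' implementation): each client could choose its sparsification ratio so that the compressed upload fits a shared per-round time budget, so faster clients send more parameters instead of idling. The function name, deadline heuristic, and clipping bounds below are illustrative.

```python
def choose_compression_ratio(model_bytes, bandwidth_bps, round_deadline_s,
                             min_ratio=0.01, max_ratio=1.0):
    """Pick the fraction of parameters a client keeps so that its upload
    roughly fits the round deadline (illustrative sketch, not the paper's code).

    model_bytes     : size of the dense update in bytes
    bandwidth_bps   : client's measured upload bandwidth in bytes/second
    round_deadline_s: time budget shared by all clients for this round
    """
    budget_bytes = bandwidth_bps * round_deadline_s
    ratio = budget_bytes / model_bytes          # fraction that fits the budget
    return max(min_ratio, min(max_ratio, ratio))


# Example: clients with different uplinks keep different fractions, so their
# uploads finish at a similar time instead of waiting on stragglers.
model_bytes = 40 * 1024 * 1024                  # 40 MB dense update (assumed)
deadline = 8.0                                  # seconds per round (assumed)
for bw_mbps in (2, 10, 50):                     # heterogeneous upload bandwidths
    bw_bytes = bw_mbps * 1024 * 1024 / 8
    r = choose_compression_ratio(model_bytes, bw_bytes, deadline)
    print(f"{bw_mbps:>3} Mbps -> keep {r:.2%} of parameters")
```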
Second, we identify the non-overlapping pattern of parameters retained across
clients after compression, which weakens client update signals when the
retained weights are uniformly averaged. Based on this finding, we propose a
parameter mask that adjusts the client-averaging coefficients at the parameter
level, thereby more closely approximating the original updates and improving
training convergence in heterogeneous environments.
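The parameter-level mask can be read as overlap-weighted averaging. A sketch, assuming it reduces to dividing each coordinate by the number of clients that actually retained it rather than by the total number of clients; the function name and the toy data are ours, not the paper's.

```python
import numpy as np

def overlap_weighted_average(sparse_updates, masks, eps=1e-12):
    """Aggregate sparsified client updates with per-parameter coefficients
    (illustrative sketch, not the paper's code).

    sparse_updates: list of dense arrays where non-retained entries are zero
    masks         : list of 0/1 arrays marking which entries each client kept

    Uniform averaging divides every coordinate by the number of clients, which
    shrinks coordinates kept by only a few clients. Here each coordinate is
    instead divided by the number of clients that actually retained it.
    """
    summed = np.sum(sparse_updates, axis=0)
    retain_counts = np.sum(masks, axis=0)            # overlap per parameter
    return summed / np.maximum(retain_counts, eps)   # avoid divide-by-zero


# Toy example with two clients whose Top-K selections barely overlap.
u1 = np.array([0.4, 0.0, 0.2, 0.0])
m1 = np.array([1,   0,   1,   0  ])
u2 = np.array([0.0, 0.6, 0.2, 0.0])
m2 = np.array([0,   1,   1,   0  ])

print("uniform average: ", (u1 + u2) / 2)            # dilutes coordinates 0 and 1
print("overlap-weighted:", overlap_weighted_average([u1, u2], [m1, m2]))
```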
Our evaluations reveal that our method significantly boosts model accuracy,
with a maximum improvement of 13% over uncompressed FedAvg. Moreover, it achieves a
$3.37\times$ speedup in reaching the target accuracy compared to FedAvg with a
Top-K compressor, demonstrating its effectiveness in accelerating convergence
with compression. The integration of common compression techniques into our
framework further establishes its potential as a versatile foundation for
future cross-device, communication-efficient FL research, addressing critical
challenges in FL and advancing the field of distributed machine learning.