Model Pruning for Distributed Learning Over the Air

IEEE Transactions on Signal Processing, vol. 72, pp. 5533-5549 | Impact Factor 4.6 | JCR Q1 (Engineering, Electrical & Electronic) | CAS Zone 2 (Engineering & Technology) | Pub Date: 2024-10-24 | DOI: 10.1109/TSP.2024.3486169
Zhongyuan Zhao;Kailei Xu;Wei Hong;Mugen Peng;Zhiguo Ding;Tony Q. S. Quek;Howard H. Yang
{"title":"Model Pruning for Distributed Learning Over the Air","authors":"Zhongyuan Zhao;Kailei Xu;Wei Hong;Mugen Peng;Zhiguo Ding;Tony Q. S. Quek;Howard H. Yang","doi":"10.1109/TSP.2024.3486169","DOIUrl":null,"url":null,"abstract":"Analog over-the-air (A-OTA) computing is an effective approach to achieving distributed learning among multiple end-user devices within a bandwidth-constrained spectrum. In this paradigm, users’ intermediate parameters, such as gradients, are modulated onto a set of common waveforms and concurrently transmitted to the parameter server. Benefiting from the superposition property of multi-access channels, the server can obtain an automatically aggregated global gradient from the received signal without decoding individual user's information. Nonetheless, the scarcity of orthogonal waveforms constrains such a paradigm from adopting complex deep learning models. In this paper, we develop model pruning strategies for A-OTA distributed learning, balancing the tradeoff between communication efficiency and learning performance. Specifically, we design an importance measure to evaluate the contribution of each entry in the model parameter based on the noisy aggregated gradient introduced by A-OTA computing. We also derive an analytical expression for the training error bound, which shows that the proposed scheme can converge even when the aggregated gradient is corrupted by heavy-tailed interference with unbounded variance. We further improve the developed algorithm by incorporating the momentum method to (a) enhance the design of the importance measure and (b) accelerate the model convergence rate. Extensive experiments are conducted to validate the performance gains achieved by our proposed scheme and verify the correctness of analytical results.","PeriodicalId":13330,"journal":{"name":"IEEE Transactions on Signal Processing","volume":"72 ","pages":"5533-5549"},"PeriodicalIF":4.6000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10734153/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Analog over-the-air (A-OTA) computing is an effective approach to achieving distributed learning among multiple end-user devices within a bandwidth-constrained spectrum. In this paradigm, users’ intermediate parameters, such as gradients, are modulated onto a set of common waveforms and concurrently transmitted to the parameter server. Benefiting from the superposition property of multi-access channels, the server can obtain an automatically aggregated global gradient from the received signal without decoding individual users' information. Nonetheless, the scarcity of orthogonal waveforms constrains such a paradigm from adopting complex deep learning models. In this paper, we develop model pruning strategies for A-OTA distributed learning, balancing the tradeoff between communication efficiency and learning performance. Specifically, we design an importance measure to evaluate the contribution of each entry of the model parameters based on the noisy aggregated gradient introduced by A-OTA computing. We also derive an analytical expression for the training error bound, which shows that the proposed scheme can converge even when the aggregated gradient is corrupted by heavy-tailed interference with unbounded variance. We further improve the developed algorithm by incorporating the momentum method to (a) enhance the design of the importance measure and (b) accelerate the model convergence rate. Extensive experiments are conducted to validate the performance gains achieved by our proposed scheme and verify the correctness of analytical results.
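
To make the pipeline described in the abstract concrete, below is a minimal, self-contained Python sketch (not the authors' implementation) of its three ingredients: gradient superposition over the multi-access channel corrupted by heavy-tailed interference, a momentum-smoothed importance measure computed from the noisy aggregated gradient, and pruning of low-importance parameter entries. The function names (`ota_aggregate`, `update_importance`, `prune_mask`), the Pareto approximation of the unbounded-variance interference, and all hyperparameter values are illustrative assumptions; the paper's actual importance measure and convergence analysis are given in the full text.

```python
import numpy as np

rng = np.random.default_rng(0)

def ota_aggregate(local_grads, alpha=1.8, noise_scale=0.01):
    """Simulate A-OTA aggregation: local gradients superpose on the multi-access
    channel and the server receives their average corrupted by heavy-tailed
    interference (approximated here by symmetric Pareto noise)."""
    superposed = np.sum(local_grads, axis=0)
    magnitude = noise_scale * (rng.pareto(alpha, size=superposed.shape) + 1.0)
    sign = rng.choice([-1.0, 1.0], size=superposed.shape)
    return superposed / len(local_grads) + sign * magnitude

def update_importance(importance, noisy_grad, weights, beta=0.9):
    """Momentum-smoothed importance per parameter entry: an exponential moving
    average of |w * g| built from the noisy aggregated gradient."""
    return beta * importance + (1.0 - beta) * np.abs(weights * noisy_grad)

def prune_mask(importance, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of entries ranked by importance."""
    k = max(1, int(keep_ratio * importance.size))
    threshold = np.partition(importance.ravel(), -k)[-k]
    return (importance >= threshold).astype(importance.dtype)

# Toy setup: K devices hold least-squares problems sharing a ground-truth model.
d, K, lr = 64, 8, 0.05
w_star = rng.normal(size=d)
datasets = []
for _ in range(K):
    X = rng.normal(size=(32, d))
    datasets.append((X, X @ w_star + 0.1 * rng.normal(size=32)))

w = np.zeros(d)
importance = np.zeros(d)
for step in range(200):
    local_grads = np.stack([X.T @ (X @ w - y) / len(y) for X, y in datasets])
    g = ota_aggregate(local_grads)                    # noisy aggregated gradient
    importance = update_importance(importance, g, w)  # momentum-based importance
    mask = prune_mask(importance, keep_ratio=0.5)     # retain high-importance entries
    w = mask * (w - lr * g)                           # pruned (masked) model update

print("final loss:", np.mean([np.mean((X @ w - y) ** 2) for X, y in datasets]))
```

The Pareto-distributed noise stands in for the unbounded-variance interference mentioned in the abstract; the masked update mimics pruning by zeroing entries whose smoothed importance falls below the keep threshold.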
Source Journal
IEEE Transactions on Signal Processing (Engineering, Electrical & Electronic)
CiteScore: 11.20
Self-citation rate: 9.30%
Publications: 310
Review time: 3.0 months
Journal Description: The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.