Towards value-sensitive and poisoning-proof model aggregation for federated learning on heterogeneous data

IF 3.4 | Zone 3, Computer Science | Q1, COMPUTER SCIENCE, THEORY & METHODS | Journal of Parallel and Distributed Computing | Pub Date: 2024-10-11 | DOI: 10.1016/j.jpdc.2024.104994
Hui Zeng, Tongqing Zhou, Yeting Guo, Zhiping Cai, Fang Liu
Citations: 0

Abstract

Towards value-sensitive and poisoning-proof model aggregation for federated learning on heterogeneous data
Federated Learning (FL) enables collaborative model training without sharing data, but the traditional static averaging of local updates performs poorly on heterogeneous data. Subsequent remedies, which either schedule the data distribution or mitigate local discrepancies, predominantly fail to handle fine-grained heterogeneity (e.g., locally imbalanced labels). To begin, we reveal that static averaging causes the global model to suffer from the mean fallacy: the averaging process favors local models whose parameters are numerically large rather than those that carry more knowledge. To tackle this, we introduce FedVSA, a simple yet effective model aggregation framework that is sensitive to the merits of heterogeneous local data. Specifically, we design a new global loss function for FL that prioritizes valuable local updates, facilitating efficient convergence. We derive a softmax-based aggregation rule and prove its convergence property through rigorous theoretical analysis. Additionally, we expose model-replacement poisoning threats that exploit the mean fallacy. To mitigate this threat, we propose a two-step mechanism that audits historical local training statistics and analyzes the Shapley Value. Through extensive experiments, we show that FedVSA achieves faster convergence (~1.52×) and higher accuracy (~1.6%) than the baselines. It also effectively mitigates poisoning attacks by recovering quickly and returning to normal aggregation.
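The contrast between static averaging and a softmax-weighted aggregation can be illustrated with a minimal Python sketch. This is not the FedVSA rule itself: the abstract does not state how local updates are valued or how the softmax is parameterized, so the per-client `value_scores`, the `temperature` parameter, and all function names below are assumptions made purely for illustration.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(updates, weights):
    """Weighted sum of equally sized parameter vectors."""
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates)) for i in range(dim)]

def static_average(updates):
    """Plain averaging: every client gets the same weight 1/n."""
    n = len(updates)
    return aggregate(updates, [1.0 / n] * n)

def value_weighted(updates, value_scores, temperature=1.0):
    """Softmax-weighted aggregation over hypothetical per-client value scores."""
    return aggregate(updates, softmax(value_scores, temperature))

# Three hypothetical clients; the third submits a numerically large update.
updates = [[0.1, -0.2], [0.2, -0.1], [10.0, 10.0]]
# Hypothetical value scores (assumption): the third update is judged least valuable.
value_scores = [1.0, 1.0, -5.0]

plain = static_average(updates)                   # dominated by the large update
weighted = value_weighted(updates, value_scores)  # large update is down-weighted

print("static average:", plain)
print("softmax-weighted:", weighted)
```

With equal value scores the softmax weights reduce to 1/n, so the sketch falls back to static averaging; the two schemes differ only when the scores separate the clients.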
Source journal
Journal of Parallel and Distributed Computing (Engineering & Technology: Computer Science, Theory & Methods)
CiteScore: 10.30
Self-citation rate: 2.60%
Articles published: 172
Review time: 12 months
About the journal: This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics, again covering the full range from the design to the use of our targeted systems.