Federated adaptive pruning with differential privacy

Future Generation Computer Systems-The International Journal of Escience (IF 6.2; CAS Zone 2, Computer Science; JCR Q1, COMPUTER SCIENCE, THEORY & METHODS) Pub Date: 2025-03-05 DOI: 10.1016/j.future.2025.107783

Zhousheng Wang, Jiahe Shen, Hua Dai, Jian Xu, Geng Yang, Hao Zhou

Volume 169, Article 107783. Full text: https://www.sciencedirect.com/science/article/pii/S0167739X25000780

Citations: 0

Abstract

Federated Learning (FL) is an emerging distributed machine learning technique that reduces the computational burden on the central server through decentralization while preserving data privacy. Each iteration typically involves client sampling and local training, followed by aggregation of the model on a central server. Although this distributed approach benefits privacy, it also increases the computational load on local clients. Lightweight, efficient schemes are therefore indispensable for reducing communication and computational costs in FL. In addition, because uploaded models are exposed to model-stealing attacks, stronger privacy protection is urgently needed. In this paper, we propose Federated Adaptive Pruning (FAP), a lightweight method that integrates FL with adaptive pruning by adjusting explicit regularization. We keep the model unchanged and instead dynamically prune data from large datasets during training to reduce computational cost and enhance privacy protection. In each training round, selected clients train on their local data and prune a portion of that data before uploading the model for server-side aggregation. The remaining data are retained for subsequent computations. This approach lets selected clients refine their data quickly early in training. We also combine FAP with differential privacy to further strengthen data privacy. Through comprehensive experiments, we evaluate FAP on different datasets with basic models such as CNNs and MLPs. The results show that our method can prune the datasets significantly, reducing computational overhead with minimal loss of accuracy. Compared with previous methods, it obtains the lowest training error and further improves client-side data privacy.
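The round structure described in the abstract (client sampling, local training, data pruning before upload, server-side aggregation, plus differential privacy) can be illustrated with a minimal sketch. This is not the authors' FAP implementation. The abstract does not state the pruning criterion or the privacy mechanism, so the sketch makes several assumptions: a linear model trained with plain SGD, a hypothetical rule that drops the lowest-loss samples, and L2 clipping plus Gaussian noise on the uploaded update. All function and parameter names (`prune_data`, `privatize`, `prune_frac`, `clip_norm`, `noise_std`) are made up for illustration, and `noise_std` is not calibrated to any (ε, δ) budget.

```python
import math
import random

def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def local_train(w, data, lr=0.05, epochs=1):
    """Plain SGD on squared error over the client's retained samples."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def prune_data(w, data, prune_frac):
    """Assumed rule (not from the paper): drop the lowest-loss samples."""
    n_drop = int(len(data) * prune_frac)
    ranked = sorted(data, key=lambda s: (predict(w, s[0]) - s[1]) ** 2)
    return ranked[n_drop:]

def privatize(update, clip_norm, noise_std, rng):
    """Clip the update to clip_norm in L2 norm, then add Gaussian noise."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale + rng.gauss(0.0, noise_std) for u in update]

def run_round(global_w, clients, sample_frac, prune_frac,
              clip_norm, noise_std, rng):
    """One round: sample clients, train, prune data, upload, average."""
    k = max(1, int(len(clients) * sample_frac))
    chosen = rng.sample(range(len(clients)), k)
    updates = []
    for cid in chosen:
        local_w = local_train(global_w, clients[cid])
        # Prune before upload; the remaining data is kept for later rounds.
        clients[cid] = prune_data(local_w, clients[cid], prune_frac)
        delta = [lw - gw for lw, gw in zip(local_w, global_w)]
        updates.append(privatize(delta, clip_norm, noise_std, rng))
    # Server-side aggregation: unweighted mean of the noisy updates.
    avg = [sum(col) / len(updates) for col in zip(*updates)]
    return [gw + a for gw, a in zip(global_w, avg)]

def make_clients(n_clients, n_samples, rng):
    """Synthetic data for y = 2*x + 1, with a constant 1.0 bias feature."""
    clients = []
    for _ in range(n_clients):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
        clients.append([((x, 1.0), 2.0 * x + 1.0) for x in xs])
    return clients

if __name__ == "__main__":
    rng = random.Random(0)
    clients = make_clients(n_clients=4, n_samples=40, rng=rng)
    w = [0.0, 0.0]
    for _ in range(10):
        w = run_round(w, clients, 0.5, 0.1, 10.0, 0.01, rng)
    print("weights:", w, "samples left:", [len(c) for c in clients])
```

In this sketch only the sampled clients shrink their datasets in a given round, so per-round training cost falls as pruning proceeds while the model architecture stays fixed, which mirrors the data-pruning (rather than model-pruning) framing of the abstract.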




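Source journal

Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology; Computer Science: Theory & Methods)

CiteScore: 19.90

Self-citation rate: 2.70%

Articles published: 376

Review time: 10.6 months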

About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.