Xiaoli Chen, Youliang Tian, Shuai Wang, Kedi Yang, Wei Zhao, Jinbo Xiong
Information Sciences, vol. 700, Article 121849. Published 2025-05-01 (Epub 2025-01-06). DOI: 10.1016/j.ins.2024.121849. URL: https://www.sciencedirect.com/science/article/pii/S0020025524017638
DBFL: Dynamic Byzantine-Robust Privacy Preserving Federated Learning in Heterogeneous Data Scenario
Privacy Preserving Federated Learning (PPFL) protects the clients' local data privacy by uploading encrypted gradients to the server. However, in real-world scenarios, the heterogeneous distribution of client data makes it challenging to identify poisoning gradients. During local iterations, the models continuously move in different directions, which causes the boundary between benign and malicious gradients to persistently shift. To address these challenges, we design a Dynamic Byzantine-robust Federated Learning (DBFL) defense strategy based on Two-trapdoor Homomorphic Encryption (THE), which enables the detection of encrypted poisoning attacks in heterogeneous data scenarios. Specifically, we introduce a secure Manhattan distance method that accurately measures the differences between elements in two encrypted gradients, allowing for precise detection of poisoning attacks in heterogeneous data scenarios while maintaining privacy. Furthermore, we design a Byzantine-tolerant aggregation mechanism based on a dynamic threshold, where the threshold adapts to the continuously changing boundary between poisoning gradients and benign gradients in heterogeneous data scenarios. This enables DBFL to effectively exclude poisoning gradients even when 70% of the clients are malicious and controlled by Byzantine attackers. Security analysis demonstrates that DBFL achieves IND-CPA security. Extensive evaluations on two benchmark datasets (i.e., MNIST and CIFAR-10) show that DBFL outperforms existing defense strategies. In particular, DBFL achieves a 7%-40% accuracy improvement in the non-IID setting compared to existing solutions for defending against untargeted and targeted attacks.
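The two ideas named in the abstract, an element-wise Manhattan distance between gradients and a threshold that is recomputed as training proceeds, can be illustrated with a minimal plaintext sketch. This is not the DBFL protocol: the encryption layer (THE) is omitted entirely, and the reference vector (coordinate-wise median), the threshold rule (`scale` times the median distance), and all function names are assumptions made for illustration. A median-based reference also presumes an honest majority, so this sketch does not reproduce the 70% tolerance reported above.

```python
from statistics import median


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length gradient vectors."""
    if len(a) != len(b):
        raise ValueError("gradients must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def coordinate_median(grads):
    """Coordinate-wise median of the gradients (assumed reference point)."""
    return [median(col) for col in zip(*grads)]


def robust_aggregate(grads, scale=2.0):
    """Average the gradients whose L1 distance to the reference is within
    a threshold recomputed from the current round's distances.

    The rule threshold = scale * median(distances) is a hypothetical
    stand-in for the paper's dynamic threshold, not the authors' formula.
    """
    ref = coordinate_median(grads)
    dists = [manhattan(g, ref) for g in grads]
    threshold = scale * median(dists)  # changes from round to round
    kept = [i for i, d in enumerate(dists) if d <= threshold]
    agg = [sum(grads[i][j] for i in kept) / len(kept) for j in range(len(ref))]
    return agg, kept, threshold


# Toy round: three similar gradients and one obvious outlier.
round_grads = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [10.0, -10.0]]
aggregate, kept_clients, thr = robust_aggregate(round_grads)
```

Because the threshold is derived from the distances observed in each round rather than fixed in advance, it shifts as client gradients drift, which is the intuition the abstract describes for heterogeneous data.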
About the journal:
Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.