Title: CareFL: Contribution Guided Byzantine-Robust Federated Learning
Authors: Qihao Dong; Shengyuan Yang; Zhiyang Dai; Yansong Gao; Shang Wang; Yuan Cao; Anmin Fu; Willy Susilo
DOI: 10.1109/TIFS.2024.3477912
Journal: IEEE Transactions on Information Forensics and Security, vol. 19, pp. 9714-9729
Published: 2024-10-10 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10713463/
Byzantine-robust federated learning (FL) aims to enable service providers to obtain an accurate global model even in the presence of potentially malicious FL clients. Although considerable strides have been made in developing robust aggregation algorithms for FL in recent years, their efficacy is confined to particular forms of Byzantine attacks, and they remain vulnerable when confronted with a broader spectrum of attack vectors. Notably, a prevailing issue is these algorithms' heavy reliance on examining local model gradients: an attacker can manipulate a carefully chosen small subset of a model's gradients, in a setting where millions of gradients may be available, thereby facilitating adaptive attacks. Drawing inspiration from the Shapley value, a foundational concept in game theory, we introduce an effective FL scheme named CareFL, designed to provide robustness against a spectrum of state-of-the-art Byzantine attacks. Unlike approaches that rely on examining gradients, CareFL employs a universal, gradient-independent metric, the loss of the local model, to identify potentially malicious clients. Specifically, in each aggregation round, the FL server trains a reference model using a small auxiliary dataset (the auxiliary dataset can be removed at the cost of a slight degradation in defense), and uses the Shapley value to assess each client-submitted model's contribution to minimizing the global model loss. The server then selects the client models closest to the reference model in terms of Shapley values for the global model update. To reduce the computational overhead of CareFL when the number of clients scales up, we construct a variant, CareFL+, that groups clients. Extensive experiments on the well-established MNIST and CIFAR-10 datasets, covering diverse model architectures including AlexNet, demonstrate that CareFL consistently achieves accuracy comparable to attack-free conditions when faced with five formidable attacks. CareFL and CareFL+ outperform six existing state-of-the-art Byzantine-robust FL aggregation methods, including FLTrust, across both IID and non-IID data distribution settings.
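The Shapley-value contribution scoring described in the abstract can be illustrated with a minimal sketch. The toy below computes exact Shapley values over a handful of client updates, where a coalition's payoff is the negative loss of its averaged update against a reference; the scalar-update setting, the `payoff` function, and all names are illustrative assumptions, not the paper's actual implementation (which operates on full models and uses approximations at scale).

```python
from itertools import combinations
from math import factorial

def shapley_values(players, payoff):
    """Exact Shapley values by enumerating every coalition.

    payoff maps a set of player ids to a real-valued coalition payoff
    (here: how well the coalition's averaged update matches the reference)."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(n):  # coalition sizes 0 .. n-1 not containing p
            for subset in combinations(others, r):
                s = frozenset(subset)
                # Standard Shapley weight |S|! (n-|S|-1)! / n!
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[p] += w * (payoff(s | {p}) - payoff(s))
    return phi

# Toy aggregation round: each client submits a scalar "update"; client 2 is poisoned.
updates = {0: 1.0, 1: 1.1, 2: -5.0}
reference = 1.0  # what the server's reference model suggests

def payoff(coalition):
    """Negative squared error of the coalition's averaged update vs. the reference."""
    if not coalition:
        return 0.0
    avg = sum(updates[c] for c in coalition) / len(coalition)
    return -(avg - reference) ** 2

phi = shapley_values(list(updates), payoff)
# Honest clients earn positive contributions; the poisoner's is strongly negative,
# so the server would keep only clients 0 and 1 for the global update.
selected = [c for c, v in phi.items() if v > 0]
print(phi, selected)
```

Exact enumeration is exponential in the number of clients, which is precisely why a scheme like CareFL+ would need grouping or sampling-based approximations as the client count grows.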
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.