Momin Ahmad Khan, Virat Shejwalkar, A. Houmansadr, F. Anwar
{"title":"On the Pitfalls of Security Evaluation of Robust Federated Learning","authors":"Momin Ahmad Khan, Virat Shejwalkar, A. Houmansadr, F. Anwar","doi":"10.1109/SPW59333.2023.00011","DOIUrl":null,"url":null,"abstract":"Prior literature has demonstrated that Federated learning (FL) is vulnerable to poisoning attacks that aim to jeopardize FL performance, and consequently, has introduced numerous defenses and demonstrated their robustness in various FL settings. In this work, we closely investigate a largely over-looked aspect in the robust FL literature, i.e., the experimental setup used to evaluate the robustness of FL poisoning defenses. We thoroughly review 50 defense works and highlight several questionable trends in the experimental setup of FL poisoning defense papers; we discuss the potential repercussions of such experimental setups on the key conclusions made by these works about the robustness of the proposed defenses. As a representative case study, we also evaluate a recent poisoning recovery paper from IEEE S&P'23, called FedRecover. Our case study demonstrates the importance of the experimental setup decisions (e.g., selecting representative and challenging datasets) in the validity of the robustness claims; For instance, while FedRecover performs well for MNIST and FashionMNIST (used in the original paper), in our experiments it performed poorly for FEMNIST and CIFAR10.","PeriodicalId":308378,"journal":{"name":"2023 IEEE Security and Privacy Workshops (SPW)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE Security and Privacy Workshops (SPW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPW59333.2023.00011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Prior literature has demonstrated that Federated learning (FL) is vulnerable to poisoning attacks that aim to jeopardize FL performance, and consequently, has introduced numerous defenses and demonstrated their robustness in various FL settings. In this work, we closely investigate a largely over-looked aspect in the robust FL literature, i.e., the experimental setup used to evaluate the robustness of FL poisoning defenses. We thoroughly review 50 defense works and highlight several questionable trends in the experimental setup of FL poisoning defense papers; we discuss the potential repercussions of such experimental setups on the key conclusions made by these works about the robustness of the proposed defenses. As a representative case study, we also evaluate a recent poisoning recovery paper from IEEE S&P'23, called FedRecover. Our case study demonstrates the importance of the experimental setup decisions (e.g., selecting representative and challenging datasets) in the validity of the robustness claims; For instance, while FedRecover performs well for MNIST and FashionMNIST (used in the original paper), in our experiments it performed poorly for FEMNIST and CIFAR10.