Secure Aggregation for Federated Learning in Flower
Kwing Hei Li, P. P. B. D. Gusmão, Daniel J. Beutel, N. Lane
DOI: 10.1145/3488659.3493776
Federated Learning (FL) allows parties to learn a shared prediction model by delegating the training computation to clients and aggregating the separately trained models on the server. To prevent private information from being inferred from local models, Secure Aggregation (SA) protocols are used to ensure that the server cannot inspect individual trained models as it aggregates them. However, current implementations of SA in FL frameworks have limitations, including vulnerability to client dropouts and difficult configuration. In this paper, we present Salvia, an implementation of SA for Python users in the Flower FL framework. Based on the SecAgg(+) protocols for a semi-honest threat model, Salvia is robust against client dropouts and exposes a flexible, easy-to-use API that is compatible with various machine learning frameworks. We show that Salvia's experimental performance is consistent with SecAgg(+)'s theoretical computation and communication complexities.
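The abstract does not spell out the masking mechanism, so the following is a minimal sketch of the pairwise-masking idea underlying SecAgg(+), not Salvia's actual Flower API. Every name and size here (NUM_CLIENTS, DIM, MOD, pairwise_mask) is an illustrative assumption: each client pair derives the same mask from a shared seed, one side adds it and the other subtracts it, so the masks cancel in the server-side sum while individual updates stay hidden.

```python
# Pairwise-masking sketch of SecAgg(+): masks cancel in the aggregate.
import random

import numpy as np

NUM_CLIENTS = 4   # illustrative sizes
DIM = 8
MOD = 2**32       # all arithmetic is modulo a fixed ring size

def pairwise_mask(seed: int, dim: int) -> np.ndarray:
    """Both clients of a pair expand the shared seed into the same mask."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, MOD, size=dim, dtype=np.uint64)

# In the real protocol, shared seeds come from Diffie-Hellman key agreement
# (and are secret-shared to survive dropouts); here we simply sample them.
seeds = {(i, j): random.getrandbits(32)
         for i in range(NUM_CLIENTS) for j in range(i + 1, NUM_CLIENTS)}
updates = [np.random.randint(0, 100, DIM).astype(np.uint64)
           for _ in range(NUM_CLIENTS)]  # stand-ins for model updates

def masked_update(i: int) -> np.ndarray:
    y = updates[i].copy()
    for j in range(NUM_CLIENTS):
        if j == i:
            continue
        mask = pairwise_mask(seeds[(min(i, j), max(i, j))], DIM)
        # Lower-id client adds the pair's mask, higher-id client subtracts it.
        y = (y + mask) % MOD if i < j else (y - mask) % MOD
    return y

# The server sums masked updates; the pairwise masks cancel exactly.
agg = np.sum([masked_update(i) for i in range(NUM_CLIENTS)], axis=0) % MOD
assert np.array_equal(agg, np.sum(updates, axis=0) % MOD)
```

The dropout robustness claimed in the abstract comes from secret-sharing the seeds so that surviving clients can reconstruct the masks of dropped peers; that machinery is omitted from this sketch.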
{"title":"Secure aggregation for federated learning in flower","authors":"Kwing Hei Li, P. P. B. D. Gusmão, Daniel J. Beutel, N. Lane","doi":"10.1145/3488659.3493776","DOIUrl":"https://doi.org/10.1145/3488659.3493776","url":null,"abstract":"Federated Learning (FL) allows parties to learn a shared prediction model by delegating the training computation to clients and aggregating all the separately trained models on the server. To prevent private information being inferred from local models, Secure Aggregation (SA) protocols are used to ensure that the server is unable to inspect individual trained models as it aggregates them. However, current implementations of SA in FL frameworks have limitations, including vulnerability to client dropouts or configuration difficulties. In this paper, we present Salvia, an implementation of SA for Python users in the Flower FL framework. Based on the SecAgg(+) protocols for a semi-honest threat model, Salvia is robust against client dropouts and exposes a flexible and easy-to-use API that is compatible with various machine learning frameworks. We show that Salvia's experimental performance is consistent with SecAgg(+)'s theoretical computation and communication complexities.","PeriodicalId":343000,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125600188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Image Reconstruction Attacks on Distributed Machine Learning Models
Hadjer Benkraouda, K. Nahrstedt
DOI: 10.1145/3488659.3493779

Recent developments in Deep Neural Networks have resulted in their wide deployment for services around many aspects of human life, including security-critical domains that handle sensitive data. Congruently, we have seen a proliferation of IoT devices with limited resources. Together, these two trends have led to the distribution of data analysis, processing, and decision making between edge devices and third parties such as cloud services. In this work we assess the security of previously proposed distributed machine learning (ML) schemes by analyzing the information leaked from the output of the edge devices, i.e., the intermediate representation (IR). We particularly look at a Deep Neural Network used for video/image classification and tackle the problem of image/frame reconstruction from the output of the edge device. Our work focuses on assessing whether the proposed scheme of partitioned enclave execution is secure against chosen-image attacks (CIA). Given that the attacker can query the model under attack (the victim model) to create image-IR pairs, can the attacker reconstruct the private input images? In this work we show that it is possible to carry out a black-box reconstruction attack by training a CNN-based encoder-decoder architecture (the reconstruction model) using image-IR pairs. Our tests show that the proposed reconstruction model achieves 70% similarity between the original image and the reconstructed image.
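To make the attack setup concrete, here is a hedged PyTorch sketch of the kind of decoder an attacker might train on image-IR pairs. The class name, layer sizes, and tensor shapes (a 64x8x8 IR inverted to a 3x32x32 image) are our assumptions for illustration, not the architecture from the paper.

```python
# Train a small transposed-convolution decoder to invert IRs back to images.
import torch
import torch.nn as nn

class IRDecoder(nn.Module):
    """Maps an assumed 64x8x8 IR back to a 3x32x32 image estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),   # 16 -> 32
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, ir: torch.Tensor) -> torch.Tensor:
        return self.net(ir)

decoder = IRDecoder()
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

def train_step(image: torch.Tensor, ir: torch.Tensor) -> float:
    """One update on an (image, IR) pair collected by querying the victim."""
    opt.zero_grad()
    loss = loss_fn(decoder(ir), image)
    loss.backward()
    opt.step()
    return loss.item()
```

The attack is black-box: the victim model is only ever queried to produce IRs, never inspected, and the decoder is trained purely on the resulting pairs.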
{"title":"Image reconstruction attacks on distributed machine learning models","authors":"Hadjer Benkraouda, K. Nahrstedt","doi":"10.1145/3488659.3493779","DOIUrl":"https://doi.org/10.1145/3488659.3493779","url":null,"abstract":"Recent developments in Deep Neural Networks have resulted in their wide deployment for services around many aspects of human life, including security critical domains that handle sensitive data. Congruently, we have seen a proliferation of IoT devices with limited resources. Together, these two trends have led to the distribution of data analysis, processing, and decision making between edge devices and third parties such as cloud services. In this work we assess the security of the previously proposed distributed machine learning (ML) schemes by analyzing the information leaked from the output of the edge devices, i.e. the intermediate representation (IR). We particularly look at a Deep Neural Network that is used for video/image classification and tackle the problem of image/frame reconstruction from the output of the edge device. Our work focuses on assessing whether the proposed scheme of partitioned enclave execution is secure against chosen-image attacks (CIA). Given the attacker has the capability of querying the model under attack (victim model) to create image-IR pairs, can the attacker reconstruct the private input images? In this work we show that it is possible to carry out a black-box reconstruction attack by training a CNN based encoder-decoder architecture (reconstruction model) using image-IR pairs. Our tests show that the proposed reconstruction model achieves a 70% similarity between the original image and the reconstructed image.","PeriodicalId":343000,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123826908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FL_PyTorch: Optimization Research Simulator for Federated Learning
Konstantin Burlachenko, Samuel Horváth, Peter Richtárik
DOI: 10.1145/3488659.3493775
Federated Learning (FL) has emerged as a promising technique for edge devices to collaboratively learn a shared machine learning model while keeping training data locally on the device, thereby removing the need to store and access the full data in the cloud. However, FL is difficult to implement, test, and deploy in practice given the heterogeneity of common edge device settings, making it fundamentally hard for researchers to efficiently prototype and test their optimization algorithms. In this work, we aim to alleviate this problem by introducing FL_PyTorch: a suite of open-source software written in Python that builds on top of one of the most popular research Deep Learning (DL) frameworks, PyTorch. We built FL_PyTorch as a research simulator for FL to enable fast development, prototyping, and experimentation with new and existing FL optimization algorithms. Our system supports abstractions that provide researchers with a sufficient level of flexibility to experiment with existing and novel approaches to advance the state-of-the-art. Furthermore, FL_PyTorch is a simple-to-use console system that can run several clients simultaneously on local CPUs or GPU(s), and even on remote compute devices, without requiring any distributed implementation from the user. FL_PyTorch also offers a Graphical User Interface. For new methods, researchers only provide the centralized implementation of their algorithm. To showcase the possibilities and usefulness of our system, we experiment with several well-known state-of-the-art FL algorithms and a few of the most common FL datasets.
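The abstract does not show FL_PyTorch's actual API, so below is a generic sketch of the kind of single-process FL simulation such a tool performs: sample clients, run local SGD, and average the resulting models (FedAvg). The function name, signature, and the use of DataLoaders as stand-in clients are all assumptions for illustration.

```python
# Generic single-process FedAvg round, simulating clients sequentially.
import copy

import torch

def fedavg_round(global_model, client_loaders, local_epochs=1, lr=0.1):
    """Run local SGD on each simulated client, then average parameters."""
    client_states = []
    for loader in client_loaders:            # each "client" is a DataLoader
        local = copy.deepcopy(global_model)  # start from the global model
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        for _ in range(local_epochs):
            for x, y in loader:
                opt.zero_grad()
                torch.nn.functional.cross_entropy(local(x), y).backward()
                opt.step()
        client_states.append(local.state_dict())
    # Average every parameter tensor across clients (the FedAvg step).
    avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
           for k in client_states[0]}
    global_model.load_state_dict(avg)
    return global_model
```

A simulator in this style needs only the centralized loop above; distributing clients across local GPUs or remote machines is an execution detail the framework handles for the researcher.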
{"title":"FL_PyTorch: optimization research simulator for federated learning","authors":"Konstantin Burlachenko, Samuel Horváth, Peter Richtárik","doi":"10.1145/3488659.3493775","DOIUrl":"https://doi.org/10.1145/3488659.3493775","url":null,"abstract":"Federated Learning (FL) has emerged as a promising technique for edge devices to collaboratively learn a shared machine learning model while keeping training data locally on the device, thereby removing the need to store and access the full data in the cloud. However, FL is difficult to implement, test and deploy in practice considering heterogeneity in common edge device settings, making it fundamentally hard for researchers to efficiently prototype and test their optimization algorithms. In this work, our aim is to alleviate this problem by introducing FL_PyTorch : a suite of open-source software written in python that builds on top of one the most popular research Deep Learning (DL) framework PyTorch. We built FL_PyTorch as a research simulator for FL to enable fast development, prototyping and experimenting with new and existing FL optimization algorithms. Our system supports abstractions that provide researchers with a sufficient level of flexibility to experiment with existing and novel approaches to advance the state-of-the-art. Furthermore, FL_PyTorch is a simple to use console system, allows to run several clients simultaneously using local CPUs or GPU(s), and even remote compute devices without the need for any distributed implementation provided by the user. FL_PyTorch also offers a Graphical User Interface. For new methods, researchers only provide the centralized implementation of their algorithm. To showcase the possibilities and usefulness of our system, we experiment with several well-known state-of-the-art FL algorithms and a few of the most common FL datasets.","PeriodicalId":343000,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129078076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Doing More by Doing Less: How Structured Partial Backpropagation Improves Deep Learning Clusters
Adarsh Kumar, Kausik Subramanian, S. Venkataraman, Aditya Akella
DOI: 10.1145/3488659.3493778
Many organizations employ compute clusters equipped with accelerators such as GPUs and TPUs for training deep learning models in a distributed fashion. Training is resource-intensive, consuming significant compute, memory, and network resources. Many prior works explore how to reduce the training resource footprint without impacting quality, but their focus on a subset of the bottlenecks (typically only the network) limits their ability to improve overall cluster utilization. In this work, we exploit the unique characteristics of deep learning workloads to propose Structured Partial Backpropagation (SPB), a technique that systematically controls the amount of backpropagation at individual workers in distributed training. This simultaneously reduces network bandwidth, compute utilization, and memory footprint while preserving model quality. To efficiently leverage the benefits of SPB at the cluster level, we introduce Jigsaw, an SPB-aware scheduler that schedules Deep Learning Training (DLT) jobs at the iteration level. We find that Jigsaw can improve large-scale cluster efficiency by up to 28%.
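A hedged sketch of the core idea: limit how far gradients flow back on a given worker, so lower layers incur no backward compute and contribute no gradient traffic to synchronize. The depth-setting helper below is our illustration of "controlling the amount of backpropagation"; Jigsaw's actual iteration-level policy is not shown in the abstract.

```python
# Limit backprop to the top-k parameterized layers of a model.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

def set_backprop_depth(model: nn.Sequential, trainable_top_k: int) -> None:
    """Leave only the last `trainable_top_k` parameterized layers trainable."""
    layers = [m for m in model if any(True for _ in m.parameters())]
    for idx, layer in enumerate(layers):
        keep = idx >= len(layers) - trainable_top_k
        for p in layer.parameters():
            p.requires_grad_(keep)

# This worker backpropagates only through the final layer this iteration;
# autograd stops early because no earlier tensor requires gradients, and in
# data-parallel training the frozen layers' gradients need not be exchanged.
set_backprop_depth(model, trainable_top_k=1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
nn.functional.cross_entropy(model(x), y).backward()
print([n for n, p in model.named_parameters() if p.grad is not None])
# -> only the last Linear layer's weight and bias
```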
{"title":"Doing more by doing less: how structured partial backpropagation improves deep learning clusters","authors":"Adarsh Kumar, Kausik Subramanian, S. Venkataraman, Aditya Akella","doi":"10.1145/3488659.3493778","DOIUrl":"https://doi.org/10.1145/3488659.3493778","url":null,"abstract":"Many organizations employ compute clusters equipped with accelerators such as GPUs and TPUs for training deep learning models in a distributed fashion. Training is resource-intensive, consuming significant compute, memory, and network resources. Many prior works explore how to reduce training resource footprint without impacting quality, but their focus on a subset of the bottlenecks (typically only the network) limits their ability to improve overall cluster utilization. In this work, we exploit the unique characteristics of deep learning workloads to propose Structured Partial Backpropagation(SPB), a technique that systematically controls the amount of backpropagation at individual workers in distributed training. This simultaneously reduces network bandwidth, compute utilization, and memory footprint while preserving model quality. To efficiently leverage the benefits of SPB at cluster level, we introduce Jigsaw, a SPB aware scheduler, which does scheduling at the iteration level for Deep Learning Training(DLT) jobs. We find that Jigsaw can improve large scale cluster efficiency by as high as 28%.","PeriodicalId":343000,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131025052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rapid IoT Device Identification at the Edge
O. Thompson, A. Mandalari, H. Haddadi
DOI: 10.1145/3488659.3493777

Consumer Internet of Things (IoT) devices are increasingly common in everyday homes, from smart speakers to security cameras. Along with their benefits come potential privacy and security threats. To limit these threats, we must implement solutions that filter IoT traffic at the edge, and to this end, identifying the IoT device is the first natural step. In this paper we demonstrate a novel method for rapid IoT device identification that uses neural networks trained on device DNS traffic, which can be captured from a DNS server on the local network. The method identifies devices by fitting a model to the first seconds of DNS second-level-domain traffic following their first connection. Since security and privacy threat detection often operate at a device-specific level, rapid identification allows these strategies to be applied immediately. Through a total of 51,000 rigorous automated experiments, we classify 30 consumer IoT devices from 27 different manufacturers with 82% and 93% accuracy for product type and device manufacturer, respectively.
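As a rough illustration of the classification task (not the paper's model or data), the sketch below treats the second-level domains a device queries in its first seconds online as a bag of tokens and fits a small neural network. The domain strings, labels, and pipeline choices are all made-up assumptions.

```python
# Classify a device from the multiset of second-level domains (SLDs)
# observed right after its first connection.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Each sample is the space-joined SLD sequence from one boot capture;
# these domains and labels are fabricated examples for the sketch.
X = [
    "ntp.org amazonaws.com example-speaker.com",
    "ntp.org example-speaker.com amazonaws.com",
    "pool.ntp.org example-cam.com amazonaws.com",
    "example-cam.com pool.ntp.org amazonaws.com",
]
y = ["smart_speaker", "smart_speaker", "security_camera", "security_camera"]

clf = make_pipeline(
    CountVectorizer(token_pattern=r"[^ ]+"),  # one token per domain name
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
clf.fit(X, y)
print(clf.predict(["amazonaws.com ntp.org example-speaker.com"]))
```

Because only the first seconds of DNS traffic are needed, a deployment at the local DNS server could label a device almost as soon as it joins the network and apply device-specific filtering policies immediately.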
{"title":"Rapid IoT device identification at the edge","authors":"O. Thompson, A. Mandalari, H. Haddadi","doi":"10.1145/3488659.3493777","DOIUrl":"https://doi.org/10.1145/3488659.3493777","url":null,"abstract":"Consumer Internet of Things (IoT) devices are increasingly common in everyday homes, from smart speakers to security cameras. Along with their benefits come potential privacy and security threats. To limit these threats we must implement solutions to filter IoT traffic at the edge. To this end the identification of the IoT device is the first natural step. In this paper we demonstrate a novel method of rapid IoT device identification that uses neural networks trained on device DNS traffic that can be captured from a DNS server on the local network. The method identifies devices by fitting a model to the first seconds of DNS second-level-domain traffic following their first connection. Since security and privacy threat detection often operate at a device specific level, rapid identification allows these strategies to be implemented immediately. Through a total of 51,000 rigorous automated experiments, we classify 30 consumer IoT devices from 27 different manufacturers with 82% and 93% accuracy for product type and device manufacturers respectively.","PeriodicalId":343000,"journal":{"name":"Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114267677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}