This paper presents EdgeC, a new language for programming reactive distributed applications. It enables a separation of concerns between expressing behavior and controlling distributed aspects, inspired by aspect-oriented language design. In EdgeC, developers express functionality as sequential behaviors, and specify data allocation, reactivity, consistency, and the underlying network with orthogonal specifications. Through this separation, EdgeC allows developers to change functionality and control the shape of the resulting distributed behaviors without cross-cutting code, simplifying deployment to the edge. Developers can reason about and test their applications as sequential executions, whilst EdgeC automatically synthesizes low-level distributed code. With the help of the EdgeC run-time, it handles allocation, communication, concurrency, and coordination across the specified, potentially non-uniform, network model. We introduce the main features of EdgeC, present the new compiler design, its prototype implementation, and the resulting performance, and discuss the potential of the approach for simplifying the development of reactive applications over non-uniform networks and achieving performance gains compared to existing approaches.
Aspect-oriented language for reactive distributed applications at the edge. I. Kuraj, Armando Solar-Lezama. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394531.
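The separation of concerns described above can be illustrated with a small sketch. This is plain Python, not EdgeC (whose syntax the abstract does not show); the behavior function, the `deployment_spec` keys, and `synthesize` are all invented for illustration.

```python
def behavior(state, reading):
    """Sequential functionality: pure logic, no distribution code."""
    state["last"] = reading
    state["count"] = state.get("count", 0) + 1
    return state

# Orthogonal, declarative control of the distributed aspects named in
# the abstract; every key below is invented for illustration.
deployment_spec = {
    "allocation": {"state": "edge-node"},
    "reactivity": {"reading": "push"},
    "consistency": {"state": "eventual"},
    "network": {"edge-node": {"uplink_kbps": 256}},
}

def synthesize(behavior_fn, spec):
    """Stand-in for the EdgeC compiler: the real compiler emits low-level
    distributed code from the pair; here we only return a runnable
    closure, so changing `spec` never touches the sequential logic."""
    def deployed(state, reading):
        # ...placement and communication decisions would be driven by `spec`...
        return behavior_fn(state, reading)
    return deployed

app = synthesize(behavior, deployment_spec)
result = app({}, 21.5)   # testable as a plain sequential execution
```

The point of the sketch is that `behavior` can be tested sequentially while `deployment_spec` is changed independently, mirroring the paper's claim that functionality and distribution concerns evolve without cross-cutting code.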
Stacey Truex, Ling Liu, Ka-Ho Chow, M. E. Gursoy, Wenqi Wei
This paper presents LDP-Fed, a novel federated learning system with a formal privacy guarantee using local differential privacy (LDP). Existing LDP protocols are developed primarily to ensure data privacy in the collection of single numerical or categorical values, such as click counts in Web access logs. However, in federated learning, model parameter updates are collected iteratively from each participant and consist of high-dimensional, continuous values with high precision (10s of digits after the decimal point), making existing LDP protocols inapplicable. To address this challenge in LDP-Fed, we design and develop two novel approaches. First, LDP-Fed's LDP Module provides a formal differential privacy guarantee for the repeated collection of model training parameters in the federated training of large-scale neural networks over multiple individual participants' private datasets. Second, LDP-Fed implements a suite of selection and filtering techniques for perturbing and sharing select parameter updates with the parameter server. We validate our system, deployed with a condensed LDP protocol, in training deep neural networks on public data. We compare this version of LDP-Fed, coined CLDP-Fed, with other state-of-the-art approaches with respect to model accuracy, privacy preservation, and system capabilities.
LDP-Fed: federated learning with local differential privacy. Stacey Truex, Ling Liu, Ka-Ho Chow, M. E. Gursoy, Wenqi Wei. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394533.
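The core difficulty the abstract identifies, repeatedly releasing high-precision parameter updates under LDP, can be sketched with a generic sign-based randomized-response mechanism. This is not the paper's CLDP protocol; the epsilon value, update shapes, and aggregation step below are illustrative only.

```python
import math
import random

def ldp_perturb_update(update, epsilon):
    """Report only the sign of each parameter delta, flipped with
    probability 1/(1 + e^epsilon), which satisfies epsilon-LDP per
    parameter. A generic sign-based mechanism, not the paper's CLDP."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    perturbed = []
    for w in update:
        sign = 1.0 if w >= 0 else -1.0
        if random.random() >= p_keep:  # flip with probability 1 - p_keep
            sign = -sign
        perturbed.append(sign)
    return perturbed

def aggregate(reports):
    """Parameter server: average the noisy sign reports; the noise
    cancels in expectation as the number of participants grows."""
    n = len(reports)
    dim = len(reports[0])
    return [sum(r[i] for r in reports) / n for i in range(dim)]

random.seed(0)
true_update = [0.5, -0.2, 0.1]           # one round's parameter deltas
reports = [ldp_perturb_update(true_update, epsilon=2.0) for _ in range(200)]
agg = aggregate(reports)                 # sign pattern survives the noise
```

With 200 participants the aggregated signs match the true update's signs with high probability, which is the intuition behind collecting perturbed updates across many clients.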
Angelo Feraudo, Poonam Yadav, Vadim Safronov, Diana Andreea Popescu, R. Mortier, Shiqiang Wang, P. Bellavista, J. Crowcroft
Edge computing and Federated Learning (FL) can work in tandem to address issues related to privacy and collaborative distributed learning in untrusted IoT environments. However, deployment of FL on resource-constrained IoT devices faces challenges, including asynchronous participation of such devices in training and the need to prevent malicious devices from participating. To address these challenges we present CoLearn, which builds on the open-source Manufacturer Usage Description (MUD) implementation osMUD and the FL framework PySyft. We deploy CoLearn on resource-constrained devices in a lab environment to demonstrate (i) an asynchronous participation mechanism for IoT devices in machine learning model training using a publish/subscribe architecture, (ii) a mechanism for reducing the attack surface of the FL architecture by allowing only MUD-compliant IoT devices to participate in the training phases, and (iii) a trade-off between communication bandwidth usage, training time, and device temperature (thermal fatigue).
CoLearn. Angelo Feraudo, Poonam Yadav, Vadim Safronov, Diana Andreea Popescu, R. Mortier, Shiqiang Wang, P. Bellavista, J. Crowcroft. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394528.
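The two mechanisms in points (i) and (ii) can be sketched together: devices announce themselves asynchronously over a publish/subscribe topic, and a coordinator admits only MUD-compliant devices. This is a minimal in-memory sketch, not CoLearn's MQTT/osMUD/PySyft implementation; topic names and MUD URLs are invented.

```python
class Broker:
    """Minimal in-memory publish/subscribe broker, standing in for the
    pub/sub layer in CoLearn's architecture (topic names invented)."""
    def __init__(self):
        self.subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers.get(topic, []):
            callback(message)

class Coordinator:
    """Admits devices into a training round asynchronously, but only if
    their MUD profile is on the allow-list (the attack-surface filter)."""
    def __init__(self, broker, mud_allowlist):
        self.participants = []
        self.mud_allowlist = mud_allowlist
        broker.subscribe("fl/join", self.on_join)

    def on_join(self, device):
        if device["mud_url"] in self.mud_allowlist:
            self.participants.append(device["id"])

broker = Broker()
coordinator = Coordinator(broker, {"https://vendor.example/mud.json"})
broker.publish("fl/join", {"id": "cam-1", "mud_url": "https://vendor.example/mud.json"})
broker.publish("fl/join", {"id": "rogue", "mud_url": "https://evil.example/mud.json"})
```

Devices join whenever they come online by publishing to `fl/join`; the non-compliant device is silently excluded from the participant set.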
Dimitris Deyannis, Dimitris Karnikis, G. Vasiliadis, S. Ioannidis
The integrity of operating system (OS) kernels is of paramount importance to ensure the secure operation of user-level processes and services as well as the benign behavior of the entire system. Attackers aim to exploit a system's kernel, since compromising it provides more flexibility for malicious operations than compromising a user-level process. Acquiring access to the OS kernel enables malicious parties to manipulate process execution, control the file system and the peripheral devices, and obtain security- and privacy-critical data. Among the most effective countermeasures against rootkits are kernel integrity monitors, implemented in software (often assisted by a hypervisor) or external hardware, which aim to detect threats by scanning the kernel's state. However, modern rootkits are able to hide their presence and prevent detection by such mechanisms, either by identifying and disabling the monitors or by performing transient attacks. In this paper we present SGX-Mon, an external kernel integrity monitor that verifies the operating system's kernel integrity using a very small TCB, while requiring no OS modifications or external hardware. SGX-Mon is a snapshot-based monitor residing in user space; it utilizes the trusted execution environment offered by Intel SGX enclaves to avoid detection by rootkits and to prevent attackers from tampering with its execution and operation-critical data. Our system is able to scan, analyze, and verify arbitrary kernel memory pages and memory regions and ensure their integrity. The monitored locations can be specified by the user and can contain critical kernel code and data. SGX-Mon scans the system periodically and compares the contents of critical memory regions against their known benign values. Our experimental results show that SGX-Mon is able to achieve 100% accuracy while scanning up to 6,000 distinct kernel memory locations.
An enclave assisted snapshot-based kernel integrity monitor. Dimitris Deyannis, Dimitris Karnikis, G. Vasiliadis, S. Ioannidis. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394539.
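The snapshot-based check at the heart of such a monitor reduces to hashing monitored regions and comparing against a known-benign baseline. The sketch below shows only that comparison logic; SGX-Mon performs it over kernel memory from inside an SGX enclave, whereas here the "regions" are plain byte strings with invented names.

```python
import hashlib

class SnapshotMonitor:
    """Hash each monitored region and compare against a known-benign
    baseline. SGX-Mon performs this over kernel memory from inside an
    SGX enclave; here the 'regions' are plain byte strings."""
    def __init__(self, benign_regions):
        self.baseline = {name: hashlib.sha256(data).hexdigest()
                         for name, data in benign_regions.items()}

    def scan(self, regions):
        """Return the names of regions whose contents deviate."""
        return [name for name, data in regions.items()
                if hashlib.sha256(data).hexdigest() != self.baseline.get(name)]

benign = {"sys_call_table": b"\x01\x02\x03", "kernel_text": b"\x90\x90\xc3"}
monitor = SnapshotMonitor(benign)

tampered = dict(benign)
tampered["sys_call_table"] = b"\xde\xad\xbe"   # simulated rootkit hook
```

A periodic loop calling `scan` on fresh snapshots corresponds to the paper's periodic verification of up to 6,000 kernel memory locations.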
With edge computing emerging as a promising solution to cope with the challenges of Internet of Things (IoT) systems, there is an increasing need to automate the deployment of large-scale applications along with the publish/subscribe brokers they communicate over. Such a placement must adjust to the resource requirements of both applications and brokers in the heterogeneous environment of edge, fog, and cloud. In contrast to prior work focusing only on the placement of applications, this paper addresses the problem of jointly placing IoT applications and pub/sub brokers on a set of network nodes, considering an application provider who aims to minimize the total end-to-end delay for all its subscribers. More specifically, we devise two heuristics for the joint deployment of brokers and applications and analyze their performance in comparison to current cloud-based IoT solutions, wherein both the IoT applications and the brokers are located solely in the cloud. As an application provider should consider not only the location of the application users but also how they are distributed across different network components, we use von Mises distributions to model the degree of clustering of the users of an IoT application. Our simulations show that the performance advantage of our heuristics over cloud-based IoT operation is most pronounced under a high degree of clustering. When the users of an IoT application are in close network proximity to the IoT sensors, cloud-based IoT unnecessarily introduces latency by moving data from the edge to the cloud and back, while processing could be performed at the edge or fog layers.
On the impact of clustering for IoT analytics and message broker placement across cloud and edge. Daniel Happ, S. Bayhan. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394538.
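The role of the von Mises distribution can be made concrete with a toy placement experiment: users sit at angles on a ring, their clustering is controlled by the concentration parameter kappa, and a greedy heuristic places the broker+application at the candidate node minimising total delay. This is a much-simplified stand-in for the paper's heuristics; the ring topology and delay metric are invented.

```python
import math
import random

def sample_users(n, mu, kappa):
    """User positions on a ring, drawn from a von Mises distribution;
    larger kappa means a tighter cluster around mean direction mu."""
    return [random.vonmisesvariate(mu, kappa) for _ in range(n)]

def ring_delay(a, b):
    """Shortest angular distance, as a stand-in for end-to-end delay."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def greedy_placement(users, candidate_nodes):
    """Pick the node minimising total delay to all subscribers -- a
    much-simplified stand-in for the paper's joint placement heuristics."""
    return min(candidate_nodes,
               key=lambda node: sum(ring_delay(node, u) for u in users))

random.seed(1)
nodes = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
clustered = sample_users(100, mu=math.pi, kappa=8.0)   # highly clustered users
best = greedy_placement(clustered, nodes)              # lands near the cluster
```

With high kappa the users bunch around one direction and edge placement near that cluster wins decisively, matching the paper's finding that gains are most pronounced under strong clustering; with kappa near 0 the users spread uniformly and placement matters far less.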
Intelligent Personal Assistants (IPAs) such as Apple's Siri, Google Now, and Amazon Alexa are becoming an increasingly important class of web application. In contrast to previous keyword-oriented search applications, IPAs support a rich query interface that allows user interaction through images, audio, and natural language queries. However, modern IPAs rely heavily on compute-intensive machine-learning inference. To achieve acceptable performance, ML-driven IPAs increasingly depend on specialized hardware accelerators (e.g. GPUs, FPGAs or TPUs), increasing costs for IPA service providers. For end-users, IPAs also present considerable privacy risks given the sensitive nature of the data they capture. We present PAIGE, a hybrid edge-cloud architecture for privacy-preserving Intelligent Personal Assistants. PAIGE's design is founded on the assumption that recent advances in low-cost hardware for machine-learning inference offer an opportunity to offload compute-intensive IPA ML tasks to the network edge. To allow privacy-preserving access to large IPA databases for less compute-intensive pre-processed queries, PAIGE leverages trusted execution environments at the server side. PAIGE's hybrid design allows privacy-preserving hardware acceleration of compute-intensive tasks, while avoiding the need to move potentially large IPA question-answering databases to the edge. As a step towards realising PAIGE, we present a first systematic performance evaluation of existing edge accelerator hardware platforms for a subset of IPA workloads, and show they offer a competitive alternative to existing data-center alternatives.
PAIGE. Yilei Liang, Daniel O'keeffe, Nishanth R. Sastry. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394536.
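PAIGE's hybrid split can be sketched as a two-stage query path: compute-intensive ML inference runs on an edge accelerator, and the pre-processed query is then answered against the large QA database inside a server-side trusted execution environment. All function names and the toy components below are illustrative, not PAIGE's API.

```python
def handle_query(query, edge_infer, tee_lookup):
    """Hybrid edge-cloud path in miniature: heavy ML at the edge,
    privacy-preserving database lookup in a server-side TEE."""
    intent = edge_infer(query["audio"])   # compute-intensive inference at the edge
    return tee_lookup(intent)             # lookup against the large QA database

def toy_edge_infer(audio):
    # Stand-in for speech/intent recognition on an edge accelerator.
    return "weather_today" if b"weather" in audio else "unknown"

def toy_tee_lookup(intent):
    # Stand-in for a QA database query executed inside a TEE.
    database = {"weather_today": "Sunny, 21C"}
    return database.get(intent, "Sorry, I don't know.")

answer = handle_query({"audio": b"what is the weather"},
                      toy_edge_infer, toy_tee_lookup)
```

The split keeps raw audio off the server while avoiding replication of the potentially huge question-answering database to the edge, which is the design trade-off the abstract describes.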
Making deep learning models effective at inference today requires training with extensive amounts of labeled data gathered in a centralized system. However, gathering labeled data is an expensive and time-consuming process, centralized systems cannot aggregate an ever-increasing amount of data, and aggregating user data raises privacy concerns. Federated learning solves the data volume and privacy issues by leaving user data on devices, but is limited to use cases where labeled data can be generated from user interaction. Unsupervised representation learning reduces the amount of labeled data required for model training, but previous work is limited to centralized systems. This work introduces federated unsupervised representation learning, a novel software architecture that uses unsupervised representation learning to pre-train deep neural networks on unlabeled data in a federated setting. The pre-trained networks can be used to extract discriminative features, which help learn a downstream task of interest with a reduced amount of labeled data. Based on representation-performance experiments with human activity detection, we recommend pre-training with unlabeled data originating from more users performing a larger set of activities than those in the downstream task of interest. As a result, performance competitive with or superior to supervised deep learning is achieved.
Towards federated unsupervised representation learning. Bram van Berlo, Aaqib Saeed, T. Ozcelebi. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394530.
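The pre-train-then-fine-tune pipeline can be shown in miniature: learn a representation from plentiful unlabeled data, then fit a small head on very few labeled examples. This is a toy stand-in (a normalising "encoder" and a logistic-regression head), not the paper's DNN architecture or federated training loop; the data below is synthetic.

```python
import math

def pretrain_representation(unlabeled):
    """Unsupervised 'pre-training' in miniature: learn per-feature
    statistics from unlabeled data and use them as a normalising
    encoder. A toy stand-in for the DNN pre-training in the paper."""
    n, dim = len(unlabeled), len(unlabeled[0])
    mean = [sum(x[i] for x in unlabeled) / n for i in range(dim)]
    std = [max(1e-9, (sum((x[i] - mean[i]) ** 2 for x in unlabeled) / n) ** 0.5)
           for i in range(dim)]
    return lambda x: [(x[i] - mean[i]) / std[i] for i in range(dim)]

def train_head(encode, labeled, epochs=200, lr=0.1):
    """Fit a tiny logistic-regression head on very few labeled examples,
    on top of the frozen representation."""
    dim = len(labeled[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in labeled:
            z = encode(x)
            p = 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            g = p - y  # gradient of the log loss
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return lambda x: int(sum(wi * zi for wi, zi in zip(w, encode(x))) + b > 0)

# Plenty of unlabeled data, only two labels: the regime the paper targets.
unlabeled = [[i, 100 - i] for i in range(100)]
labeled = [([10, 90], 0), ([90, 10], 1)]
predict = train_head(pretrain_representation(unlabeled), labeled)
```

In a federated version, `pretrain_representation` would run on-device and only the learned representation parameters would be aggregated, never the raw data.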
With the idea of exploiting all the computational resources that an IoT environment with multiple interconnected devices offers, serverkernel is presented as a new operating system architecture that blends ideas from distributed operating systems, unikernels, and lightweight kernels (LWKs). These concepts are combined with a server to which a user can remotely offload computations and from which the result is returned. This single address-space operating system (OS) can be interpreted as a bare-metal OS in which only drivers for the CPU, network, and accelerators are required in order to provide service. To demonstrate the advantages of serverkernel, jonOS, an open-source C implementation of this architecture for the Raspberry Pi, is provided. Compared with commercial architectures used in IoT devices, serverkernel achieves improvement ratios of 1.5x in CPU time and 2.5x in real time, and around 9x better network speed.
The serverkernel operating system. Jon Larrea, A. Barbalace. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394537.
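The offload idea behind a serverkernel, a client ships a computation request to the OS-as-a-server and gets the result back, can be sketched as a tiny request/reply protocol. The wire format and service names below are invented for illustration; jonOS implements this bare-metal in C on a Raspberry Pi, not in Python.

```python
import json

# Hypothetical service table: computations the serverkernel exposes.
KERNEL_SERVICES = {
    "sum": lambda args: sum(args),
    "dot": lambda args: sum(a * b for a, b in zip(args[0], args[1])),
}

def serverkernel_handle(request_bytes):
    """Server side: decode the request, run the service, encode the reply."""
    request = json.loads(request_bytes.decode())
    result = KERNEL_SERVICES[request["op"]](request["args"])
    return json.dumps({"result": result}).encode()

def offload(op, args):
    """Client side: ship the computation and wait for the result; the
    direct call below stands in for a network round trip."""
    reply = serverkernel_handle(json.dumps({"op": op, "args": args}).encode())
    return json.loads(reply.decode())["result"]
```

Because the server only needs CPU, network, and accelerator drivers to run such a dispatch loop, it can live in a single address space directly on the hardware, which is the bare-metal reading the abstract gives.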
Transmitting data to cloud datacenters in distributed IoT applications introduces significant communication latency, but is often the only feasible solution when source nodes are computationally limited. To address latency concerns, cloudlets, in-network computing, and more capable edge nodes are all being explored as ways of moving processing capability towards the edge of the network. Hardware acceleration using field-programmable gate arrays (FPGAs) is also seeing increased interest due to reduced computation time and improved efficiency. This paper evaluates the implications of these offloading approaches using a case-study neural-network-based image classification application, quantifying both the computation and communication latency resulting from different platform choices. We demonstrate that emerging in-network accelerator approaches offer much-improved and predictable performance, as well as better scaling to support multiple data sources.
Quantifying the latency benefits of near-edge and in-network FPGA acceleration. Ryan A. Cooke, Suhaib A. Fahmy. Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020-04-27. DOI: 10.1145/3378679.3394534.
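The kind of accounting the paper performs, total latency as the sum of communication and computation components, can be captured in a simple model. The bandwidth, RTT, and compute figures below are illustrative assumptions, not measurements from the paper.

```python
def total_latency(data_bytes, bandwidth_bps, rtt_s, compute_s):
    """End-to-end latency of offloaded inference: input transfer time
    plus network round-trip time plus platform compute time."""
    return data_bytes * 8 / bandwidth_bps + rtt_s + compute_s

# Illustrative numbers for a ~150 kB image classification request;
# these are NOT measurements from the paper.
image_bytes = 150_000
cloud = total_latency(image_bytes, bandwidth_bps=50e6, rtt_s=0.040, compute_s=0.005)
edge_cpu = total_latency(image_bytes, bandwidth_bps=100e6, rtt_s=0.002, compute_s=0.080)
in_net_fpga = total_latency(image_bytes, bandwidth_bps=1e9, rtt_s=0.001, compute_s=0.004)
```

Even with fast cloud compute, transfer and round-trip time dominate; an in-network FPGA wins under these assumptions because it cuts both the communication and the computation terms, which mirrors the paper's qualitative conclusion.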
Yuchen Zhao, H. Haddadi, Severin Skillman, Shirin Enshaeifar, P. Barnaghi
Activity recognition using deep learning and sensor data can help monitor the activities and health conditions of people who need assistance in their daily lives. Deep neural network (DNN) models that infer these activities require data collected by in-home sensory devices. These data are often sent to a centralised cloud to be used for training the model. Centralising the data introduces privacy risks: the collected data contain sensitive information about the subjects, and the cloud-based approach increases the risk that the data will be stored and reused for other purposes without the owner's control. We propose a system that uses edge devices to implement activity and health monitoring locally and applies federated learning to facilitate the training process. The devices use the Databox platform to manage sensor data collected in people's homes, conduct activity recognition locally, and collaboratively train a DNN model without transferring the collected data into the cloud. We demonstrate the feasibility of this approach by evaluating the processing time of activity recognition on edge devices. We use a hierarchical model in which a global model is generated in the cloud, without requiring the raw data, and local models are trained on edge devices. The activity inference accuracy of the global model converges to a sufficient level after a few rounds of communication between edge devices and the cloud.
{"title":"Privacy-preserving activity and health monitoring on databox","authors":"Yuchen Zhao, H. Haddadi, Severin Skillman, Shirin Enshaeifar, P. Barnaghi","doi":"10.1145/3378679.3394529","DOIUrl":"https://doi.org/10.1145/3378679.3394529","journal":"Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking","publicationDate":"2020-04-27"}
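The abstract above describes a hierarchical scheme: devices train on private data locally and the cloud aggregates only model parameters, never raw sensor readings. A minimal FedAvg-style sketch of that training loop, using a toy one-parameter linear model; the function names and the plain averaging rule are illustrative assumptions, not the authors' implementation:

```python
# Hedged sketch of hierarchical federated training: edge devices run
# local gradient steps on private data; the cloud averages the returned
# weights (FedAvg-style) to form the next global model.

def local_update(weights, local_data, lr=0.1):
    """One pass of simplified gradient descent on a device's private data.

    Toy linear model: minimise (w*x - y)^2 per sample."""
    w = weights
    for x, y in local_data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def federated_round(global_w, device_datasets):
    """Each device trains locally; the cloud averages the returned weights.

    Raw (x, y) readings never leave the devices."""
    local_ws = [local_update(global_w, d) for d in device_datasets]
    return sum(local_ws) / len(local_ws)

# Three homes, each holding its own sensor readings locally (true w = 2).
devices = [[(1.0, 2.0)], [(2.0, 4.0)], [(0.5, 1.0)]]
w = 0.0
for _ in range(20):
    w = federated_round(w, devices)
print(round(w, 2))  # -> 2.0: the global model converges over a few rounds
```

The global weight converges to the shared optimum even though the cloud only ever sees per-device weights, which is the same privacy argument the abstract makes for transmitting model updates instead of in-home sensor data.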