Due to outsourced manufacturing, the semiconductor industry must deal with various hardware threats such as piracy and overproduction. To prevent illegal electronic products from functioning, the circuit can be encrypted using a protected key known only to the designer. However, an attacker can still decipher the secret key using a functioning circuit bought from the market and the encrypted layout leaked from an untrusted foundry. In this paper, after introducing the essential conformity and mutuality features for secure logic encryption, we propose DLE, a novel Distributed Logic Encryption design that resists all known oracle-guided and structural attacks, including the newly proposed fault-aided SAT-based attack that iteratively injects a single stuck-at fault to thwart the locking effect. DLE forces the attacker to insert multiple stuck-at faults simultaneously at critical points to obtain a smaller but meaningful encrypted circuit, thus exponentially reducing the chance of hitting all the critical points with properly located stuck-at fault injections. Our experiments confirm that DLE maintains an exponentially high degree of security under diverse attacks with polynomial area and linear performance overheads.
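The general idea of encrypting a circuit with a designer-held key can be illustrated with a toy example. The sketch below is not DLE; it assumes the classic XOR/XNOR key-gate style of logic locking, and the circuit, the function names, and the 2-bit key are all hypothetical.

```python
from itertools import product

def original(a, b, c):
    # Toy combinational function to protect: (a AND b) OR c
    return (a & b) | c

def locked(a, b, c, k0, k1):
    # Hypothetical locked netlist with two key gates.
    n1 = (a & b) ^ k0      # XOR key gate: transparent only when k0 == 0
    n2 = 1 ^ (c ^ k1)      # XNOR key gate: transparent only when k1 == 1
    return n1 | n2

def key_is_functional(key):
    # A key is functional if the locked circuit matches the original
    # on every input pattern (exhaustive check, feasible only for toys).
    return all(original(a, b, c) == locked(a, b, c, *key)
               for a, b, c in product((0, 1), repeat=3))

# Enumerate the whole key space and keep the keys that unlock the circuit.
functional_keys = [k for k in product((0, 1), repeat=2) if key_is_functional(k)]
print(functional_keys)
```

In this toy, only one of the four keys restores the original function. An oracle-guided attacker plays the same game at scale, comparing the locked netlist against a working chip instead of a known `original`.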
{"title":"Distributed Logic Encryption: Essential Security Requirements and Low-Overhead Implementation","authors":"Raheel Afsharmazayejani, H. Sayadi, Amin Rezaei","doi":"10.1145/3526241.3530372","DOIUrl":"https://doi.org/10.1145/3526241.3530372","url":null,"abstract":"Due to outsource manufacturing, the semiconductor industry must deal with various hardware threats such as piracy and overproduction. To prevent illegal electronic products from functioning, the circuit can be encrypted using a protected key only known to the designer. However, an attacker can still decipher the secret key utilizing a functioning circuit bought from the market, and the encrypted layout leaked from an untrusted foundry. In this paper, after introducing essential conformity and mutuality features for secure logic encryption, we propose DLE, a novel Distributed Logic Encryption design that resists against all known oracle guided and structural attacks including the newly proposed fault-aided SAT-based attack that iteratively injects a single stuck-at fault to thwart the locking effect. DLE forces the attacker to insert multiple stuck-at faults simultaneously in critical points to achieve a smaller but meaningful encrypted circuit; thus, exponentially reducing the chance to hit all the critical points with properly located stuck-at fault injections. 
Our experiments confirm that DLE maintains an exponentially high degree of security under diverse attacks with the polynomial area and linear performance overheads.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123061904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manoj Gopale, G. Ditzler, Roman L. Lysecky, Janet Roveda
Side-channel attacks (SCAs) have been studied for several decades, resulting in many techniques that use statistical models to extract system information from side channels. More recently, machine learning has shown significant promise to advance the ability of SCAs to expose vulnerabilities. Artificial neural networks (ANNs) can effectively learn nonlinear relationships between features within a side channel. In this paper, we propose a multi-architecture data aggregation technique to profile power traces for a system with an embedded processor, based on three types of deep NNs: multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). This is one of the first works to explore the inter-architecture portability of NNs and SCAs. We demonstrate the robustness of ANNs performing power-based SCAs on multiple architecture configurations with different architectural features, such as L1/L2 cache size and associativity, and system memory size. We provide a comprehensive set of benchmarks to demonstrate that architecturally identical devices are not essential for profile-based SCAs.
{"title":"Inter-Architecture Portability of Artificial Neural Networks and Side Channel Attacks","authors":"Manoj Gopale, G. Ditzler, Roman L. Lysecky, Janet Roveda","doi":"10.1145/3526241.3530356","DOIUrl":"https://doi.org/10.1145/3526241.3530356","url":null,"abstract":"Side-channel attacks (SCA) have been studied for several decades, which resulted in many techniques that use statistical models to extract system information from side channels. More recently, machine learning has shown significant promise to advance the ability for SCAs to expose vulnerabilities. Artificial neural networks (ANN) can effectively learn nonlinear relationships between features within a side channel. In this paper, we propose a multi-architecture data aggregation technique to profile power traces for a system with an embedded processor that is based on three types of deep NNs, namely, multi-layer perceptrons (MLP), convolutional neural networks (CNN), and recurrent neural networks (RNN). This is one of the first works to explore the inter-architecture portability of NNs and SCAs. We demonstrate the robustness of the ANNs performing power-based SCAs on multiple architecture configurations with different architectural features, such as L1/L2 caches' size and associativity, and system memory size. 
We provide a comprehensive set of benchmarks to demonstrate that architecturally identical devices are not essential for profile-based SCAs","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128808045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional synthesis algorithms transform a behavioral RTL design into a standard-cell-mapped gate-level netlist, with support for customizing the optimization effort of a few operators. HDL description standards and current synthesis methods lack support for generating netlists of custom functions for quick validation and characterization of the design. Additionally, synthesis does not cater directly to various mathematical functions; design effort towards approximating the desired function is needed. Hence, a synthesis method for realizing circuits applicable not only to arithmetic but also to non-linear functions will be highly valuable to the VLSI design community. This work employs the Cartesian Genetic Programming (CGP) algorithm, an evolutionary design methodology suitable for synthesizing digital circuits. CGP accelerates the design process and makes it easy to realize complex functions with little to no design effort. Activation functions are difficult to realize as combinational circuits using traditional design methods; this work validates the synthesis results for 6 non-linear activation functions using both classical and standard-cell-synthesis-oriented CGP. The ability to incorporate such unconventional designs into the traditional synthesis flow will be instrumental for implementing accelerators in the hardware space, and eventually for the efficient design of heterogeneous SoC systems.
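To make the CGP representation concrete, here is a minimal sketch of how a genotype can be decoded into a gate-level circuit and scored against a truth table. The gate set, the `(function, input, input)` encoding, and the half-adder target are illustrative assumptions, not the configuration used in the paper, and the evolutionary search loop is omitted.

```python
from itertools import product

# Hypothetical gate library: function id -> two-input Boolean function.
FUNCS = {
    0: lambda a, b: a & b,        # AND
    1: lambda a, b: a | b,        # OR
    2: lambda a, b: a ^ b,        # XOR
    3: lambda a, b: 1 ^ (a & b),  # NAND
}

def evaluate(genotype, outputs, inputs):
    # Signals are indexed: primary inputs first, then one per node.
    # Each node may only read signals with a smaller index (feed-forward).
    values = list(inputs)
    for func, i, j in genotype:
        values.append(FUNCS[func](values[i], values[j]))
    return [values[o] for o in outputs]

def fitness(genotype, outputs, target, n_inputs):
    # Number of input patterns on which the circuit matches the target.
    return sum(
        evaluate(genotype, outputs, bits) == list(target(*bits))
        for bits in product((0, 1), repeat=n_inputs)
    )

# Candidate: node 2 = XOR(in0, in1), node 3 = AND(in0, in1).
half_adder = [(2, 0, 1), (0, 0, 1)]
score = fitness(half_adder, [2, 3], lambda a, b: (a ^ b, a & b), 2)
print(score)
```

An evolutionary loop would mutate the integers in the genotype and keep offspring whose fitness does not decrease. A standard-cell-oriented variant would presumably restrict `FUNCS` to cells available in the target library.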
{"title":"Evolutionary Standard Cell Synthesis of Unconventional Designs","authors":"C. PrashanthH., M. Rao","doi":"10.1145/3526241.3530353","DOIUrl":"https://doi.org/10.1145/3526241.3530353","url":null,"abstract":"Conventional synthesis algorithms transform the behavioral RTL design to a standard cell mapped gate level netlist, with support to customize optimization effort of few operators. HDL description standards and current synthesis methods lack support to generate netlist of custom functions for quick validation and characterization of the design. Additionally, synthesis does not cater directly to various mathematical functions, design efforts towards approximating the desired function is needed. Hence a synthesis method for realizing circuits applicable to not only arithmetic but also to non-linear functions will be highly valuable and appreciated among the VLSI design community. This work employs Cartesian Genetic Programming (CGP) algorithm, an evolutionary design methodology suitable to synthesize digital circuits. CGP benefits in accelerating the design process and offers the ease to realize complex functions with little to no design effort. Activation functions are difficult to realize as combinational circuits using traditional design methods, this work validates the synthesis results for 6 non-linear activation functions using both classical and standard cell synthesis oriented CGP. 
The ability to incorporate such unconventional designs to the traditional synthesis flow will be instrumental for implementing accelerators in hardware space, and eventually for efficient design of heterogeneous SoC systems.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125737583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past decade, a few side-channel attacks (SCAs) and countermeasures against implementations of Elliptic-Curve Cryptography (ECC), commonly used in embedded systems and Internet-of-Things (IoT) devices, have been presented. This work discovers a new side-channel power leakage in an ECDH hardware implementation protected against existing attacks, where the power leakage is not directly related to the key bits but to the differential of two consecutive key bits. We propose an unsupervised differential-bit horizontal clustering attack and implement it against an ECDH FPGA implementation. We also comprehensively analyze the related operations and circuits, and identify the root cause of such leakage as the different arrival times of inputs to combinational circuits. Such leakage generally exists in ECC hardware implementations, on both FPGAs and ASICs. We further propose several effective countermeasures to address this new vulnerability and evaluate the implementations.
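A short sketch can show why leakage of the differential between consecutive key bits is nearly as damaging as leakage of the bits themselves. It assumes the differential is the XOR of neighbouring bits and that an unsupervised clustering step has already labelled each position with one of two arbitrary cluster ids. Both are assumptions made for illustration, and the function names are hypothetical.

```python
def differential_bits(key_bits):
    # d[i] = k[i] XOR k[i+1]: 1 where consecutive key bits differ.
    return [a ^ b for a, b in zip(key_bits, key_bits[1:])]

def candidate_keys(cluster_labels):
    # Clustering is unsupervised, so it is unknown which cluster means
    # "bits differ" (label polarity), and the first key bit is unknown too.
    # That leaves only 2 x 2 = 4 candidate keys to test.
    candidates = []
    for flip in (0, 1):
        for first in (0, 1):
            bits = [first]
            for label in cluster_labels:
                bits.append(bits[-1] ^ label ^ flip)
            candidates.append(bits)
    return candidates

secret = [1, 0, 1, 1, 0, 0, 1, 0]
labels = differential_bits(secret)
swapped = [x ^ 1 for x in labels]  # same clustering, opposite label naming
print(len(candidate_keys(labels)))
```

Under these assumptions, the remaining ambiguity is constant regardless of key length, so each candidate can simply be checked against a known public value.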
{"title":"Protected ECC Still Leaks: A Novel Differential-Bit Side-channel Power Attack on ECDH and Countermeasures","authors":"Tianhong Xu, Cheng Gongye, Yunsi Fei","doi":"10.1145/3526241.3530342","DOIUrl":"https://doi.org/10.1145/3526241.3530342","url":null,"abstract":"Over the past decade, a few side-channel attacks (SCAs) and countermeasures against implementations of Elliptic-Curve Cryptography (ECC), commonly used in embedded systems and Internet-of- Things (IoT) devices, have been presented. This work discovers a new side-channel power leakage of an ECDH hardware implementation protected against existing attacks, where the power leakage is not directly related to the key bits, but related to the differential of two consecutive key bits. We propose an unsupervised differential-bit horizontal clustering attack and implement it against an ECDH FPGA implementation. We also comprehensively analyze the related operations and circuits, and identify the root cause of such leakage is due to the different arrival times of inputs to combinational circuits. Such leakage generally exists in ECC hardware implementations, including FPGA and ASIC. We further propose several effective countermeasures to address this new vulnerability and evaluate the implemetations.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"48 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128526753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 5A: Hardware Security","authors":"K. Gaj","doi":"10.1145/3542690","DOIUrl":"https://doi.org/10.1145/3542690","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132719744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tanmoy Chowdhury, Ashka Vakil, B. S. Latibari, Seyed Aresh Beheshti Shirazi, Ali Mirzaeian, Xiaojie Guo, Sai Manoj Pudukotai Dinakarrao, H. Homayoun, I. Savidis, Liang Zhao, Avesta Sasan
This paper presents RAPTA, a customized Representation-learning Architecture for automating feature engineering and predicting the result of Path-based Timing-Analysis early in the physical design cycle. RAPTA offers multiple advantages over prior work: 1) it has superior accuracy, with the standard deviation of the error ranging from 3.9 ps to 16.05 ps in 32 nm technology; 2) its architecture does not change with feature-set size; and 3) it does not require manual input feature engineering. To the best of our knowledge, this is the first work in which Bidirectional Long Short-Term Memory (Bi-LSTM) representation learning is used to digest raw information for feature engineering, and in which the generation of latent features and the Multilayer Perceptron (MLP) based regression for timing prediction can be trained end-to-end.
{"title":"RAPTA: A Hierarchical Representation Learning Solution For Real-Time Prediction of Path-Based Static Timing Analysis","authors":"Tanmoy Chowdhury, Ashka Vakil, B. S. Latibari, Seyed Aresh Beheshti Shirazi, Ali Mirzaeian, Xiaojie Guo, Sai Manoj Pudukotai Dinakarrao, H. Homayoun, I. Savidis, Liang Zhao, Avesta Sasan","doi":"10.1145/3526241.3530831","DOIUrl":"https://doi.org/10.1145/3526241.3530831","url":null,"abstract":"This paper presents RAPTA, a customized Representation-learning Architecture for automation of feature engineering and predicting the result of Path-based Timing-Analysis early in the physical design cycle. RAPTA offers multiple advantages compared to prior work: 1) It has superior accuracy with errors std ranges 3.9ps~16.05ps in 32nm technology. 2) RAPTA's architecture does not change with feature-set size, 3) RAPTA does not require manual input feature engineering. To the best of our knowledge, this is the first work, in which Bidirectional Long Short-Term Memory (Bi-LSTM) representation learning is used to digest raw information for feature engineering, where generation of latent features and Multilayer Perceptron (MLP) based regression for timing prediction can be trained end-to-end.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"19 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132974418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raising the level of VLSI design abstraction to the behavioral level allows different micro-architectures to be generated from the same behavioral description by simply setting different synthesis options. These are typically synthesis directives in the form of pragmas that control how to synthesize arrays, loops, and functions. Out of all the combinations, the designer is typically only interested in the synthesis directive combinations that lead to Pareto-optimal designs. Unfortunately, this multi-objective optimization problem grows supra-linearly with the number of explorable operations. Thus, fast heuristics are needed. One additional way to accelerate the exploration process is to parallelize the explorer by creating multi-threaded versions. The main problem with this approach is that every time a new pragma combination is generated, the explorer must invoke the HLS process to evaluate the effect of these synthesis options on the resultant design. This tool invocation requires checking out an HLS tool license that will not be released until the HLS process has finished. This implies that the maximum number of parallel threads is limited by the number of licenses available. In the ASIC case, these licenses are extremely expensive, often making it prohibitive for some companies to have more than one. In contrast, FPGA vendors provide their HLS tools for free. Thus, it is tempting to investigate whether FPGA HLS tools can be used to find the ASIC Pareto-optimal designs. To address this, we present a dedicated multi-threaded parallel HLS DSE explorer that accelerates HLS DSE for ASICs by first targeting FPGAs and then using machine learning to convert the exploration results obtained into the optimal ASIC equivalent. Experimental results show that our proposed approach is very efficient, speeding up the exploration process considerably.
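The two mechanics described above, a worker pool capped by the number of tool licenses and the extraction of Pareto-optimal designs, can be sketched as follows. `fake_hls_run` is a hypothetical stand-in for a real HLS invocation, and its area and latency formulas are invented for illustration. The machine-learning step that maps FPGA results to ASIC results is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_hls_run(config):
    # Hypothetical cost model: config = (unroll_factor, partition_factor).
    # A real explorer would call the HLS tool here and hold a license
    # for the duration of the run.
    unroll, partition = config
    area = 100 * unroll + 40 * partition
    latency = 64 // unroll + 8 // partition
    return area, latency

def pareto_front(points):
    # Keep points not dominated by any other point (smaller is better
    # for both area and latency).
    front = [
        p for p in points
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)
    ]
    return sorted(set(front))

def explore(configs, licenses):
    # max_workers models the license limit on concurrent HLS runs.
    with ThreadPoolExecutor(max_workers=licenses) as pool:
        results = list(pool.map(fake_hls_run, configs))
    return pareto_front(results)

configs = [(u, p) for u in (1, 2, 4) for p in (1, 2)]
front = explore(configs, licenses=2)
print(front)
```

With free FPGA tools, `licenses` can be raised to the number of available cores. With a single ASIC license it would be 1 and the exploration becomes serial.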
{"title":"Fast Parallel High-Level Synthesis Design Space Explorer: Targeting FPGAs to accelerate ASIC Exploration","authors":"M. I. Rashid, B. C. Schafer","doi":"10.1145/3526241.3530339","DOIUrl":"https://doi.org/10.1145/3526241.3530339","url":null,"abstract":"Raising the level of VLSI design abstraction to the behavioral level allows to generate different micro-architectures from the same behavioral description by simply setting different synthesis options. These are typically synthesis directives in the form of pragmas that control how to synthesize arrays, loops, and functions. Out of all the combinations the designer is typically only interested in the synthesis directive combinations that lead to the Pareto-optimal designs. Unfortunately this multi-objective optimization problem grows supra-linearly with the number of the explorable operations. Thus, fast heuristics are needed. One additional way to accelerate the exploration process is by parallelizing the explorer tcreating multi-threaded versions. The main problem with this approach is that every time that a new pragma combination is generated the explorer requires to invoke the HLS process in order to evaluate the effect of these synthesis options on the resultant design. This tool invocation requires to check out a HLS tool license that will not be released until the HLS process has finished. This implies that the maximum number of parallel threads is limited by the number of licenses available. In the ASIC case, these licenses are extremely expensive, making it often prohibitory for some companies to have more than one. On contrary FPGA vendors provide their HLS tools free. Thus, it is tempting to investigate if FPGA HLS tools can be used to find the ASIC Pareto-optimal designs. 
To address this, in this work we present a dedicated multi-threaded parallel HLS DSE explorer that is able to accelerate HLS DSE for ASICs by targeting first FPGAs and using machine learning to convert the exploration results obtained to find the optimal ASIC equivalent. Experimental results show that our proposed approach is very efficient speedup up the exploration process considerably.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132655176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 2B: Computer-Aided Design (CAD)","authors":"E. Salman","doi":"10.1145/3542685","DOIUrl":"https://doi.org/10.1145/3542685","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129488198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional computing systems based on von Neumann architectures are fundamentally bottlenecked by the transfer speeds between memory and processor. With the growing computational needs of today's application space, dominated by Machine Learning (ML) workloads, there is a need to design special-purpose computing systems operating on the principle of co-located memory and processing units. Such an approach, commonly known as 'in-memory computing', can potentially eliminate expensive data movement costs by computing inside the memory array itself. To that effect, crossbars based on resistive switching Non-Volatile Memory (NVM) devices have shown immense promise as the building blocks of in-memory computing systems, as their high storage density can overcome the scaling challenges that plague CMOS technology today. In addition, the ability of resistive crossbars to accelerate the main computational kernel of ML workloads by performing massively parallel, in-situ matrix-vector multiplication (MVM) operations makes them a promising candidate for building area- and energy-efficient systems. However, the analog nature of computing in resistive crossbars introduces approximations in MVM computations due to device- and circuit-level non-idealities. Further, analog systems impose costly peripheral circuit requirements for conversions between the analog and digital domains. Thus, there is a need to understand the entire system design stack, from device characteristics to architectures, and to perform effective hardware-software co-design to truly realize the potential of resistive crossbars as future computing systems. In this talk, we will present a comprehensive overview of NVM crossbars for accelerating ML workloads. We describe, in detail, the design principles of the basic building blocks, such as the device and associated circuits, that constitute the crossbars. We explore non-idealities arising from the device characteristics and circuit behavior and study their impact on the MVM functionality of NVM crossbars for machine learning hardware.
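The effect of analog non-idealities on an MVM can be illustrated with a small behavioural model. The sketch below is a deliberate simplification: it assumes multiplicative Gaussian variation on each stored weight and a uniform ADC with clipping at the output, and every parameter name and default value is hypothetical rather than taken from the talk.

```python
import random

def ideal_mvm(weights, x):
    # Exact digital matrix-vector product, one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def crossbar_mvm(weights, x, sigma=0.05, adc_bits=6, full_scale=4.0, seed=0):
    rng = random.Random(seed)
    step = 2 * full_scale / 2 ** adc_bits   # ADC resolution
    outputs = []
    for row in weights:
        # Each stored weight deviates from its target by a random factor
        # (assumed model of device-to-device variation).
        analog = sum(w * (1 + rng.gauss(0, sigma)) * xi
                     for w, xi in zip(row, x))
        # The peripheral ADC clips to its input range and quantizes.
        clipped = max(-full_scale, min(full_scale, analog))
        outputs.append(round(clipped / step) * step)
    return outputs

W = [[0.5, -0.25, 1.0], [0.1, 0.2, 0.3]]
x = [1.0, 0.5, -1.0]
exact = ideal_mvm(W, x)
noisy = crossbar_mvm(W, x)
print(exact, noisy)
```

Sweeping `sigma` and `adc_bits` in a model like this is one simple way to see how device variation and converter resolution trade off against accuracy, which is the kind of hardware-software co-design question the abstract raises.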
{"title":"In-Memory Computing based Machine Learning Accelerators: Opportunities and Challenges","authors":"K. Roy","doi":"10.1145/3526241.3530051","DOIUrl":"https://doi.org/10.1145/3526241.3530051","url":null,"abstract":"Traditional computing systems based on von Neumann architectures are fundamentally bottle-necked by the transfer speeds between memory and processor. With growing computational needs of today's application space, dominated by Machine Learning (ML) workloads, there is a need to design special purpose computing systems operating on the principle of co-located memory and processing units. Such an approach, commonly known as 'In-memory computing', can potentially eliminate expensive data movement costs by computing inside the memory array itself. To that effect, crossbars based on resistive switching Non-Volatile Memory (NVM) devices has shown immense promise in serving as the building blocks of in-memory computing systems, as their high storage density can overcome scaling challenges that plague CMOS technology today. Adding to that, the ability of resistive crossbars to accelerate the main computational kernel of ML workloads by performing massively parallel, in-situ matrix vector multiplication (MVM) operations, makes them a promising candidate for building area and energy-efficient systems. However, the analog computing nature in resistive crossbars introduce approximations in MVM computations due to device and circuit level nonidealities. Further, analog systems pose high cost peripheral circuit requirements for conversions between the analog and digital domain. Thus, there is a need to understand the entire system design stack, from device characteristics to architectures, and perform effective hardware-software co-design to truly realize the potential of resistive crossbars as future computing systems. In this talk, we will present a comprehensive overview of NVM crossbars for accelerating ML workloads. 
We describe, in detail, the design principles of the basic building blocks, such as the device and associated circuits, that constitute the crossbars. We explore non-idealities arising from the device characteristics and circuit behavior and study their impact on MVM functionality of NVM crossbars for machine learning hardware.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131143119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sanket Shukla, Gaurav Kolhe, H. Homayoun, S. Rafatirad, Sai Manoj Pudukotai Dinakarrao
Federated Learning (FL) is a decentralized machine learning approach in which the training data is distributed across Internet-of-Things (IoT) devices and a shared global model is learned by aggregating local updates. However, the training data can be poisoned and manipulated by malicious adversaries, contaminating locally computed updates. To prevent this, detecting malicious IoT devices is very important. Since the local updates are large because of the high volume of data, minimizing the communication overhead is also necessary. This paper proposes the "RAFeL" framework, comprising two techniques to tackle the above issues: (1) a robust defense technique and (2) a "Performance-aware bit-wise encoding" technique. "Robust and Active Protection with Intelligent Defense (RAPID)" is a defense system that detects malicious IoT devices and restricts the participation of the contaminated local updates computed by these malicious devices. To minimize communication cost, "Performance-aware bit-wise encoding" selects the appropriate encoding scheme for individual split bits based on their significance and effect on FL performance. The results illustrate that the proposed framework shows a 1.2-1.8x higher compression rate than lossy and lossless encoding techniques and has an average accuracy drop of 3% to 10% even with a fraction of malicious devices.
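The aggregation-with-screening flow described above can be sketched in a few lines. The abstract does not say how RAPID detects malicious devices, so the distance-to-median filter below is a generic stand-in chosen for illustration, not the paper's method. The function names and the `tolerance` parameter are hypothetical.

```python
from statistics import median

def filter_updates(updates, tolerance=3.0):
    # Stand-in detector: drop updates that lie far from the
    # coordinate-wise median of all submitted updates.
    dims = range(len(updates[0]))
    center = [median(u[d] for u in updates) for d in dims]
    dists = [sum((u[d] - center[d]) ** 2 for d in dims) ** 0.5
             for u in updates]
    cutoff = tolerance * median(dists)
    return [u for u, dist in zip(updates, dists) if dist <= cutoff]

def federated_average(updates):
    # Plain unweighted averaging of the surviving local updates.
    n = len(updates)
    return [sum(u[d] for u in updates) / n for d in range(len(updates[0]))]

local_updates = [
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05],
    [10.0, -10.0],   # poisoned update from a malicious device
]
kept = filter_updates(local_updates)
global_update = federated_average(kept)
print(len(kept), global_update)
```

Without the filter, the single poisoned update would pull the average far from the honest devices' consensus. With it, only the four consistent updates contribute to the global model.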
{"title":"RAFeL - Robust and Data-Aware Federated Learning-inspired Malware Detection in Internet-of-Things (IoT) Networks","authors":"Sanket Shukla, Gaurav Kolhe, H. Homayoun, S. Rafatirad, Sai Manoj Pudukotai Dinakarrao","doi":"10.1145/3526241.3530378","DOIUrl":"https://doi.org/10.1145/3526241.3530378","url":null,"abstract":"Federated Learning (FL) is a decentralized machine learning in which the training data is distributed on the Internet-of-Things (IoT) devices and learns a shared global model by aggregating local updates. However, the training data can be poisoned and manipulated by malicious adversaries, contaminating locally computed updates. To prevent this, detecting malicious IoT devices is very important. Since the local updates are large because of the high volume of data, minimizing the communication overhead is also necessary. This paper proposes a \"RAFeL\" framework, comprising of two techniques to tackle the above issues, (1) a robust defense technique and (2) a \"Performance-aware bit-wise encoding\" technique. \"Robust and Active Protection with Intelligent Defense (RAPID)\" is a defense system that detects malicious IoT devices and restricts the participation of the contaminated local updates computed by these malicious devices. To minimize communication cost, \"Performance-aware bit-wise encoding\" selects the appropriate encoding scheme for individual split bits based on their significance and effect on FL performance. 
The results illustrate that the proposed framework shows a 1.2-1.8x higher compression rate than lossy and lossless encoding techniques and has an average accuracy drop of 3% to 10% even with a fraction of malicious devices.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115527249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}