Cryo-Mechanical RAM Content Extraction Against Modern Embedded Systems
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00030
Yuanzhe Wu, Grant Skipper, Ang Cui
Cryogenic mechanical memory extraction provides a means to obtain a device's volatile memory content at run-time. Numerous prior works have demonstrated successful exploitation of the Memory Remanence Effect on modern computers and mobile devices. While this approach is arguably one of the most direct paths to reading a target device's physical RAM content, several significant limitations exist. For example, prior works either targeted removable memory with standardized connectors or relied on a custom kernel/bootloader. We present a generalized and automated system that performs reliable RAM content extraction against modern embedded devices. Our cryo-mechanical apparatus is built from widely available, low-cost hardware and supports target devices using single or multiple DDR1|2|3 memory modules. We discuss several novel techniques and hardware modifications that allow our apparatus to exceed the spatial and temporal precision required to reliably perform memory extraction against modern embedded systems that have memory modules soldered directly onto the PCB and use custom memory controllers that spread bits of each word of memory across multiple physical RAM chips.
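The bit-spreading problem mentioned in the last sentence can be made concrete with a small sketch. The chip-to-bit mapping below (bit i of each 32-bit word stored on chip i % 4) is a purely hypothetical example; real memory controllers use vendor-specific interleavings that must first be reverse engineered.

```python
# Hypothetical illustration: reassembling words whose bits are spread across
# multiple physical RAM chips. The mapping (bit i of each 32-bit word lives on
# chip i % 4) is an assumption for demonstration only, not a real controller's.

def reassemble_words(chip_dumps: list[bytes], num_chips: int = 4) -> list[int]:
    """Recombine per-chip bit streams into 32-bit words."""
    bits_per_word = 32
    bits_per_chip = bits_per_word // num_chips          # 8 bits of each word per chip
    num_words = min(len(d) for d in chip_dumps)         # one byte per word per chip here
    words = []
    for w in range(num_words):
        word = 0
        for chip in range(num_chips):
            chunk = chip_dumps[chip][w]                  # the 8 bits this chip holds for word w
            for j in range(bits_per_chip):
                bit = (chunk >> j) & 1
                word |= bit << (j * num_chips + chip)    # interleave back into position
        words.append(word)
    return words

# Example: four per-chip dumps, one byte from each chip contributes to each word.
dumps = [bytes([0xFF]), bytes([0x00]), bytes([0xFF]), bytes([0x00])]
print(hex(reassemble_words(dumps)[0]))                   # 0x55555555
```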
Benchmarking the Effect of Poisoning Defenses on the Security and Bias of Deep Learning Models
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00010
N. Baracaldo, Farhan Ahmed, Kevin Eykholt, Yi Zhou, Shriti Priya, Taesung Lee, S. Kadhe, Mike Tan, Sridevi Polavaram, Sterling Suggs, Yuyang Gao, David Slater
Machine learning models are susceptible to a class of attacks known as adversarial poisoning, where an adversary can maliciously manipulate training data to hinder model performance or, more concerningly, insert backdoors to exploit at inference time. Many methods have been proposed to defend against adversarial poisoning, either by identifying the poisoned samples to facilitate removal or by developing poison-agnostic training algorithms. Although effective, these proposed approaches can have unintended consequences on the model, such as worsening performance on certain data sub-populations, thus inducing a classification bias. In this work, we evaluate several adversarial poisoning defenses. In addition to traditional security metrics, i.e., robustness to poisoned samples, we also adapt a fairness metric to measure the potential undesirable discrimination against sub-populations resulting from the use of these defenses. Our investigation highlights that many of the evaluated defenses trade decision fairness to achieve higher adversarial poisoning robustness. Given these results, we recommend that our proposed metric become part of standard evaluations of machine learning defenses.
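A minimal sketch of the kind of sub-population measurement the abstract alludes to: compare per-group accuracy of a model trained with and without a defense and report the worst-case gap. The accuracy-gap formulation and the group labels here are illustrative assumptions, not the paper's exact metric.

```python
import numpy as np

def subpopulation_accuracy_gap(y_true, y_pred, groups):
    """Worst-case accuracy gap between data sub-populations.

    Illustrative fairness measure: compute it for a model trained with and
    without a poisoning defense to see whether the defense worsened
    performance for some group. Not the paper's exact metric.
    """
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float(np.mean(y_true[mask] == y_pred[mask]))
    return max(accs.values()) - min(accs.values()), accs

# Toy example: predictions for two sub-populations.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 0, 1, 0, 1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
gap, per_group = subpopulation_accuracy_gap(y_true, y_pred, groups)
print(per_group, gap)   # {'A': 1.0, 'B': 0.25} and a gap of 0.75
```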
Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00007
Zhi Chen, Zhenning Zhang, Zeliang Kan, Limin Yang, Jacopo Cortellazzi, Feargus Pendlebury, Fabio Pierazzi, L. Cavallaro, Gang Wang
Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed across both Android and PE malware detectors, with different feature types and feature engineering methods, and across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate the possibility of handling concept drift without frequent feature updates, and we further discuss open questions for future research.
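A minimal sketch of one way to separate the two effects under a simple bag-of-features representation: train on an old time window, test on a newer one, and toggle whether the feature space is allowed to include features that only appear later. The feature names, vectorizer, and classifier are illustrative assumptions, not the paper's experimental pipeline.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def eval_window(train_x, train_y, test_x, test_y, update_feature_space):
    """Train on an old time window, test on a newer one.

    update_feature_space=False freezes the feature space to training-time
    features (new features in newer samples are silently dropped), so any
    remaining degradation is attributable to data-space drift. True lets the
    vectorizer also index features that only appear later, mimicking a
    detector whose feature space is kept up to date.
    """
    vec = DictVectorizer(sparse=True)
    if update_feature_space:
        vec.fit(train_x + test_x)        # feature space covers the new features too
    else:
        vec.fit(train_x)                 # frozen, training-time feature space
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.transform(train_x), train_y)
    return f1_score(test_y, clf.predict(vec.transform(test_x)))

# Toy usage with dict-of-feature samples (e.g., API-call counts per app).
old_x, old_y = [{"read": 2, "send": 1}, {"read": 1}], [1, 0]
new_x, new_y = [{"read": 2, "send": 1, "newapi": 3}, {"newapi": 1}], [1, 0]
print(eval_window(old_x, old_y, new_x, new_y, update_feature_space=False))
print(eval_window(old_x, old_y, new_x, new_y, update_feature_space=True))
```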
PolyDoc: Surveying PDF Files from the PolySwarm network
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00017
Prashant Anantharaman, R. Lathrop, Rebecca Shapiro, M. Locasto
Complex data formats implicitly demand complex logic to parse and apprehend them. The Portable Document Format (PDF) is among the most demanding formats because it is used as both a data exchange and presentation format, and it has a particularly stringent tradition of supporting interoperability and consistent presentation. These requirements create complexity that presents an opportunity for adversaries to encode a variety of exploits and attacks. To investigate whether there is an association between structural malformations and malice (using PDF files as the example challenge format), we built PolyDoc, a tool that conducts format-aware tracing of files pulled from the PolySwarm network. The PolySwarm network crowdsources threat intelligence by running files through several industry-scale threat-detection engines. The PolySwarm network provides a PolyScore, which indicates whether a file is safe or malicious, as judged by threat-detection engines. We ran PolyDoc in live-hunt mode to gather PDF files submitted to PolySwarm and then traced the execution of these PDF files through popular PDF tools such as Mutool, Poppler, and Caradoc. We collected and analyzed 58,906 files from PolySwarm. Further, we used the PDF Error Ontology to assign error categories based on tracer output and compared them to the PolyScore. Our work demonstrates three core insights. First, PDF files classified as malicious contain syntactic malformations. Second, "uncategorized" error ontology classes were common across our different PDF tools, demonstrating that the PDF Error Ontology may be underspecified for the files that real-world threat engines receive. Finally, attackers leverage specific syntactic malformations in attacks: malformations that current PDF tools can detect.
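A minimal sketch of format-aware tracing in this spirit, assuming mutool and pdftotext are installed: run each tool over a file, capture its diagnostics, and bucket them into coarse categories. The keyword buckets are an illustrative stand-in for the PDF Error Ontology, not the ontology itself.

```python
import subprocess

# Run common PDF processors over a file, capture their diagnostics, and bucket
# them into coarse error categories. The buckets below are illustrative only.
TOOLS = {
    "mutool":    ["mutool", "draw", "-o", "/dev/null"],
    "pdftotext": ["pdftotext"],
}
BUCKETS = {"xref": "cross-reference damage", "object": "object structure",
           "stream": "stream decoding", "font": "font handling"}

def trace_pdf(path: str) -> dict:
    findings = {}
    for name, argv in TOOLS.items():
        args = argv + [path] + (["-"] if name == "pdftotext" else [])
        proc = subprocess.run(args, capture_output=True, text=True, timeout=60)
        stderr = proc.stderr.lower()
        cats = {label for key, label in BUCKETS.items() if key in stderr} or (
            {"uncategorized"} if proc.returncode != 0 or stderr else set())
        findings[name] = sorted(cats)
    return findings

print(trace_pdf("sample.pdf"))   # e.g. {'mutool': ['cross-reference damage'], ...}
```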
On the Brittleness of Robust Features: An Exploratory Analysis of Model Robustness and Illusionary Robust Features
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00009
Alireza Aghabagherloo, Rafa Gálvez, D. Preuveneers, B. Preneel
Neural networks have been shown to be vulnerable to visual data perturbations imperceptible to the human eye. Nowadays, the leading hypothesis about the reason for the existence of these adversarial examples is the presence of non-robust features, which are highly predictive but brittle. Also, it has been shown that there exist two types of non-robust features depending on whether or not they are entangled with robust features; perturbing non-robust features entangled with robust features can form adversarial examples. This paper extends earlier work by showing that models trained exclusively on robust features are still vulnerable to one type of adversarial example. Standard-trained networks can classify more accurately than robustly trained networks in this situation. Our experiments show that this phenomenon is due to the high correlation between most of the robust features and both correct and incorrect labels. In this work, we define features highly correlated with correct and incorrect labels as illusionary robust features. We discuss how perturbing an image attacking robust models affects the feature space. Based on our observations on the feature space, we explain why standard models are more successful in correctly classifying these perturbed images than robustly trained models. Our observations also show that, similar to changing non-robust features, changing some of the robust features is still imperceptible to the human eye.
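One way to make the definition of illusionary robust features concrete is to look for features whose values correlate strongly with more than one class label. The sketch below is an assumption-laden simplification (plain Pearson correlation against one-hot class indicators), not the analysis performed in the paper.

```python
import numpy as np

def flag_illusionary_features(X, y, threshold=0.5):
    """Flag features that correlate strongly with more than one class.

    Illustrative reading of 'features highly correlated with correct and
    incorrect labels': for each feature column of X, compute the absolute
    Pearson correlation against every one-hot class indicator and flag
    features with two or more classes above the threshold. This is a
    simplification meant only to make the definition concrete.
    """
    classes = np.unique(y)
    onehot = np.stack([(y == c).astype(float) for c in classes], axis=1)
    flagged = []
    for j in range(X.shape[1]):
        corrs = [abs(np.corrcoef(X[:, j], onehot[:, k])[0, 1])
                 for k in range(len(classes))]
        if sum(c >= threshold for c in corrs) >= 2:
            flagged.append(j)
    return flagged
```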
ASanity: On Bug Shadowing by Early ASan Exits
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00037
V. Ulitzsch, Deniz Scholz, D. Maier
Bugs in memory-unsafe languages are a major source of critical vulnerabilities. Large-scale fuzzing campaigns, such as Google's OSS-Fuzz, can help find and fix these bugs. To find bugs faster during fuzzing, as well as to cluster and triage them more easily in an automated setup, the targets are compiled with a set of sanitizers enabled, checking certain conditions at runtime. The most common sanitizer, ASan, reports common bug patterns found during a fuzzing campaign, such as out-of-bounds reads and writes or use-after-free bugs, and aborts the program early. The report also contains the type of bug the sanitizer found. During triage, out-of-bounds reads are often considered less critical than other bugs, namely out-of-bounds writes and use-after-free bugs. However, in this paper we show that these more severe vulnerabilities can remain undetected in ASan, shadowed by an earlier faulty read access. To prove this claim empirically, we conduct a large-scale study of 814 out-of-bounds read bugs reported by OSS-Fuzz. By rerunning the same test cases but disabling ASan's early exits, we show that almost five percent of test cases lead to more critical violations later in the execution. Further, we examine the real-world target wasm3 and show how a reported out-of-bounds read covered up an exploitable out-of-bounds write that was silently patched.
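The rerun methodology can be sketched as follows: execute the same crashing input twice, once with ASan's default early abort and once with recovery enabled, and diff the reported bug types. This assumes the target was rebuilt with -fsanitize-recover=address so that ASAN_OPTIONS=halt_on_error=0 actually continues past the first report; the target name in the final comment is hypothetical.

```python
import os, re, subprocess

# Feed the same crashing input to a target twice, once with ASan's default
# early abort and once with recovery enabled, then compare reported bug types.
# Assumes the binary was built with -fsanitize-recover=address.
BUG_RE = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")

def asan_bug_types(target: str, testcase: str, keep_going: bool) -> set[str]:
    env = dict(os.environ)
    if keep_going:
        env["ASAN_OPTIONS"] = "halt_on_error=0"
    proc = subprocess.run([target, testcase], env=env,
                          capture_output=True, text=True, timeout=30)
    return set(BUG_RE.findall(proc.stderr))

def shadowed_bugs(target: str, testcase: str) -> set[str]:
    """Bug types that only surface once the early exit is disabled."""
    return asan_bug_types(target, testcase, True) - asan_bug_types(target, testcase, False)

# e.g. shadowed_bugs("./wasm3_fuzz_target", "crash-1234") could reveal a
# heap-use-after-free hidden behind an earlier heap-buffer-overflow.
```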
Hakuin: Optimizing Blind SQL Injection with Probabilistic Language Models
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00039
Jakub Pruzinec, Quynh Anh Nguyen
SQL Injection (SQLI) is a pervasive web attack where a malicious input is used to dynamically build SQL queries in a way that tricks the database (DB) engine into performing unintended harmful operations. Among many potential exploitations, an attacker may opt to exfiltrate the application data. The exfiltration process is straightforward when the web application responds to injected queries with their results. If the content is not exposed, the adversary can still deduce it using Blind SQLI (BSQLI), an inference technique based on response differences or time delays. Unfortunately, a common drawback of BSQLI is its low inference rate (one bit per request), which severely limits the volume of data that can be extracted this way. To address this limitation, state-of-the-art BSQLI tools optimize the inference of textual data with binary search. However, this approach has two major limitations: it assumes a uniform distribution of characters and does not take into account the history of previously inferred characters. Consequently, the technique is inefficient for the natural languages used ubiquitously in DBs. This paper presents Hakuin, a new framework for optimizing BSQLI with probabilistic language models. Hakuin employs domain-specific pre-trained and adaptive models to predict the next characters based on the inference history and prioritizes characters with a higher probability of being the right ones. It also tracks statistical information to opportunistically guess strings as a whole instead of inferring the characters separately. We benchmark Hakuin against three state-of-the-art BSQLI tools using 20 industry-standard DB schemas and a generic DB. The results show that Hakuin is about 6 times more efficient in inferring schemas, up to 3.2 times more efficient with generic data, and up to 26 times more efficient on columns with limited values compared to the second-best performing tool. To the best of our knowledge, Hakuin is the first solution that combines domain-specific pre-trained and adaptive language models to optimize BSQLI. We release its full source code, datasets, and language models to facilitate further research.
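The core trade-off between binary search and model-guided inference can be illustrated with a short sketch. Here `oracle` stands for a single blind-SQLI request that answers one yes/no question about the hidden string; the toy alphabet and the `next_char_probs` callback are illustrative assumptions, not Hakuin's implementation.

```python
import string

# Contrast the two per-character inference strategies under a shared oracle.
ALPHABET = sorted(string.ascii_lowercase + "_")

def binary_search_char(oracle, pos):
    """Classic BSQLI: ~log2(|alphabet|) comparison requests per character."""
    lo, hi, queries = 0, len(ALPHABET) - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if oracle(f"ASCII(SUBSTR(secret,{pos},1)) > {ord(ALPHABET[mid])}"):
            lo = mid + 1
        else:
            hi = mid
    return ALPHABET[lo], queries

def model_guided_char(oracle, pos, prefix, next_char_probs):
    """Model-guided BSQLI: try candidates in order of predicted probability."""
    ranked = sorted(ALPHABET, key=lambda c: -next_char_probs(prefix, c))
    for queries, cand in enumerate(ranked, start=1):
        if oracle(f"SUBSTR(secret,{pos},1) = '{cand}'"):
            return cand, queries
    return None, len(ranked)
```

When the language model concentrates probability mass on a few likely candidates, the expected number of equality probes per character drops well below the roughly log2(27) ≈ 4.8 comparison probes that binary search always spends, which is the effect the paper exploits.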
Reflections on Trusting Docker: Invisible Malware in Continuous Integration Systems
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00025
Florent Moriconi, Axel Neergaard, Lucas Georget, Samuel Aubertin, Aurélien Francillon
Continuous integration (CI) is a widely adopted methodology for supporting software development. It provides automated generation of artifacts (e.g., binaries, container images), which are then deployed in production. However, to what extent should you trust the generated artifacts, even if the source code is clean of malicious code? Revisiting the famous compiler backdoor from Ken Thompson, we show that a container-based CI system can be compromised without leaving any trace in the source code. Therefore, detecting such malware is challenging or even impossible with common practices such as peer review or static code analysis. We detail multiple ways to perform the initial infection. Then, we show how the malware can persist across CI system updates, allowing long-term compromise. We detail possible malicious attack payloads such as sensitive data extraction or backdooring production software. We show that infected CI systems can be remotely controlled using covert channels to update attack payloads or adapt the malware to mitigation strategies. Finally, we propose a proof-of-concept implementation tested on GitLab CI and applicable to major CI providers.
HoneyKube: Designing and Deploying a Microservices-based Web Honeypot
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00005
Chakshu Gupta, T. V. Ede, Andrea Continella
Over the past few years, we have witnessed a radical change in the architectures and infrastructures of web applications. Traditional monolithic systems are nowadays getting replaced by microservices-based architectures, which have become the natural choice for web application development due to portability, scalability, and ease of deployment. At the same time, due to its popularity, this architecture is now the target of specific cyberattacks. In the past, honeypots have been demonstrated to be valuable tools for collecting real-world attack data and understanding the methods that attackers adopt. However, to the best of our knowledge, there are no existing honeypots based on microservices architectures, which introduce new and different characteristics in the infrastructure. In this paper, we propose HoneyKube, a novel honeypot design that employs the microservices architecture for a web application. To address the challenges introduced by the highly dynamic nature of this architecture, we design an effective and scalable monitoring system that builds on top of the well-known Kubernetes orchestrator. We deploy our honeypot and collect approximately 850 GB of network and system data through our experiments. We also evaluate the fingerprintability of HoneyKube using a state-of-the-art reconnaissance tool. We will release our data and source code to facilitate more research in this field.
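A minimal sketch of orchestrator-level monitoring of the kind described above, using the official Kubernetes Python client to stream cluster events; this is not HoneyKube's actual collector, only an illustration of how Kubernetes exposes the activity a honeypot operator would want to record.

```python
from kubernetes import client, config, watch

# Stream cluster-wide events (pod creations, restarts, probe failures) so that
# activity across honeypot services can be logged centrally. Illustrative only.
def stream_cluster_events():
    config.load_kube_config()            # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    w = watch.Watch()
    for event in w.stream(v1.list_event_for_all_namespaces):
        obj = event["object"]
        print(f"{obj.last_timestamp} {obj.involved_object.kind}/{obj.involved_object.name}: "
              f"{obj.reason} - {obj.message}")

if __name__ == "__main__":
    stream_cluster_events()
```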
On the Pitfalls of Security Evaluation of Robust Federated Learning
Pub Date: 2023-05-01  DOI: 10.1109/SPW59333.2023.00011
Momin Ahmad Khan, Virat Shejwalkar, A. Houmansadr, F. Anwar
Prior literature has demonstrated that federated learning (FL) is vulnerable to poisoning attacks that aim to jeopardize FL performance, and consequently has introduced numerous defenses and demonstrated their robustness in various FL settings. In this work, we closely investigate a largely overlooked aspect in the robust FL literature, i.e., the experimental setup used to evaluate the robustness of FL poisoning defenses. We thoroughly review 50 defense works and highlight several questionable trends in the experimental setup of FL poisoning defense papers; we discuss the potential repercussions of such experimental setups on the key conclusions these works draw about the robustness of the proposed defenses. As a representative case study, we also evaluate a recent poisoning recovery paper from IEEE S&P'23, called FedRecover. Our case study demonstrates the importance of experimental setup decisions (e.g., selecting representative and challenging datasets) to the validity of robustness claims; for instance, while FedRecover performs well for MNIST and FashionMNIST (used in the original paper), in our experiments it performed poorly for FEMNIST and CIFAR10.