Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security (DOI: 10.1145/3477997)

With the growing popularity of Field-Programmable Gate Arrays (FPGAs) in cloud environments, new paradigms such as FPGA-as-a-Service (FaaS) are emerging. These challenge conventional FPGA security models, which assume trust between the user and the hardware owner. In a FaaS scenario, the user may want to keep data or the FPGA configuration bitstream confidential in order to protect privacy or intellectual property. However, securing FaaS use cases is hard due to the difficulty of protecting encryption keys and other secrets from the hardware owner. In this paper we demonstrate that even advanced key provisioning and remote attestation methods based on Physical Unclonable Functions (PUFs) can be broken by profiling side-channel attacks employing deep learning. Using power traces from two profiling FPGA boards implementing an arbiter PUF, we train a Convolutional Neural Network (CNN) model to learn features corresponding to the PUF's “0” and “1” responses. Then, we use the resulting model to classify responses of PUFs implemented on FPGA boards under attack (different from the profiling boards). We show that the presented attack can overcome countermeasures based on encrypting the challenges and responses of a PUF.
{"title":"Why Deep Learning Makes it Difficult to Keep Secrets in FPGAs","authors":"Yang Yu, M. Moraitis, E. Dubrova","doi":"10.1145/3477997.3478001","DOIUrl":"https://doi.org/10.1145/3477997.3478001","url":null,"abstract":"With the growth of popularity of Field-Programmable Gate Arrays (FPGAs) in cloud environments, new paradigms such as FPGA-as-a-Service (FaaS) emerge. This challenges the conventional FPGA security models which assume trust between the user and the hardware owner. In an FaaS scenario, the user may want to keep data or FPGA configuration bitstream confidential in order to protect privacy or intellectual property. However, securing FaaS use cases is hard due to the difficulty of protecting encryption keys and other secrets from the hardware owner. In this paper we demonstrate that even advanced key provisioning and remote attestation methods based on Physical Unclonable Functions (PUFs) can be broken by profiling side-channel attacks employing deep learning. Using power traces from two profiling FPGA boards implementing an arbiter PUF, we train a Convolutional Neural Network (CNN) model to learn features corresponding to “0” and “1” PUF’s responses. Then, we use the resulting model to classify responses of PUFs implemented in FPGA boards under attack (different from the profiling boards). We show that the presented attack can overcome countermeasures based on encrypting challenges and responses of a PUF.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128710354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents an automated adversarial mechanism called WikipediaBot, which allows an adversary to create and control a bot infrastructure for adversarial edits of Wikipedia articles. WikipediaBot is a self-contained mechanism with modules for generating Wikipedia editor credentials, bypassing login protections, and producing contextually relevant adversarial edits for target Wikipedia articles that evade conventional detection. The adversarial edits are generated using an adversarial Markov chain that incorporates a linguistic manipulation attack known as malware-induced misperceptions (MIM). We conducted a preliminary qualitative analysis with a small focus group to test the effect of the adversarial edits on a human reader's perception of a target Wikipedia article. Because nefarious use of WikipediaBot could seriously damage the integrity of a wide range of Wikipedia articles, we discuss in detail the implications, detection methods, and defenses Wikipedia could employ to address the threat of automated adversarial manipulation.
{"title":"WikipediaBot: Machine Learning Assisted Adversarial Manipulation of Wikipedia Articles","authors":"Filipo Sharevski, Peter Jachim, Emma Pieroni","doi":"10.1145/3477997.3478008","DOIUrl":"https://doi.org/10.1145/3477997.3478008","url":null,"abstract":"This paper presents an automated adversarial mechanism called WikipediaBot. WikipediaBot allows an adversary to create and control a bot infrastructure for the purpose of adversarial edits of Wikipedia articles. The WikipediaBot is a self-contained mechanism with modules for generating credentials for Wikipedia editors, bypassing login protections, and a production of contextually-relevant adversarial edits for target Wikipedia articles that evade conventional detection. The contextually-relevant adversarial edits are generated using an adversarial Markov chain that incorporates a linguistic manipulation attack known as MIM or malware-induced misperceptions. We conducted a preliminary qualitative analysis with a small focus group to test the effect of the adversarial edits in manipulating the perception a human reader has about a target Wikipedia article. Because the nefarious use of WikipediaBot could result in harmful damages to the integrity of a wide range of Wikipedia articles, we provide an elaborate discussion about the implications, detection, and defenses Wikipedia could employ to address the threat of automated adversarial manipulations.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132475397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present the Semantic Processing Pipeline (SPP), a component of our larger Uncertainty Handling Workflow [10]. The SPP is a configurable, customizable plugin framework for computing the network-wide impact of security tools. In addition, it can be used as a labeled-data generation mechanism for machine-learning-based security techniques. The SPP takes cyber range experiment results as input, quantifies the tool impact, and produces a connected graph encoding knowledge derived from the experiment. This graph is then used as input to a quantification mechanism of choice, be it machine learning algorithms or, as in our current implementation, a Multi-Entity Bayesian Network. We quantify the level of uncertainty with respect to five key metrics, which we term Derived Attributes: Speed, Success, Detectability, Attribution, and Collateral Damage. We present results from experiments quantifying the effect of Nmap, a host and service discovery tool, configured in various ways. While we use Nmap as an example use case, we demonstrate that the SPP can easily be applied to other tool types. In addition, we present results regarding the performance and correctness of the SPP. We report runtimes for individual components as well as for the pipeline overall, and show that SPP processing time scales quadratically with input size. However, the overall runtime is low: the SPP can compute a connected graph from a 200-host topology in roughly one minute.
{"title":"The Semantic Processing Pipeline: Quantifying the Network-Wide Impact of Security Tools","authors":"Katarzyna Olejnik, M. Atighetchi, Stephane Blais","doi":"10.1145/3477997.3478005","DOIUrl":"https://doi.org/10.1145/3477997.3478005","url":null,"abstract":"We present the Semantic Processing Pipeline (SPP), a component of the larger process of our Uncertainty Handling Workflow [10]. The SPP is a configurable, customizable plugin framework for computing network-wide impact of security tools. In addition, it can be used as a labeled data generation mechanism for leveraging machine learning based security techniques. The SPP takes cyber range experiment results as input, quantifies the tool impact, and produces a connected graph encoding knowledge derived from the experiment. This is then used as input into a quantification mechanism of our choice, be it machine learning algorithms or a Multi-Entity Bayesian Network, as in our current implementation. We quantify the level of uncertainty with respect to five key metrics, which we have termed Derived Attributes: Speed, Success, Detectability, Attribution, and Collateral Damage. We present results from experiments quantifying the effect of Nmap, a host and service discovery tool, configured in various ways. While we use Nmap as an example use case, we demonstrate that the SPP easily be applied to various tool types. In addition, we present results regarding performance and correctness of the SPP. We present runtimes for individual components as well as overall, and show that the processing time for the SPP scales quadratically with increasing input sizes. However, the overall runtime is low: the SPP can compute a connected graph from a 200-host topology in roughly one minute.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134569804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Domain Name System (DNS) is a critical network protocol that resolves human-readable domain names to IP addresses. Because it is an essential component for the Internet to function, DNS traffic is typically allowed to bypass firewalls and other security services. Additionally, the protocol was not designed for data transfer, so it is not as heavily monitored as other protocols. These reasons make DNS an ideal tool for covert data exfiltration by a malicious actor. A typical company or organization has network traffic containing tens to hundreds of thousands of DNS queries a day. It is impossible for an analyst to sift through such a vast dataset and investigate every domain to ensure its legitimacy, and an attacker can use this to hide traces of malicious activity within a small percentage of the total traffic. Recent research in this field has focused on applying supervised machine learning (ML) or one-class classifier techniques to build a predictive model that determines whether a DNS query is used for exfiltration; however, these models require labelled datasets. In the supervised approach, models require both legitimate and malicious data samples, but such models are difficult to train because realistic network datasets containing known DNS exploits are rarely made public. Instead, prior studies have used synthetic, curated datasets, which has the potential to introduce bias. In addition, some studies have suggested that ML algorithms do not perform as well when there is a significant imbalance between the two classes of data, as is the case for DNS exfiltration datasets. In the one-class classifier approach, the models require a dataset known to be free of exfiltration data. Our model aims to circumvent these issues by identifying cases of DNS exfiltration within a network without requiring a labelled or curated dataset. Our approach eliminates the need for a network analyst to sift through a high volume of DNS queries by automatically detecting traffic indicative of exfiltration.
{"title":"A Statistical Approach to Detecting Low-Throughput Exfiltration through the Domain Name System Protocol","authors":"Emily Joback, Leslie Shing, Kenneth Alperin, Steven R. Gomez, Steven Jorgensen, Gabe Elkin","doi":"10.1145/3477997.3478007","DOIUrl":"https://doi.org/10.1145/3477997.3478007","url":null,"abstract":"The Domain Name System (DNS) is a critical network protocol that resolves human-readable domain names to IP addresses. Because it is an essential component necessary for the Internet to function, DNS traffic is typically allowed to bypass firewalls and other security services. Additionally, this protocol was not designed for the purpose of data transfer, so is not as heavily monitored as other protocols. These reasons make the protocol an ideal tool for covert data exfiltration by a malicious actor. A typical company or organization has network traffic containing tens to hundreds of thousands of DNS queries a day. It is impossible for an analyst to sift through such a vast dataset and investigate every domain to ensure its legitimacy. An attacker can use this as an advantage to hide traces of malicious activity within a small percentage of total traffic. Recent research in this field has focused on applying supervised machine learning (ML) or one-class classifier techniques to build a predictive model to determine if a DNS domain query is used for exfiltration purposes; however, these models require labelled datasets. In the supervised approach, models require both legitimate and malicious data samples, but it is difficult to train these models since realistic network datasets containing known DNS exploits are rarely made public. Instead, prior studies used synthetic curated datasets, but this has the potential to introduce bias. In addition, some studies have suggested that ML algorithms do not perform as well in situations where the ratio between the two classes of data is significant, as is the case for DNS exfiltration datasets. In the one-class classifier approach, these models require a dataset known to be void of exfiltration data. Our model aims to circumvent these issues by identifying cases of DNS exfiltration within a network, without requiring a labelled or curated dataset. Our approach eliminates the need for a network analyst to sift through a high volume of DNS queries, by automatically detecting traffic indicative of exfiltration.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121126839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning classifiers for image recognition are prevalent in many applications. We study the problem of finding adversarial examples for such classifiers, i.e., images manipulated in such a way that they still look like the originals to a human but are misinterpreted by the classifier. Finding adversarial examples corresponds to a search problem in the image space. We focus on black-box attacks that can only use the original classifier to guide the search. The challenge is not to find adversarial examples, but to find them efficiently, ideally in real time. We present two novel methods that increase the efficiency of black-box search algorithms for adversarial examples. The first uses a relevance mask, i.e., a bitmask on the original image that restricts the search to those pixels that appear to be more relevant to the attacked classifier than others. The second exploits the discovery of merge drift, a phenomenon that negatively affects search algorithms based on the merging of image candidates. We evaluate both concepts on existing and new algorithms.
{"title":"Efficient Black-Box Search for Adversarial Examples using Relevance Masks","authors":"F. Freiling, Ramin Tavakoli Kolagari, Katja Auernhammer","doi":"10.1145/3477997.3478013","DOIUrl":"https://doi.org/10.1145/3477997.3478013","url":null,"abstract":"Machine learning classifiers for image recognition are prevalent in many applications. We study the problem of finding adversarial examples for such classifiers, i.e., to manipulate the images in such a way that they still look like the original images to a human but are misinterpreted by the classifier. Finding adversarial examples corresponds to a search problem in the image space. We focus on black-box attacks that can only use the original classifier to guide the search. The challenge is not to find adversarial examples, but rather to find them efficiently, ideally in real time. We show two novel methods that increase the efficiency of black-box search algorithms for adversarial examples: The first uses a relevance mask, i.e., a bitmask on the original image that restricts the search to those pixels that appear to be more relevant to the attacked classifier than others. The second exploits the discovery of merge drift, a phenomenon that negatively affects search algorithms that are based on the merging of image candidates. We evaluate both concepts on existing and new algorithms.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121640439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data poisoning is one of the most relevant security threats against machine learning and data-driven technologies. Since many applications rely on untrusted training data, an attacker can easily craft malicious samples and inject them into the training dataset to degrade the performance of machine learning models. As recent work has shown, such Denial-of-Service (DoS) data poisoning attacks are highly effective. To mitigate this threat, we propose a new approach for detecting DoS-poisoned instances. In contrast to related work, we deviate from clustering- and anomaly-detection-based approaches, which often suffer from the curse of dimensionality and arbitrary anomaly-threshold selection. Instead, our defence extracts information from the training data in such a generalized manner that we can identify poisoned samples based on the information present in the unpoisoned portion of the data. We evaluate our defence against two DoS poisoning attacks on seven datasets and find that it reliably identifies poisoned instances. Compared to related work, our defence improves false positive and false negative rates by at least 50%, often more.
{"title":"Defending Against Adversarial Denial-of-Service Data Poisoning Attacks","authors":"N. Müller, Simon Roschmann, Konstantin Böttinger","doi":"10.1145/3477997.3478017","DOIUrl":"https://doi.org/10.1145/3477997.3478017","url":null,"abstract":"Data poisoning is one of the most relevant security threats against machine learning and data-driven technologies. Since many applications rely on untrusted training data, an attacker can easily craft malicious samples and inject them into the training dataset to degrade the performance of machine learning models. As recent work has shown, such Denial-of-Service (DoS) data poisoning attacks are highly effective. To mitigate this threat, we propose a new approach of detecting DoS poisoned instances. In comparison to related work, we deviate from clustering and anomaly detection based approaches, which often suffer from the curse of dimensionality and arbitrary anomaly threshold selection. Rather, our defence is based on extracting information from the training data in such a generalized manner that we can identify poisoned samples based on the information present in the unpoisoned portion of the data. We evaluate our defence against two DoS poisoning attacks and seven datasets, and find that it reliably identifies poisoned instances. In comparison to related work, our defence improves false positive / false negative rates by at least 50%, often more.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133874062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding the dynamic behavior of computer programs under normal working conditions is an important task with multiple security benefits, such as the development of behavior-based anomaly detection, vulnerability discovery, and patching. Existing works achieve this goal by collecting and analyzing various data, including network traffic, system calls, and instruction traces. In this paper, we explore the use of a new type of data, performance counters, to analyze the dynamic behavior of programs. Using existing primitives, we develop a tool named perfextract that captures data from different performance counters for a program during its startup time, forming multiple time series that represent the program's dynamic behavior. We analyze the collected data and develop a semi-supervised clustering algorithm that allows us to classify each program into a specific group using its performance counter time series and to identify the intrinsic behavior of that group. We carry out extensive experiments with 18 real-world programs belonging to 4 groups: web browsers, text editors, image viewers, and audio players. The experimental results show that the examined programs can be accurately differentiated based on their performance counter data, regardless of whether they are run in physical or virtual environments.
{"title":"Program Behavior Analysis and Clustering using Performance Counters","authors":"S. Kadiyala, Kartheek Akella, Tram Truong-Huu","doi":"10.1145/3477997.3478011","DOIUrl":"https://doi.org/10.1145/3477997.3478011","url":null,"abstract":"Understanding the dynamic behavior of computer programs during normal working conditions is an important task, which has multiple security benefits such as the development of behavior-based anomaly detection, vulnerability discovery, and patching. Existing works achieved this goal by collecting and analyzing various data including network traffic, system calls, instruction traces, etc. In this paper, we explore the use of a new type of data, performance counters, to analyze the dynamic behavior of programs. Using existing primitives, we develop a tool named perfextract to capture data from different performance counters for a program during its startup time, thus forming multiple time series to represent the dynamic behavior of the program. We analyze the collected data and develop a semi-supervised clustering algorithm that allows us to classify each program using its performance counter time series into a specific group and to identify the intrinsic behavior of that group. We carry out extensive experiments with 18 real-world programs that belong to 4 groups including web browsers, text editors, image viewers, and audio players. The experimental results show that the examined programs can be accurately differentiated based on their performance counter data regardless of whether programs are run in physical or virtual environments.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115395631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks (NNs) are vulnerable to adversarial examples: inputs that differ only slightly from their benign counterparts yet provoke misclassifications by the attacked NNs. The perturbations required to craft such examples are often negligible and even imperceptible to humans. To protect deep learning-based systems from these attacks, several countermeasures have been proposed, with adversarial training still considered the most effective. In adversarial training, NNs are iteratively retrained on adversarial examples, a computationally expensive and time-consuming process that often decreases performance. To overcome the downsides of adversarial training while still providing a high level of security, we present a new training approach we call entropic retraining. Based on an analysis inspired by information theory, we investigate the effects of adversarial training and achieve an increase in robustness without laboriously generating adversarial examples. With our prototype implementation we validate and show the effectiveness of our approach for various NN architectures and data sets. We empirically show that entropic retraining leads to a significant increase in NNs' security and robustness while relying only on the given original data.
{"title":"Optimizing Information Loss Towards Robust Neural Networks","authors":"Philip Sperl, Konstantin Böttinger","doi":"10.1145/3477997.3478016","DOIUrl":"https://doi.org/10.1145/3477997.3478016","url":null,"abstract":"Neural Networks (NNs) are vulnerable to adversarial examples. Such inputs differ only slightly from their benign counterparts yet provoke misclassifications of the attacked NNs. The perturbations required to craft the examples are often negligible and even human-imperceptible. To protect deep learning-based systems from such attacks, several countermeasures have been proposed with adversarial training still being considered the most effective. Here, NNs are iteratively retrained using adversarial examples forming a computationally expensive and time consuming process, which often leads to a performance decrease. To overcome the downsides of adversarial training while still providing a high level of security, we present a new training approach we call entropic retraining. Based on an information-theoretic-inspired analysis, we investigate the effects of adversarial training and achieve a robustness increase without laboriously generating adversarial examples. With our prototype implementation we validate and show the effectiveness of our approach for various NN architectures and data sets. We empirically show that entropic retraining leads to a significant increase in NNs’ security and robustness while only relying on the given original data.","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"305 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125197898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","authors":"","doi":"10.1145/3477997","DOIUrl":"https://doi.org/10.1145/3477997","url":null,"abstract":"","PeriodicalId":130265,"journal":{"name":"Proceedings of the 2020 Workshop on DYnamic and Novel Advances in Machine Learning and Intelligent Cyber Security","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129535096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}