
Proceedings of the 2013 ACM workshop on Artificial intelligence and security: latest publications

Session details: Security in societal computing
P. Laskov
DOI: 10.1145/3249989
Citations: 0
Early security classification of Skype users via machine learning
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517322
A. Leontjeva, M. Goldszmidt, Yinglian Xie, Fang Yu, M. Abadi
We investigate possible improvements in online fraud detection based on information about users and their interactions. We develop, apply, and evaluate our methods in the context of Skype. Specifically, in Skype, we aim to provide tools that identify fraudsters that have eluded the first line of detection systems and have been active for months. Our approach to automation is based on machine learning methods. We rely on a variety of features present in the data, including static user profiles (e.g., age), dynamic product usage (e.g., time series of calls), local social behavior (addition/deletion of friends), and global social features (e.g., PageRank). We introduce new techniques for pre-processing the dynamic (time series) features and fusing them with social features. We provide a thorough analysis of the usefulness of the different categories of features and of the effectiveness of our new techniques.
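As an illustration of the setting this abstract describes (a toy sketch only, not the authors' pipeline), a per-user call time series can be summarized into a few statistics and concatenated with static and social features into one vector for a downstream classifier. All names and example values here are hypothetical:

```python
# Toy sketch: fuse static, time-series, and social features into one vector.
def summarize_series(calls_per_day):
    """Reduce a daily-call time series to simple summary statistics."""
    n = len(calls_per_day)
    mean = sum(calls_per_day) / n
    var = sum((x - mean) ** 2 for x in calls_per_day) / n
    zeros = sum(1 for x in calls_per_day if x == 0)  # inactive days
    return [mean, var, max(calls_per_day), zeros]

def fuse(static, series, social):
    """Concatenate static profile, series summary, and social features."""
    return static + summarize_series(series) + social

vec = fuse(static=[34],                # e.g., account age in months
           series=[0, 0, 12, 40, 37],  # calls per day over five days
           social=[0.002])             # e.g., a PageRank score
print(len(vec))  # → 6
```

Real systems would use richer time-series representations; this only shows the shape of the fusion step.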
Citations: 18
Session details: Keynote address
B. Nelson
DOI: 10.1145/3249988
Citations: 0
Is data clustering in adversarial settings secure?
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517321
B. Biggio, I. Pillai, S. R. Bulò, Davide Ariu, M. Pelillo, F. Roli
Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they have not been originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings remains thus questionable. In this work we propose a general framework that allows one to identify potential attacks against clustering algorithms, and to evaluate their impact, by making specific assumptions on the adversary's goal, knowledge of the attacked system, and capabilities of manipulating the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated to be hidden within some existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits.
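The poisoning idea in this abstract can be made concrete with a toy example (not the paper's framework): cutting a single-linkage dendrogram at height t is equivalent to taking connected components of the "closer than t" graph, so a few bridge samples placed across a gap merge two well-separated clusters:

```python
# Toy poisoning of single-linkage clustering on 1-D points.
def single_linkage(points, t):
    """Number of clusters when the single-linkage dendrogram is cut at t
    (union-find over all pairs closer than t)."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(points[i] - points[j]) < t:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

clean = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
print(single_linkage(clean, 1.0))            # → 2
bridge = [0.9, 1.8, 2.7, 3.6, 4.5]           # a handful of attack samples
print(single_linkage(clean + bridge, 1.0))   # → 1 (clusters merged)
```

Five attack points out of eleven suffice here; the paper studies how few such samples an adversary needs under realistic knowledge assumptions.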
Citations: 119
What you want is not what you get: predicting sharing policies for text-based content on Facebook
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517317
Arunesh Sinha, Yan Li, Lujo Bauer
As the amount of content users publish on social networking sites rises, so do the danger and costs of inadvertently sharing content with an unintended audience. Studies repeatedly show that users frequently misconfigure their policies or misunderstand the privacy features offered by social networks. A way to mitigate these problems is to develop automated tools to assist users in correctly setting their policy. This paper explores the viability of one such approach: we examine the extent to which machine learning can be used to deduce users' sharing preferences for content posted on Facebook. To generate data on which to evaluate our approach, we conduct an online survey of Facebook users, gathering their Facebook posts and associated policies, as well as their intended privacy policy for a subset of the posts. We use this data to test the efficacy of several algorithms at predicting policies, and the effects on prediction accuracy of varying the features on which they base their predictions. We find that Facebook's default behavior of assigning to a new post the privacy settings of the preceding one correctly assigns policies for only 67% of posts. The best of the prediction algorithms we tested outperforms this baseline for 80% of participants, with an average accuracy of 81%; this equates to a 45% reduction in the number of posts with misconfigured policies. Further, for those participants (66%) whose implemented policy usually matched their intended policy, our approach predicts the correct privacy settings for 94% of posts.
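The 67%-accurate baseline the abstract mentions, Facebook's default of giving a new post the previous post's privacy setting, is easy to sketch (toy data, hypothetical audience labels):

```python
# Sketch of the "previous post's setting" baseline, scored on a toy sequence.
def previous_post_baseline(settings):
    """Predict each post's setting as its predecessor's; return accuracy."""
    hits = sum(1 for prev, cur in zip(settings, settings[1:]) if prev == cur)
    return hits / max(len(settings) - 1, 1)

# Hypothetical per-post audiences:
posts = ["friends", "friends", "public", "friends", "friends", "public"]
print(previous_post_baseline(posts))  # → 0.4
```

The paper's learned models beat this baseline by using the post text and other features rather than only the previous setting.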
Citations: 17
Using naive Bayes to detect spammy names in social networks
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517314
D. Freeman
Many social networks are predicated on the assumption that a member's online information reflects his or her real identity. In such networks, members who fill their name fields with fictitious identities, company names, phone numbers, or just gibberish are violating the terms of service, polluting search results, and degrading the value of the site to real members. Finding and removing these accounts on the basis of their spammy names can both improve the site experience for real members and prevent further abusive activity. In this paper we describe a set of features that can be used by a Naive Bayes classifier to find accounts whose names do not represent real people. The model can detect both automated and human abusers and can be used at registration time, before other signals such as social graph or clickstream history are present. We use member data from LinkedIn to train and validate our model and to choose parameters. Our best-scoring model achieves AUC 0.85 on a sequestered test set. We ran the algorithm on live LinkedIn data for one month in parallel with our previous name scoring algorithm based on regular expressions. The false positive rate of our new algorithm (3.3%) was less than half that of the previous algorithm (7.0%). When the algorithm is run on email usernames as well as user-entered first and last names, it provides an effective way to catch not only bad human actors but also bots that have poor name and email generation algorithms.
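A minimal sketch of the technique named in the title, a naive Bayes classifier over character features of a name, is shown below. The character-bigram features, the toy training names, and the labels are illustrative assumptions, not the paper's feature set:

```python
from collections import Counter
import math

def bigrams(name):
    """Character bigrams with start/end markers, e.g. 'anna' -> ^a an nn na a$."""
    s = "^" + name.lower() + "$"
    return [s[i:i + 2] for i in range(len(s) - 1)]

class NaiveBayesNames:
    """Multinomial naive Bayes over character bigrams, Laplace smoothing."""
    def fit(self, names, labels):
        self.classes = sorted(set(labels))
        self.priors, self.counts, self.totals = {}, {}, {}
        vocab = set()
        for c in self.classes:
            grams = [g for n, l in zip(names, labels) if l == c
                       for g in bigrams(n)]
            self.counts[c] = Counter(grams)
            self.totals[c] = len(grams)
            self.priors[c] = labels.count(c) / len(labels)
            vocab.update(grams)
        self.v = len(vocab)  # smoothing vocabulary size
        return self

    def predict(self, name):
        def logp(c):
            lp = math.log(self.priors[c])
            for g in bigrams(name):
                lp += math.log((self.counts[c][g] + 1)
                               / (self.totals[c] + self.v))
            return lp
        return max(self.classes, key=logp)

# Tiny hypothetical training set:
spam = ["asdfgh", "qwerty123", "zzzzzz", "xkcdwp"]
real = ["anna", "maria", "john", "peter"]
clf = NaiveBayesNames().fit(spam + real, ["spam"] * 4 + ["real"] * 4)
print(clf.predict("zzzzzz"))  # → spam
```

The production model in the paper adds many more features and runs at registration time; this only illustrates the core classifier.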
Citations: 53
On the hardness of evading combinations of linear classifiers
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517318
David Stevens, Daniel Lowd
An increasing number of machine learning applications involve detecting the malicious behavior of an attacker who wishes to avoid detection. In such domains, attackers modify their behavior to evade the classifier while accomplishing their goals as efficiently as possible. The attackers typically do not know the exact classifier parameters, but they may be able to evade it by observing the classifier's behavior on test instances that they construct. For example, spammers may learn the most effective ways to modify their spams by sending test emails to accounts they control. This problem setting has been formally analyzed for linear classifiers with discrete features and convex-inducing classifiers with continuous features, but never for non-linear classifiers with discrete features. In this paper, we extend previous ACRE learning results to convex polytopes representing unions or intersections of linear classifiers. We prove that exponentially many queries are required in the worst case, but that when the features used by the component classifiers are disjoint, previous attacks on linear classifiers can be adapted to efficiently attack them. In experiments, we further analyze the cost and number of queries required to attack different types of classifiers. These results move us closer to a comprehensive understanding of the relative vulnerability of different types of classifiers to malicious adversaries.
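The query-based setting in this abstract can be illustrated with a heavily simplified toy (not the ACRE algorithms analyzed in the paper): the attacker sees only yes/no answers from the classifier and greedily drops tokens whose removal flips the answer. The token weights and spam example are invented for illustration:

```python
# Toy membership-query evasion of a single black-box linear classifier.
def make_classifier(weights, threshold):
    def classify(features):
        """Black box to the attacker: True means 'blocked as spam'."""
        return sum(weights.get(f, 0) for f in features) >= threshold
    return classify

classify = make_classifier({"viagra": 3, "free": 2, "meeting": -1}, 3)

def greedy_evade(features, classify):
    """Drop any token whose removal makes the oracle answer negative."""
    features = set(features)
    for f in sorted(features):          # each trial costs one oracle query
        if not classify(features):      # already evades; stop querying
            break
        if not classify(features - {f}):
            features -= {f}
    return features

evaded = greedy_evade({"viagra", "free", "meeting"}, classify)
print(classify(evaded))  # → False (message now slips past the filter)
```

Against unions or intersections of such classifiers, the paper proves this kind of probing needs exponentially many queries in the worst case, unless the component classifiers use disjoint features.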
Citations: 31
Session details: Adversarial learning
Christos Dimitrakakis
DOI: 10.1145/3249991
Citations: 0
Structural detection of Android malware using embedded call graphs
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517315
Hugo Gascon, Fabian Yamaguchi, Dan Arp, Konrad Rieck
The number of malicious applications targeting the Android system has literally exploded in recent years. While the security community, well aware of this fact, has proposed several methods for detection of Android malware, most of these are based on permission and API usage or the identification of expert features. Unfortunately, many of these approaches are susceptible to instruction level obfuscation techniques. Previous research on classic desktop malware has shown that some high level characteristics of the code, such as function call graphs, can be used to find similarities between samples while being more robust against certain obfuscation strategies. However, the identification of similarities in graphs is a non-trivial problem whose complexity hinders the use of these features for malware detection. In this paper, we explore how recent developments in machine learning classification of graphs can be efficiently applied to this problem. We propose a method for malware detection based on efficient embeddings of function call graphs with an explicit feature map inspired by a linear-time graph kernel. In an evaluation with 12,158 malware samples our method, purely based on structural features, outperforms several related approaches and detects 89% of the malware with few false alarms, while also allowing to pin-point malicious code structures within Android applications.
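The idea of an explicit feature map for call graphs can be sketched with a much-simplified stand-in for the paper's kernel-inspired embedding: hash each node's label together with its sorted neighbor labels into a bucket, and compare the resulting histograms. The adjacency dict, labels, and similarity measure are all illustrative assumptions:

```python
from collections import Counter

def embed(graph, labels):
    """Histogram of (node label, sorted callee labels) signatures.
    graph: adjacency dict {function: [callees]};
    labels: {function: coarse instruction-class label}."""
    sigs = Counter()
    for node, callees in graph.items():
        sig = (labels[node], tuple(sorted(labels[c] for c in callees)))
        sigs[sig] += 1
    return sigs

def similarity(g1, g2):
    """Histogram-intersection similarity between two embedded graphs."""
    return sum(min(g1[s], g2[s]) for s in g1.keys() & g2.keys())

# Hypothetical mini call graph of an app:
app = {"main": ["net", "crypto"], "net": ["crypto"], "crypto": []}
labels = {"main": "entry", "net": "io", "crypto": "math"}
e = embed(app, labels)
print(similarity(e, e))  # → 3 (one bucket per function, all matching)
```

Because the map is explicit, a linear classifier can be trained directly on the histograms instead of computing pairwise graph kernels, which is what makes the approach scale to thousands of samples.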
Citations: 323
Off the beaten path: machine learning for offensive security
Pub Date: 2013-11-04, DOI: 10.1145/2517312.2517313
Konrad Rieck
Machine learning has been widely used for defensive security. Numerous approaches have been devised that make use of learning techniques for detecting attacks and malicious software. By contrast, very little research has studied how machine learning can be applied for offensive security. In this talk, we will explore this research direction and show how learning methods can be used for discovering vulnerabilities in software, finding information leaks in protected data, or supporting network reconnaissance. We discuss advantages and challenges of learning for offensive security as well as identify directions for future research.
Citations: 2