Guilt by association: large scale malware detection by mining file-relation graphs

Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. Pub Date: 2014-08-24. DOI: 10.1145/2623330.2623342

Acar Tamersoy, Kevin A. Roundy, Duen Horng Chau


Citations: 165

Abstract

The increasing sophistication of malicious software calls for new defensive techniques that are harder to evade, and are capable of protecting users against novel threats. We present AESOP, a scalable algorithm that identifies malicious executable files by applying Aesop's moral that "a man is known by the company he keeps." We use a large dataset voluntarily contributed by the members of Norton Community Watch, consisting of partial lists of the files that exist on their machines, to identify close relationships between files that often appear together on machines. AESOP leverages locality-sensitive hashing to measure the strength of these inter-file relationships to construct a graph, on which it performs large scale inference by propagating information from the labeled files (as benign or malicious) to the preponderance of unlabeled files. AESOP attained early labeling of 99% of benign files and 79% of malicious files, over a week before they are labeled by the state-of-the-art techniques, with a 0.9961 true positive rate at flagging malware, at 0.0001 false positive rate.
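The pipeline the abstract describes (measure how strongly files co-occur across machines, link strongly related files in a graph, then spread known labels to unlabeled neighbours) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: MinHash stands in for the locality-sensitive hashing step, a weighted neighbour average stands in for the inference step, and the function names, the threshold, and the toy data are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes=64):
    """MinHash signature of a set of strings: for each seed, keep the minimum hash."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def build_graph(file_to_machines, threshold=0.5, num_hashes=64):
    """Connect files whose machine sets look similar.

    Assumption: all pairs are compared for brevity. A scalable version would
    bucket signatures (LSH banding) so that only likely-similar pairs are compared.
    """
    sigs = {f: minhash_signature(m, num_hashes) for f, m in file_to_machines.items()}
    graph = defaultdict(dict)
    for a, b in combinations(sorted(sigs), 2):
        weight = estimated_jaccard(sigs[a], sigs[b])
        if weight >= threshold:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph


def propagate(graph, seeds, files, iterations=20):
    """Spread seed scores (+1 benign, -1 malicious) to unlabeled files.

    Assumption: a simple weighted-average update, used here as a stand-in
    for the paper's inference procedure.
    """
    scores = {f: seeds.get(f, 0.0) for f in files}
    for _ in range(iterations):
        updated = {}
        for f in files:
            if f in seeds:
                updated[f] = seeds[f]  # labeled files keep their label
                continue
            neighbours = graph.get(f, {})
            total = sum(neighbours.values())
            updated[f] = (
                sum(w * scores[n] for n, w in neighbours.items()) / total
                if total else 0.0
            )
        scores = updated
    return scores


# Hypothetical toy data: which machines report each file.
file_to_machines = {
    "good_a": {"m1", "m2", "m3", "m4"},
    "unknown_x": {"m1", "m2", "m3", "m4"},
    "bad_a": {"m5", "m6", "m7", "m8"},
    "unknown_y": {"m5", "m6", "m7", "m8"},
}
graph = build_graph(file_to_machines)
scores = propagate(graph, {"good_a": 1.0, "bad_a": -1.0}, list(file_to_machines))
```

In this toy setup, `unknown_x` shares its machines with a known-benign file and ends with a positive score, while `unknown_y` shares its machines with a known-malicious file and ends with a negative score. That is the "company he keeps" intuition in its simplest form.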




Source journal

Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining

Self-citation rate

0.00%

Articles published