Defining malware families based on analyst insights

Jeffrey Gennari, David French
Published in: 2011 IEEE International Conference on Technologies for Homeland Security (HST), 2011-12-19
DOI: 10.1109/THS.2011.6107902 (https://doi.org/10.1109/THS.2011.6107902)
Citations: 7

Abstract

Determining whether arbitrary files are related to known malicious files is often useful in network and host-based defense. Doing so can give network defenders sufficient exemplars of a particular threat to develop comprehensive signatures and heuristics for identifying the threat, leading to decreased response time and improved prevention of a cyber attack. Identifying these malicious families is a complex process involving the categorization of potentially malicious code into sets that share similar features, while being distinguishable from unrelated threats or non-malicious code. Current methods for automatically or manually describing malware families are typically unable to distinguish between indicators derived from the structure of the malware and indicators derived from the behavior of the malware. Further, attempts to cluster potentially related files by mapping them into alternate domains, including histograms, fuzzy hashes, Bloom filters, and so on, often produce clusters of files derived solely from structural information. These similarity measurements are often very effective on crudely similar files, yet they fail to identify files that have similar or identical behavior and semantics. We propose an analytic method, driven largely by human experience and based on objective criteria, for assigning arbitrary files membership in a malicious code family. We describe a process for iteratively refining the criteria used to select a malicious code family until the criteria are both necessary and sufficient to distinguish that particular family. We contrast this process with similar processes, such as antivirus signature generation and automatic and blind classification methods. We formalize this process to describe a roadmap for practitioners of malicious code analysis and to highlight opportunities for improvement and automation of both the process and the observation of relevant criteria. Finally, we provide experimental results of applying this methodology to real-world malware.
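The abstract does not give the authors' formalization, so the following is only a minimal sketch of what checking criteria for necessity and sufficiency could look like. It assumes each sample has already been reduced to a small feature record. The sample names, feature names (`imports`, `mutex`), criteria, and the `matches` and `evaluate` helpers are all hypothetical illustrations, not taken from the paper.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical representation: one feature record per analyzed file.
Sample = Dict[str, Any]
Criterion = Callable[[Sample], bool]


def matches(sample: Sample, criteria: Dict[str, Criterion]) -> bool:
    """A sample is assigned to the family only if every criterion holds."""
    return all(test(sample) for test in criteria.values())


def evaluate(
    criteria: Dict[str, Criterion],
    members: List[Sample],
    non_members: List[Sample],
) -> Tuple[List[str], List[str]]:
    """Return (missed members, false hits) for a candidate criteria set.

    An empty first list suggests the criteria are necessary for the known
    members; an empty second list suggests they are sufficient to exclude
    the known non-members. Both only hold relative to the samples given.
    """
    missed = [s["name"] for s in members if not matches(s, criteria)]
    false_hits = [s["name"] for s in non_members if matches(s, criteria)]
    return missed, false_hits


# Made-up exemplars an analyst has already labeled.
members = [
    {"name": "sample_a", "imports": {"InternetOpenA", "CreateMutexA"},
     "mutex": "demo_family_mutex"},
    {"name": "sample_b",
     "imports": {"InternetOpenA", "CreateMutexA", "RegSetValueExA"},
     "mutex": "demo_family_mutex"},
]
non_members = [
    {"name": "benign_updater", "imports": {"InternetOpenA"}, "mutex": None},
    {"name": "unrelated_bot", "imports": {"WSAStartup", "CreateMutexA"},
     "mutex": "other_mutex"},
]

# First attempt: a single structural indicator.
criteria_v1: Dict[str, Criterion] = {
    "uses_wininet": lambda s: "InternetOpenA" in s["imports"],
}

# Refinement: the analyst adds a second indicator after reviewing false hits.
criteria_v2: Dict[str, Criterion] = dict(
    criteria_v1,
    creates_family_mutex=lambda s: s["mutex"] == "demo_family_mutex",
)

if __name__ == "__main__":
    for label, criteria in (("v1", criteria_v1), ("v2", criteria_v2)):
        missed, false_hits = evaluate(criteria, members, non_members)
        print(label, "missed:", missed, "false hits:", false_hits)
```

In this toy data the first criteria set covers both known members but also matches `benign_updater`, so it is not sufficient. Adding the second criterion removes that false hit without losing a member. In practice the analyst would repeat this loop as new samples arrive.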