Efficient Malware Analysis Using Metric Embeddings

Ethan M. Rudd, David B. Krisiloff, Scott E. Coull, Daniel Olszewski, Edward Raff, James Holt
Journal: Digital Threats: Research and Practice
Publication date: 2022-12-05
Publication type: Journal Article
DOI: 10.1145/3615669 (https://doi.org/10.1145/3615669)

Abstract

Real-world malware analysis consists of a complex pipeline of classifiers and data analysis, from detection to classification of capabilities to retrieval of unique training samples from user systems. In this paper, we aim to reduce the complexity of these pipelines through the use of low-dimensional metric embeddings of Windows PE files, which can be used in a variety of downstream applications, including malware detection, family classification, and malware attribute tagging. Specifically, we enrich the labeling of malicious and benign PE files with computationally expensive, disassembly-based information about malicious capabilities. Using this enhanced labeling, we derive several different types of efficient metric embeddings using an embedding neural network trained via contrastive loss, Spearman rank correlation, and combinations thereof. Our evaluation examines performance on a variety of transfer tasks performed on the EMBER and SOREL datasets, demonstrating that low-dimensional, computationally efficient metric embeddings maintain performance with little decay. This offers the potential to quickly retrain for a variety of transfer tasks at significantly reduced overhead and complexity. We conclude with an examination of practical considerations for the use of our proposed embedding approach, such as robustness to adversarial evasion and the introduction of task-specific auxiliary objectives to improve performance on mission-critical tasks.
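
The abstract names two training signals, contrastive loss and Spearman rank correlation, without giving their form. The sketch below is not the paper's implementation. It assumes the classic pairwise contrastive loss with a margin and an ordinal-rank Spearman correlation, both written in plain NumPy for evaluation rather than training (a trainable version would need a differentiable ranking surrogate). The function names, the margin value, and the idea of comparing embedding distances against a label-similarity score are all illustrative assumptions.

```python
import numpy as np


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Pairwise contrastive loss (assumed form, not taken from the paper).

    emb_a, emb_b: arrays of shape (n_pairs, dim) holding paired embeddings.
    same: length-n_pairs array, 1 if the pair shares a label, else 0.
    Similar pairs are pulled together; dissimilar pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same, dtype=float)
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    pull = same * dist ** 2
    push = (1.0 - same) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))


def ordinal_ranks(values):
    """Rank values from 0 to n-1; ties get consecutive ranks (a simplification)."""
    values = np.asarray(values, dtype=float)
    ranks = np.empty(len(values), dtype=float)
    ranks[np.argsort(values, kind="stable")] = np.arange(len(values))
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ordinal_ranks(x), ordinal_ranks(y))[0, 1])
```

Under these assumptions, one way to check an embedding is to compute `spearman` between negated pairwise embedding distances and a hypothetical label-similarity score for the same pairs: a value near 1 means the embedding orders pairs the same way the labels do.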