用于自动恶意软件分类的分层学习

Shayok Chakraborty, J. W. Stokes, Lin Xiao, Dengyong Zhou, M. Marinescu, Anil Thomas
{"title":"用于自动恶意软件分类的分层学习","authors":"Shayok Chakraborty, J. W. Stokes, Lin Xiao, Dengyong Zhou, M. Marinescu, Anil Thomas","doi":"10.1109/MILCOM.2017.8170758","DOIUrl":null,"url":null,"abstract":"Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting of a type (e.g. Worm, Backdoor), a platform (e.g. Win32, Win64), a family (e.g. Rbot, Rugrat) and a family variant (e.g. A, B). While there has been substantial research in automated malware classification, the aforementioned hierarchical structure, which can provide additional information to the classification models, has been ignored. In this paper, we propose the novel idea and study the performance of employing hierarchical learning algorithms for automated classification of malicious files. To the best of our knowledge, this is the first research effort which incorporates the hierarchical structure of the malware label in its automated classification and in the security domain, in general. It is important to note that our method does not require any additional effort by analysts because they typically assign these hierarchical labels today. Our empirical results on a real world, industrial-scale malware dataset of 3.6 million files demonstrate that incorporation of the label hierarchy achieves a significant reduction of 33.1% in the binary error rate as compared to a non-hierarchical classifier which is traditionally used in such problems.","PeriodicalId":113767,"journal":{"name":"MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Hierarchical learning for automated malware classification\",\"authors\":\"Shayok Chakraborty, J. W. Stokes, Lin Xiao, Dengyong Zhou, M. Marinescu, Anil Thomas\",\"doi\":\"10.1109/MILCOM.2017.8170758\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting of a type (e.g. Worm, Backdoor), a platform (e.g. Win32, Win64), a family (e.g. Rbot, Rugrat) and a family variant (e.g. A, B). While there has been substantial research in automated malware classification, the aforementioned hierarchical structure, which can provide additional information to the classification models, has been ignored. In this paper, we propose the novel idea and study the performance of employing hierarchical learning algorithms for automated classification of malicious files. To the best of our knowledge, this is the first research effort which incorporates the hierarchical structure of the malware label in its automated classification and in the security domain, in general. It is important to note that our method does not require any additional effort by analysts because they typically assign these hierarchical labels today. Our empirical results on a real world, industrial-scale malware dataset of 3.6 million files demonstrate that incorporation of the label hierarchy achieves a significant reduction of 33.1% in the binary error rate as compared to a non-hierarchical classifier which is traditionally used in such problems.\",\"PeriodicalId\":113767,\"journal\":{\"name\":\"MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM)\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MILCOM.2017.8170758\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"MILCOM 2017 - 2017 IEEE Military Communications Conference (MILCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MILCOM.2017.8170758","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

尽管商业杀毒产品被广泛使用,但在家庭和公司电脑上检测到的恶意文件数量仍在以显著的速度增长。最近,反病毒公司开始投资机器学习解决方案,以增强分析师手动设计的签名。恶意文件的确定通常表示为由类型(例如Worm,后门),平台(例如Win32, Win64),家族(例如Rbot, Rugrat)和家族变体(例如A, B)组成的层次结构。尽管在恶意软件自动分类方面已经有大量研究,但上述可以为分类模型提供额外信息的层次结构一直被忽略。在本文中,我们提出了新的思想,并研究了采用层次学习算法对恶意文件进行自动分类的性能。据我们所知,这是第一次将恶意软件标签的层次结构纳入其自动分类和安全领域的研究工作。重要的是要注意,我们的方法不需要分析师的任何额外工作,因为他们今天通常分配这些分层标签。我们在现实世界中360万个文件的工业规模恶意软件数据集上的经验结果表明,与传统上用于此类问题的非分层分类器相比,标签层次结构的结合在二进制错误率上显著降低了33.1%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Hierarchical learning for automated malware classification
Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting of a type (e.g. Worm, Backdoor), a platform (e.g. Win32, Win64), a family (e.g. Rbot, Rugrat) and a family variant (e.g. A, B). While there has been substantial research in automated malware classification, the aforementioned hierarchical structure, which can provide additional information to the classification models, has been ignored. In this paper, we propose the novel idea and study the performance of employing hierarchical learning algorithms for automated classification of malicious files. To the best of our knowledge, this is the first research effort which incorporates the hierarchical structure of the malware label in its automated classification and in the security domain, in general. It is important to note that our method does not require any additional effort by analysts because they typically assign these hierarchical labels today. Our empirical results on a real world, industrial-scale malware dataset of 3.6 million files demonstrate that incorporation of the label hierarchy achieves a significant reduction of 33.1% in the binary error rate as compared to a non-hierarchical classifier which is traditionally used in such problems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Improved target-tracking process in PCL Evasion and causative attacks with adversarial deep learning Performance of selection DF scheme for a relay system with non-identical Rician fading Single-channel blind separation of co-frequency PSK signals with unknown carrier frequency offsets Design of a software defined radio-based tactical DSA network
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1