Technology of key feature identification in malware API calls sequences

Analysis and data processing systems Pub Date : 2021-09-30 DOI:10.17212/2782-2001-2021-3-37-52

V. Voronin, A. Morozov



Abstract

Today, almost everyone encounters computer security problems in one way or another. Antivirus programs are used to counter the security threats posed by malicious software. Conventional malware detection methods are no longer effective enough, so neural networks and behavioral analysis are increasingly used for this purpose. Analyzing program behavior is difficult, because no single, well-defined sequence of actions reliably identifies a program as malicious. In addition, malware takes measures to resist such detection, for example by inserting meaningless actions as noise to mask its real sequence of operations. Assigning malware to a unique class is also problematic, because programs that belong to different classes can use similar methods. This paper proposes applying NLP methods, such as word embedding, and LDA to the analysis of malware API call sequences in order to reveal semantic dependencies in those sequences and to assess how effective these methods are. The results indicate that key features of malware behavior can be identified, which in the future will significantly improve the technology for detecting and identifying such programs.
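As a rough illustration of treating API calls as "words", the sketch below builds count-based context vectors for each API name and compares them with cosine similarity. It is not the authors' method: simple co-occurrence counts stand in for a trained word-embedding model, LDA is not shown, and the traces, the window size, and the helper names (`cooccurrence_vectors`, `cosine`) are assumptions made up for this example.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical API call traces, invented for illustration only
# (they are not taken from the paper's dataset).
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle", "RegSetValueExW"],
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
    ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
]


def cooccurrence_vectors(sequences, window=2):
    """Map each API name to a sparse vector of neighbour counts.

    A neighbour is any call at most `window` positions away in the same trace.
    """
    vectors = defaultdict(Counter)
    for seq in sequences:
        for i, call in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[call][seq[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = cooccurrence_vectors(traces)

# WriteFile and ReadFile never appear in the same trace, but they share
# the same surrounding calls, so their context vectors are similar.
sim_file = cosine(vectors["WriteFile"], vectors["ReadFile"])

# WriteFile and CreateRemoteThread share no context in these traces.
sim_cross = cosine(vectors["WriteFile"], vectors["CreateRemoteThread"])

print(f"WriteFile vs ReadFile:           {sim_file:.3f}")
print(f"WriteFile vs CreateRemoteThread: {sim_cross:.3f}")
```

In this toy data, calls used in the same role end up with similar vectors even when they never co-occur, which is the kind of semantic dependency the abstract refers to. A learned embedding would replace the raw counts with dense vectors trained on a much larger corpus of traces.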


