Sanket Shukla, Gaurav Kolhe, Sai Manoj Pudukotai Dinakarrao, S. Rafatirad
Title: RNN-Based Classifier to Detect Stealthy Malware using Localized Features and Complex Symbolic Sequence
Venue: 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
DOI: 10.1109/ICMLA.2019.00076 (https://doi.org/10.1109/ICMLA.2019.00076)
Published: 2019-12-01
Citations: 19
Abstract
Malware detection and classification have attracted many researchers over the past decades. Several mechanisms based on machine learning (ML), computer vision, and deep learning have been applied to this task and have achieved considerable results. However, advanced (stealthy) malware generated using obfuscation techniques such as code relocation, code transposition, polymorphism, and mutation thwarts detection. In this paper, we propose a two-pronged technique that can efficiently detect both traditional and stealthy malware. First, we extract the microarchitectural traces collected while executing the application and feed them to traditional ML classifiers to identify malware spawned as a separate thread. In parallel, for efficient stealthy malware detection, we introduce an automated localized feature extraction technique whose output is used as input to recurrent neural networks (RNNs) for classification. We have tested the proposed mechanism rigorously on stealthy malware created using the code relocation obfuscation technique. The proposed two-pronged approach achieves an accuracy of 94%, a precision of 93%, a recall of 96%, and an F-1 score of 94%. Furthermore, compared with CNN-based sequence classification and hidden Markov model (HMM) based approaches for detecting stealthy malware, the proposed technique attains up to 11% higher detection accuracy and precision on average, and 24% higher recall and F-1 score on average.
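As a quick consistency check on the reported metrics, the F-1 score can be recomputed from the reported precision (93%) and recall (96%). This is a minimal sketch that assumes the standard definition of F-1 as the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f1_score` is illustrative rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F-1 definition, assumed here)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract, expressed as fractions.
reported_precision = 0.93
reported_recall = 0.96

f1 = f1_score(reported_precision, reported_recall)
print(f"Recomputed F-1: {f1:.2f}")
```

Under this assumption the recomputed value rounds to 0.94, which agrees with the 94% F-1 score stated in the abstract.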