Gilbert Breves Martins, Rosiane de Freitas, E. Souto
{"title":"Virtual structures and heterogeneous nodes in dependency graphs for detecting metamorphic malware","authors":"Gilbert Breves Martins, Rosiane de Freitas, E. Souto","doi":"10.1109/PCCC.2014.7017069","DOIUrl":null,"url":null,"abstract":"The traditional way to identify malicious programs is to compare the code body with a set of previously stored code patterns, also known as signatures, extracted from already identified malware code. To nullify this identification process, the malware developers can insert in their creations the ability to modify the malware code when the next contamination process takes place, using obfuscation techniques. One way to deal with this metamorphic malware behavior is the use of dependency graphs, generated by surveying dependency relationships among code elements, creating a model that is resilient to code mutations. Analog to the signature model, a matching procedure that compares these graphs with a reference graph database is used to identify a malware code. Since graph matching is a NP-hard problem, it is necessary to find ways to optimize this process, so this identification technique can be applied. Using dependency graphs extracted from binary code, we present an approach to reduce the size of the reference dependency graphs stored on the graph database, by introducing a node differentiation based on its features. This way, in conjunction with the insertion of virtual paths, it is possible to build a virtual clique used to identify and dispose of less relevant elements of the original graph. The use of dependency graph reduction also produces more stable results in the matching process. To validate these statements, we present a methodology for generating these graphs from binary programs and compare the results achieved with and without the proposed approach in the identification of the Evol and Polip metamorphic malware.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PCCC.2014.7017069","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
The traditional way to identify malicious programs is to compare the code body with a set of previously stored code patterns, also known as signatures, extracted from already identified malware code. To nullify this identification process, the malware developers can insert in their creations the ability to modify the malware code when the next contamination process takes place, using obfuscation techniques. One way to deal with this metamorphic malware behavior is the use of dependency graphs, generated by surveying dependency relationships among code elements, creating a model that is resilient to code mutations. Analog to the signature model, a matching procedure that compares these graphs with a reference graph database is used to identify a malware code. Since graph matching is a NP-hard problem, it is necessary to find ways to optimize this process, so this identification technique can be applied. Using dependency graphs extracted from binary code, we present an approach to reduce the size of the reference dependency graphs stored on the graph database, by introducing a node differentiation based on its features. This way, in conjunction with the insertion of virtual paths, it is possible to build a virtual clique used to identify and dispose of less relevant elements of the original graph. The use of dependency graph reduction also produces more stable results in the matching process. To validate these statements, we present a methodology for generating these graphs from binary programs and compare the results achieved with and without the proposed approach in the identification of the Evol and Polip metamorphic malware.