{"title":"Malware Variants Identification in Practice","authors":"Marcus Botacin, A. Grégio, P. De Geus","doi":"10.5753/sbseg.2019.13960","DOIUrl":null,"url":null,"abstract":"Malware are persistent threats to computer systems and analysis procedures allow developing countermeasures to them. However, as samples are spreading on growing rates, malware clustering techniques are required to keep analysis procedures scalable. Current clustering approaches use Call Graphs (CGs) to identify polymorphic samples, but they consider only individual functions calls, thus failing to cluster malware variants created by replacing sample's original functions by semantically-equivalent ones. To solve this problem, we propose a behavior-based classification procedure able to group functions on classes, thus reducing analysis procedures costs. We show that classifying samples according their behaviors (via function call semantics) instead by their pure API invocation is a more effective way to cluster malware variants. We also show that using a continence metric instead of a similarity metric helps to identify malware variants when a sample is embedded in another.","PeriodicalId":221963,"journal":{"name":"Anais do XIX Simpósio Brasileiro de Segurança da Informação e de Sistemas Computacionais (SBSeg 2019)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Anais do XIX Simpósio Brasileiro de Segurança da Informação e de Sistemas Computacionais (SBSeg 2019)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5753/sbseg.2019.13960","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Malware are persistent threats to computer systems and analysis procedures allow developing countermeasures to them. However, as samples are spreading on growing rates, malware clustering techniques are required to keep analysis procedures scalable. Current clustering approaches use Call Graphs (CGs) to identify polymorphic samples, but they consider only individual functions calls, thus failing to cluster malware variants created by replacing sample's original functions by semantically-equivalent ones. To solve this problem, we propose a behavior-based classification procedure able to group functions on classes, thus reducing analysis procedures costs. We show that classifying samples according their behaviors (via function call semantics) instead by their pure API invocation is a more effective way to cluster malware variants. We also show that using a continence metric instead of a similarity metric helps to identify malware variants when a sample is embedded in another.