Title: Graph Learning on Instruction Stream-Augmented CFG for Malware Variant Detection
Authors: Jiaxin Mi; Qi Li; Zewei Han; Weilue Liao; Junsong Fu
Journal: IEEE Transactions on Information Forensics and Security, vol. 20, pp. 3015-3030
Published: 2025-02-07
DOI: 10.1109/TIFS.2025.3539937
URL: https://ieeexplore.ieee.org/document/10877912/
Impact factor: 8.0; JCR: Q1 (Computer Science, Theory & Methods)
Citations: 0
Abstract
As malware as a service (MaaS) and organized attacks develop, they are shifting the mechanisms by which malware variants are generated. Current variant detection, designed to counter conventional obfuscation and anti-detection strategies, falls short against these new challenges, particularly in identifying variants that maintain core functionality while altering local behaviors, or variants that share similar code logic but diverge in actual functionality. To tackle these problems, we present ISCMVD, an Instruction Stream-augmented CFG-based Malware Variant Detection scheme, which melds control flow structures with machine semantic information from the instruction streams within blocks to build a comprehensive functional representation of variants’ basic and detailed behaviors. Leveraging a global-enhanced attentive graph neural network to integrate local and global functional features, we significantly improve the capture of similarity in the representative, stable primary behaviors of variants within the same family, which allows us to identify variants generated through attackers’ code rewriting, module modification, and other transformations. Additionally, through cross-family associative analysis, we eliminate the classification interference caused by logic similarities among variants generated by the same organization. Evaluation results on public and real-world datasets demonstrate the superiority and robustness of ISCMVD, with an average of 99.29% in AC and 99.25% in F1, and it performs well even in few-shot cases. More importantly, we achieve a breakthrough on two special sample sets that include variants related to MaaS and APT groups, and we outperform state-of-the-art methods under the current variant generation mechanism, demonstrating ISCMVD’s suitability for future trends.
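The general idea of attaching instruction-stream information to CFG nodes and combining local and global features can be illustrated with a toy sketch. This is not the ISCMVD implementation: the block names, opcode lists, opcode-histogram features, dot-product attention, and mean readout are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math
from collections import Counter

# Hypothetical toy program: basic blocks and their instruction streams.
BLOCKS = {
    "entry": ["push", "mov", "cmp", "jz"],
    "loop": ["mov", "xor", "add", "cmp", "jnz"],
    "exit": ["pop", "ret"],
}
# Hypothetical control-flow edges (block -> successor blocks).
EDGES = {"entry": ["loop", "exit"], "loop": ["loop", "exit"], "exit": []}

# Opcode vocabulary shared by all blocks.
VOCAB = sorted({op for ops in BLOCKS.values() for op in ops})


def block_features(ops):
    """Normalized opcode histogram, a simple stand-in for instruction-stream semantics."""
    counts = Counter(ops)
    return [counts[op] / len(ops) for op in VOCAB]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_step(features, edges):
    """One round of attention-weighted aggregation over each block and its successors."""
    updated = {}
    for node, h in features.items():
        # The block itself plus its distinct successors.
        neigh = [node] + [n for n in edges[node] if n != node]
        weights = softmax([dot(h, features[n]) for n in neigh])
        updated[node] = [
            sum(w * features[n][i] for w, n in zip(weights, neigh))
            for i in range(len(h))
        ]
    return updated


def graph_embedding(features):
    """Mean readout over all blocks, giving one global vector per graph."""
    nodes = list(features.values())
    return [sum(col) / len(nodes) for col in zip(*nodes)]


features = {name: block_features(ops) for name, ops in BLOCKS.items()}
local = attentive_step(features, EDGES)
embedding = graph_embedding(local)
print(len(embedding), round(sum(embedding), 6))
```

In a real pipeline the per-block features would come from a learned instruction encoder and the attention weights would be trained; the sketch only shows how block-level (local) and graph-level (global) representations can be derived from the same augmented CFG.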
About the journal:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.