A malware classification method based on directed API call relationships
Cuihua Ma, Zhenwan Li, Haixia Long, Anas Bilal, Xiaowen Liu
PLoS ONE, volume 20, issue 3, e0299706. Journal article, published 2025-03-17. DOI: 10.1371/journal.pone.0299706 (https://doi.org/10.1371/journal.pone.0299706)
Abstract
In response to the growing complexity of network threats, researchers are increasingly turning to machine learning and deep learning techniques to develop advanced models for malware detection. Many existing methods that use Application Programming Interface (API) call sequences for malware classification overlook the structural information inherent in these sequences. Some approaches do consider the structure of API calls, but they typically rely on the Graph Convolutional Network (GCN) framework, which tends to neglect the order in which API calls occur. To address these limitations, we propose a malware classification method that leverages the directed relationships within API sequences. Our approach models each API sequence as a directed graph, incorporating node attributes, structural information, and directional relationships. To capture these features, we introduce First-order and Second-order Graph Convolutional Networks (FSGCN) to approximate the operations of a directed graph convolutional network (DGCN). The directed graph embeddings produced by the FSGCN are then transformed into grayscale images and classified using a Convolutional Neural Network (CNN). Additionally, to mitigate the effects of imbalanced datasets, we employ the Synthetic Minority Over-sampling Technique (SMOTE), so that underrepresented classes receive adequate attention during training. We evaluate the method experimentally on two real-world malware datasets. The results indicate that it outperforms traditional and graph-based malware classification techniques.
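The abstract does not give the graph construction or the FSGCN formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It builds a directed graph from an API call sequence, derives a first-order matrix (symmetrized adjacency) and second-order matrices (shared predecessors and shared successors), and min-max scales a matrix to 0..255 grayscale values. The proximity definitions follow one common formulation for directed graph convolution and are assumed here, not confirmed as the authors' method. All function names are hypothetical, and normalization, learned weights, the CNN, and SMOTE are left out.

```python
def build_directed_graph(api_sequence):
    """Return (nodes, adj); adj[i][j] counts how often nodes[i] is
    immediately followed by nodes[j] in the call sequence."""
    nodes = sorted(set(api_sequence))
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adj = [[0] * n for _ in range(n)]
    for src, dst in zip(api_sequence, api_sequence[1:]):
        adj[index[src]][index[dst]] += 1
    return nodes, adj


def first_order(adj):
    """Assumed first-order proximity: A + A^T (direction discarded)."""
    n = len(adj)
    return [[adj[i][j] + adj[j][i] for j in range(n)] for i in range(n)]


def second_order_in(adj):
    """Assumed second-order in-proximity: A^T A (shared predecessors)."""
    n = len(adj)
    return [[sum(adj[k][i] * adj[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


def second_order_out(adj):
    """Assumed second-order out-proximity: A A^T (shared successors)."""
    n = len(adj)
    return [[sum(adj[i][k] * adj[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


def to_grayscale(matrix):
    """Min-max scale a numeric matrix to integer pixel values in 0..255."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant matrix: map everything to black
        return [[0 for _ in row] for row in matrix]
    return [[round(255 * (v - lo) / (hi - lo)) for v in row] for row in matrix]


# Illustrative call trace (made up for this example, not from the datasets).
sequence = ["CreateFileW", "WriteFile", "CloseHandle",
            "CreateFileW", "ReadFile", "CloseHandle"]
nodes, adj = build_directed_graph(sequence)
image = to_grayscale(second_order_in(adj))
```

In this toy trace, ReadFile and WriteFile never call each other, so the first-order matrix gives them no relation. Both are preceded by CreateFileW and followed by CloseHandle, which the two second-order matrices do record. That is the kind of directional structure a plain undirected GCN would not distinguish.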
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage