A malware classification method based on directed API call relationships.

IF 2.9 3区综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES PLoS ONE Pub Date : 2025-03-17 eCollection Date: 2025-01-01 DOI:10.1371/journal.pone.0299706

Cuihua Ma, Zhenwan Li, Haixia Long, Anas Bilal, Xiaowen Liu

{"title":"A malware classification method based on directed API call relationships.","authors":"Cuihua Ma, Zhenwan Li, Haixia Long, Anas Bilal, Xiaowen Liu","doi":"10.1371/journal.pone.0299706","DOIUrl":null,"url":null,"abstract":"<p><p>In response to the growing complexity of network threats, researchers are increasingly turning to machine learning and deep learning techniques to develop advanced models for malware detection. Many existing methods that utilize Application Programming Interface (API) sequence instructions for malware classification often overlook the structural information inherent in these sequences. While some approaches consider the structure of API calls, they typically rely on the Graph Convolutional Network (GCN) framework, which tends to neglect the sequential nature of API interactions. To address these limitations, we propose a novel malware classification method that leverages the directed relationships within API sequences. Our approach models each API sequence as a directed graph, incorporating node attributes, structural information, and directional relationships. To effectively capture these features, we introduce First-order and Second-order Graph Convolutional Networks (FSGCN) to approximate the operations of a directed graph convolutional network (DGCN). The resulting directed graph embeddings from the FSGCN are then transformed into grayscale images and classified using a Convolutional Neural Network (CNN). Additionally, to mitigate the effects of imbalanced datasets, we employ the Synthetic Minority Over-sampling Technique (SMOTE), ensuring that underrepresented classes receive adequate attention during training. Our method has been rigorously evaluated through extensive experiments on two real-world malware datasets. The results demonstrate the effectiveness and superiority of our approach compared to traditional and graph-based malware classification techniques.</p>","PeriodicalId":20189,"journal":{"name":"PLoS ONE","volume":"20 3","pages":"e0299706"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS ONE","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1371/journal.pone.0299706","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}

引用次数: 0

Abstract

In response to the growing complexity of network threats, researchers are increasingly turning to machine learning and deep learning techniques to develop advanced models for malware detection. Many existing methods that utilize Application Programming Interface (API) sequence instructions for malware classification often overlook the structural information inherent in these sequences. While some approaches consider the structure of API calls, they typically rely on the Graph Convolutional Network (GCN) framework, which tends to neglect the sequential nature of API interactions. To address these limitations, we propose a novel malware classification method that leverages the directed relationships within API sequences. Our approach models each API sequence as a directed graph, incorporating node attributes, structural information, and directional relationships. To effectively capture these features, we introduce First-order and Second-order Graph Convolutional Networks (FSGCN) to approximate the operations of a directed graph convolutional network (DGCN). The resulting directed graph embeddings from the FSGCN are then transformed into grayscale images and classified using a Convolutional Neural Network (CNN). Additionally, to mitigate the effects of imbalanced datasets, we employ the Synthetic Minority Over-sampling Technique (SMOTE), ensuring that underrepresented classes receive adequate attention during training. Our method has been rigorously evaluated through extensive experiments on two real-world malware datasets. The results demonstrate the effectiveness and superiority of our approach compared to traditional and graph-based malware classification techniques.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

求助全文

约1分钟内获得全文去求助

来源期刊

PLoS ONE 生物-生物学

CiteScore

6.20

自引率

5.40%

发文量

14242

审稿时长

3.7 months

期刊介绍： PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides: * Open-access—freely accessible online, authors retain copyright * Fast publication times * Peer review by expert, practicing researchers * Post-publication tools to indicate quality and impact * Community-based dialogue on articles * Worldwide media coverage