MalFSCIL: A Few-Shot Class-Incremental Learning Approach for Malware Detection

Impact Factor: 8.0 | Zone 1, Computer Science | JCR Q1 (Computer Science, Theory & Methods) | IEEE Transactions on Information Forensics and Security | Pub Date: 2024-12-12 | DOI: 10.1109/TIFS.2024.3516565

Yuhan Chai;Ximing Chen;Jing Qiu;Lei Du;Yanjun Xiao;Qiying Feng;Shouling Ji;Zhihong Tian

Journal Article. IEEE Transactions on Information Forensics and Security, vol. 20, pp. 2999-3014. https://ieeexplore.ieee.org/document/10795155/

Citations: 0

Abstract

The continuous evolution of malware poses a serious threat to personal privacy, enterprise data security, and global network infrastructure. For example, attackers can use phishing emails, botnets, and similar means to induce victims to execute malware for nefarious purposes such as stealing sensitive information. It is therefore important to develop effective and efficient methods to detect malware. To this end, most state-of-the-art methods are learning-based. Malware detection tasks are characterized by sample scarcity and dynamic evolution, and few-shot class-incremental learning has been proposed as an efficient way to address both. Nevertheless, such methods still face two major challenges: 1) Catastrophic forgetting: the erosion of existing knowledge by newly acquired knowledge during incremental learning. 2) Decision boundary confusion: after multiple consecutive incremental sessions, the discriminative ability of the classification model is weakened. To address these challenges, we propose MalFSCIL, a new malware detection framework based on few-shot class-incremental learning. It combines a decoupled training strategy with a variational autoencoder to mitigate catastrophic forgetting, and it introduces a dynamic boundary delineation method based on class prototypes to delineate incremental decision boundaries accurately. Extensive experimental results show that the proposed method outperforms state-of-the-art techniques in malware detection and classification, achieving high classification accuracy on both an open-source dataset and an internal enterprise dataset.
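To make the idea of class prototypes in an incremental setting concrete, here is a minimal sketch of a generic nearest-class-mean classifier. It is not the MalFSCIL method: it omits the variational autoencoder, the decoupled training strategy, and the dynamic boundary delineation the abstract describes. The class name `PrototypeClassifier`, the family labels, and the toy 2-D feature vectors are all hypothetical, and the features are assumed to come from some already-trained, frozen feature extractor.

```python
import math
from typing import Dict, List, Sequence


class PrototypeClassifier:
    """Generic nearest-class-mean classifier that grows one class at a time.

    Illustrative only; this is an assumption-laden sketch, not MalFSCIL.
    """

    def __init__(self) -> None:
        # Maps class label -> prototype (mean feature vector of that class).
        self.prototypes: Dict[str, List[float]] = {}

    def add_class(self, label: str, features: Sequence[Sequence[float]]) -> None:
        """Register a new class from a handful of embedded samples."""
        if not features:
            raise ValueError("need at least one sample per class")
        dim = len(features[0])
        self.prototypes[label] = [
            sum(sample[i] for sample in features) / len(features)
            for i in range(dim)
        ]

    def predict(self, feature: Sequence[float]) -> str:
        """Return the label whose prototype is nearest in Euclidean distance."""
        return min(
            self.prototypes,
            key=lambda label: math.dist(feature, self.prototypes[label]),
        )


clf = PrototypeClassifier()

# Base session: two hypothetical malware families with toy 2-D embeddings.
clf.add_class("family_a", [[0.0, 0.0], [0.2, 0.0], [0.1, 0.3]])
clf.add_class("family_b", [[5.0, 5.0], [5.2, 4.8], [4.8, 5.2]])
before = clf.predict([0.1, 0.1])

# Incremental session: a new family arrives with only two samples.
# Existing prototypes are left untouched when the new class is added.
clf.add_class("family_c", [[0.0, 5.0], [0.2, 5.2]])
after = clf.predict([0.1, 0.1])
new = clf.predict([0.1, 4.9])
```

Because adding a class only appends a new prototype, earlier classes are not overwritten in this toy setting. In practice, prototypes of similar families can crowd together as sessions accumulate, which is the kind of decision boundary confusion the abstract names as a challenge.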


Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40
Self-citation rate: 7.40%
Articles published: 234
Review time: 6.5 months

About the journal: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.