Identifying malicious traffic under concept drift based on intraclass consistency enhanced variational autoencoder

IF 7.3 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Science China Information Sciences | Pub Date: 2024-07-23 | DOI: 10.1007/s11432-023-4010-4

Xiang Luo, Chang Liu, Gaopeng Gou, Gang Xiong, Zhen Li, Binxing Fang

Citations: 0

Abstract

Accurate identification of malicious traffic is crucial for implementing effective defensive countermeasures and has led to extensive research efforts. However, adversaries' continuously evolving techniques introduce concept drift, which significantly degrades the performance of existing methods. To tackle this challenge, some researchers have focused on improving the separability of malicious traffic representations and on designing drift detectors to reduce the number of false positives. Nevertheless, these methods often overlook the importance of enhancing generalization and intraclass consistency in the representation. Additionally, the detectors are not sufficiently sensitive to the variations among different malicious traffic classes, which results in poor performance and limited robustness. In this paper, we propose an intraclass consistency enhanced variational autoencoder with a Class-Perception detector (ICE-CP) to identify malicious traffic under concept drift. Training comprises two key modules: intraclass consistency enhanced (ICE) representation learning and Class-Perception (CP) detector construction. In the first module, we employ a variational autoencoder (VAE) in conjunction with Kullback-Leibler (KL) divergence and cross-entropy losses to model the distribution of each input malicious traffic flow. This approach simultaneously enhances the generalization, intraclass consistency, and interclass differences of the learned representation. Consequently, we obtain a compact representation and a trained classifier for non-drifting malicious traffic. In the second module, we design the CP detector, which generates a centroid and a threshold for each malicious traffic class separately from the learned representation, depicting the boundary between drifting and non-drifting malicious traffic. During testing, we use the trained classifier to predict the malicious traffic class of each testing sample. Then, we use the CP detector to flag potential drifting samples against the centroid and threshold defined for the predicted class. We evaluate ICE-CP and several advanced methods on various real-world malicious traffic datasets. The results show that our method outperforms the others in identifying malicious traffic and detecting potential drifting samples, and that it remains robust across different concept drift settings.
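
The per-class centroid-and-threshold idea behind the CP detector can be illustrated with a minimal sketch. The abstract does not specify the distance measure or how thresholds are derived, so the choices here are assumptions made only for illustration: Euclidean distance in the latent space, and a threshold equal to the mean training distance plus `k` standard deviations. The names `fit_cp_detector`, `flag_drift`, and `k` are hypothetical and do not come from the paper.

```python
import math
import statistics


def fit_cp_detector(latents, labels, k=2.0):
    """Build {class: (centroid, threshold)} from training latent vectors.

    Assumption (not from the paper): the threshold is the mean distance of
    a class's training samples to its centroid plus k standard deviations.
    """
    by_class = {}
    for z, y in zip(latents, labels):
        by_class.setdefault(y, []).append(z)

    detector = {}
    for y, points in by_class.items():
        # Centroid: per-dimension mean of the class's latent vectors.
        centroid = tuple(statistics.fmean(dim) for dim in zip(*points))
        dists = [math.dist(p, centroid) for p in points]
        threshold = statistics.fmean(dists) + k * statistics.pstdev(dists)
        detector[y] = (centroid, threshold)
    return detector


def flag_drift(detector, latents, predicted):
    """Return True for samples outside the boundary of their predicted class."""
    flags = []
    for z, y in zip(latents, predicted):
        centroid, threshold = detector[y]
        flags.append(math.dist(z, centroid) > threshold)
    return flags


# Toy 2-D "latent" vectors: class 0 is tight, class 1 is spread out.
train_z = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.2), (0.0, -0.2),
           (4.0, 5.0), (6.0, 5.0), (5.0, 3.0), (5.0, 7.0)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
detector = fit_cp_detector(train_z, train_y)

# In the described pipeline the predicted classes would come from the
# trained classifier; here they are supplied by hand.
test_z = [(0.05, 0.0), (2.0, 2.0), (5.5, 4.5)]
test_pred = [0, 0, 1]
flags = flag_drift(detector, test_z, test_pred)
print(flags)
```

In this toy setup the third sample lies about 0.71 from the class 1 centroid and is accepted, whereas the same offset from the class 0 centroid would exceed that class's much tighter threshold. That difference is the point of deriving a separate boundary per class rather than one global boundary.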

Source journal

Science China Information Sciences (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 12.60
Self-citation rate: 5.70%
Articles published: 224
Review time: 8.3 months

Journal description: Science China Information Sciences publishes high-quality, original research across the information sciences. Its scope covers Computer Science & Technologies, Control Science & Engineering, Information & Communication Engineering, Microelectronics & Solid-State Electronics, and Quantum Information.