Title: Identifying malicious traffic under concept drift based on intraclass consistency enhanced variational autoencoder
Authors: Xiang Luo, Chang Liu, Gaopeng Gou, Gang Xiong, Zhen Li, Binxing Fang
Journal: Science China Information Sciences (impact factor 7.3; JCR Q1, Computer Science, Information Systems)
Publication date: 2024-07-23
Publication type: Journal Article
DOI: 10.1007/s11432-023-4010-4 (https://doi.org/10.1007/s11432-023-4010-4)
Citation count: 0
Abstract
Accurate identification of malicious traffic is crucial for implementing effective defense countermeasures and has led to extensive research efforts. However, the continuously evolving techniques employed by adversaries introduce concept drift, which significantly degrades the performance of existing methods. To tackle this challenge, some researchers have focused on improving the separability of malicious traffic representations and on designing drift detectors to reduce the number of false positives. Nevertheless, these methods often overlook the importance of enhancing the generalization and intraclass consistency of the representation. Additionally, the detectors are not sufficiently sensitive to the variations among different malicious traffic classes, which results in poor performance and limited robustness. In this paper, we propose an intraclass consistency enhanced variational autoencoder with a Class-Perception detector (ICE-CP) to identify malicious traffic under concept drift. Training comprises two key modules: intraclass consistency enhanced (ICE) representation learning and Class-Perception (CP) detector construction. In the first module, we employ a variational autoencoder (VAE) together with a Kullback-Leibler (KL) divergence loss and a cross-entropy loss to model the distribution of each input malicious traffic flow. This approach simultaneously enhances the generalization, intraclass consistency, and interclass differences of the learned representation. Consequently, we obtain a compact representation and a trained classifier for non-drifting malicious traffic. In the second module, we design the CP detector, which generates a centroid and a threshold for each malicious traffic class separately from the learned representation, delineating the boundary between drifting and non-drifting malicious traffic. During testing, we use the trained classifier to predict the malicious traffic class of each test sample. Then, we use the CP detector to flag potential drifting samples based on the centroid and threshold defined for the predicted class. We evaluate ICE-CP and several advanced methods on various real-world malicious traffic datasets. The results show that our method outperforms the others in identifying malicious traffic and detecting potential drifting samples, and that it remains robust across different concept drift settings.
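The per-class centroid and threshold check described in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric or how the threshold is chosen, so the Euclidean distance, the nearest-rank quantile rule, the function names `fit_cp_detector` and `is_drifting`, and the toy class labels below are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import defaultdict


def fit_cp_detector(embeddings, labels, quantile=0.95):
    """Build one (centroid, threshold) pair per class from training embeddings.

    Assumption: the threshold is the given quantile of the training-set
    Euclidean distances to the class centroid. The paper may define it
    differently.
    """
    by_class = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        by_class[label].append(vec)

    detector = {}
    for label, vecs in by_class.items():
        dim = len(vecs[0])
        # Centroid = coordinate-wise mean of the class's embeddings.
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        dists = sorted(math.dist(v, centroid) for v in vecs)
        # Nearest-rank quantile over the sorted training distances.
        rank = max(0, math.ceil(quantile * len(dists)) - 1)
        detector[label] = (centroid, dists[rank])
    return detector


def is_drifting(detector, embedding, predicted_label):
    """Flag a sample that lies farther from the centroid of its predicted
    class than that class's threshold allows."""
    centroid, threshold = detector[predicted_label]
    return math.dist(embedding, centroid) > threshold


# Toy 2-D embeddings with hypothetical class names.
train_z = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
           [5.0, 5.0], [5.2, 5.0], [5.0, 5.2]]
train_y = ["botnet"] * 3 + ["ransomware"] * 3
detector = fit_cp_detector(train_z, train_y, quantile=1.0)

near = is_drifting(detector, [0.1, 0.1], "botnet")  # inside the class boundary
far = is_drifting(detector, [3.0, 3.0], "botnet")   # well outside it
```

In this sketch the classifier's prediction selects which centroid and threshold to test against, so each class gets its own boundary rather than sharing a global one, which mirrors the class-specific sensitivity the abstract emphasizes.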
About the journal:
Science China Information Sciences is a dedicated journal that showcases high-quality, original research across various domains of information sciences. It encompasses Computer Science & Technologies, Control Science & Engineering, Information & Communication Engineering, Microelectronics & Solid-State Electronics, and Quantum Information, providing a platform for the dissemination of significant contributions in these fields.