A balanced supervised contrastive learning-based method for encrypted network traffic classification

Computers & Security | IF 4.8 | Zone 2 (Computer Science) | Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-07-25 | DOI: 10.1016/j.cose.2024.104023

Article page: https://www.sciencedirect.com/science/article/pii/S0167404824003286

Citations: 0

Abstract

Encrypted network traffic classification plays an important role in enhancing network security and improving network performance. However, traffic data is inherently imbalanced, which makes encrypted traffic classification challenging and can degrade classification performance. Existing studies attempt to rebalance the data distribution through resampling strategies, which suffer from information loss, overfitting, and increased model complexity. Motivated by this, we propose an improved supervised contrastive learning approach that raises the performance of supervised contrastive learning classifiers under class imbalance in encrypted network traffic classification. Our method consists of two parts: data processing and traffic classification. In the data processing stage, we transform the raw network traffic data into grayscale images. In the traffic classification stage, we design optimized class-complement and class-averaging schemes for supervised contrastive learning. Constructing contrastive tasks is a critical step in contrastive learning. However, when the sets of positive and negative samples are constructed for network traffic, the samples generated by traditional methods do not preserve the salient features of the traffic. These traditional methods typically involve color modification, cropping, rotation, noise injection, and random erasure. Applied to images generated from network traffic data, they may alter significant features of the traffic, for example by changing the distribution of packet sizes. This makes it harder to preserve the characteristics of each traffic class and does not aid learning. Therefore, we preprocess the traffic into images in a particular format suitable for contrastive learning and design a novel contrastive task construction method. Evaluation results on public datasets show that the proposed method can significantly improve encrypted traffic classification performance on imbalanced datasets.
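The data processing step and the concern about image-style augmentations can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state the image size, byte layout, or padding rule, so the 28 x 28 shape, the zero padding, and the function names `traffic_to_grayscale` and `rotate_180` are assumptions made only for illustration. The sketch maps each byte of a flow to one pixel and then shows that a generic rotation reverses the byte order the image encodes.

```python
from typing import List

IMAGE_SIDE = 28  # assumed side length; the abstract does not specify the image size


def traffic_to_grayscale(payload: bytes, side: int = IMAGE_SIDE) -> List[List[int]]:
    """Map raw traffic bytes to a side x side grid of grayscale values (0-255)."""
    n_pixels = side * side
    # Truncate long inputs and zero-pad short ones so every image has the same shape.
    fixed = payload[:n_pixels].ljust(n_pixels, b"\x00")
    # One byte becomes one pixel; each row is a consecutive slice of the byte stream.
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]


def rotate_180(image: List[List[int]]) -> List[List[int]]:
    """A generic image augmentation: rotate the grid by 180 degrees."""
    return [row[::-1] for row in reversed(image)]


if __name__ == "__main__":
    # Hypothetical 13-byte payload: three header-like bytes followed by filler.
    sample = b"\x16\x03\x01" + b"\xff" * 10
    image = traffic_to_grayscale(sample, side=4)
    rotated = rotate_180(image)
    print(image[0])    # first row holds the first four bytes of the payload
    print(rotated[0])  # after rotation the first row holds the padded tail, reversed
```

In this toy layout, rotation moves the leading bytes of the flow to the end of the image, which is one concrete way an augmentation designed for natural images can disturb traffic structure.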


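Source journal

Computers & Security (Engineering & Technology, Computer Science: Information Systems)

CiteScore: 12.40

Self-citation rate: 7.10%

Articles published: 365

Review time: 10.7 months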

Journal description: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.