Packet header-based reweight-long short term memory (Rew-LSTM) method for encrypted network traffic classification

Journal: Computing · IF 3.3 · CAS Zone 3, Computer Science · JCR Q2 (Computer Science, Theory & Methods) · Pub Date: 2024-07-02 · DOI: 10.1007/s00607-024-01306-w

Jiangang Hou, Xin Li, Hongji Xu, Chun Wang, Lizhen Cui, Zhi Liu, Changzhen Hu



Abstract

With the development of Internet technology, cyberspace security has become a research hotspot, and network traffic classification is closely related to it. This paper investigates classification based on raw traffic data. The approach involves analyzing packets at different granularities, separating packet headers from payloads, complementing and aligning the packet headers, and converting them into structured data in three representation types: bit, byte, and segmented protocol fields. On this basis, we propose the Rew-LSTM classification model and evaluate it on publicly available encrypted-traffic datasets. The results show that multi-class classification using only packet-header data performs very well, especially with the bit representation, which outperforms state-of-the-art methods. In addition, we propose a global normalization method, and experimental results show that it outperforms feature-specific normalization methods for both Tor traffic and regular encrypted traffic.
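The preprocessing ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the header length, the alignment rule, or how global normalization is defined. The sketch assumes zero-padding to a hypothetical fixed length (`HEADER_LEN`), and it assumes that "global" normalization means min-max scaling with one minimum and maximum taken over all features, in contrast to per-feature (column-wise) scaling. All function names are hypothetical, and the segmented-protocol-field representation and the Rew-LSTM model itself are not covered.

```python
import numpy as np

HEADER_LEN = 8  # hypothetical fixed header length in bytes (assumption, not from the paper)


def pad_header(header: bytes, length: int = HEADER_LEN) -> bytes:
    """Truncate or zero-pad a raw header so every sample has the same length."""
    return header[:length].ljust(length, b"\x00")


def to_bytes(header: bytes) -> np.ndarray:
    """Byte representation: one value in [0, 255] per header byte."""
    return np.frombuffer(pad_header(header), dtype=np.uint8).astype(np.float64)


def to_bits(header: bytes) -> np.ndarray:
    """Bit representation: one 0/1 value per header bit, most significant bit first."""
    raw = np.frombuffer(pad_header(header), dtype=np.uint8)
    return np.unpackbits(raw).astype(np.float64)


def global_minmax(x: np.ndarray) -> np.ndarray:
    """Assumed 'global' normalization: a single min and max over the whole matrix."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)


def featurewise_minmax(x: np.ndarray) -> np.ndarray:
    """Feature-specific normalization: each column is scaled by its own min and max."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on constant columns
    return (x - lo) / span


# Two made-up header fragments of different lengths (illustrative data only).
headers = [bytes([0x45, 0x00, 0x00, 0x3C]), bytes([0x45, 0x00, 0x05, 0xDC, 0x40, 0x00])]
byte_matrix = np.stack([to_bytes(h) for h in headers])  # shape (2, 8)
bit_matrix = np.stack([to_bits(h) for h in headers])    # shape (2, 64)
```

Under these assumptions, global scaling preserves the relative magnitude of values across different header positions, whereas column-wise scaling maps a constant column (such as the first byte here) to zero.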




Source journal: Computing (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore: 8.20
Self-citation rate: 2.70%
Articles published: 107
Review time: 3 months

Journal description: Computing publishes original papers, short communications, and surveys on all fields of computing. Contributions should be written in English and may be theoretical or applied in nature; the essential criteria are computational relevance and a systematic foundation of results.