Unveiling traffic paths: Explainable path signature feature-based encrypted traffic classification

IF 4.8 2区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Computers & Security Pub Date : 2024-12-19 DOI:10.1016/j.cose.2024.104283

Shi-Jie Xu , Kai-Chuan Kong , Xiao-Bo Jin , Guang-Gang Geng

{"title":"Unveiling traffic paths: Explainable path signature feature-based encrypted traffic classification","authors":"Shi-Jie Xu , Kai-Chuan Kong , Xiao-Bo Jin , Guang-Gang Geng","doi":"10.1016/j.cose.2024.104283","DOIUrl":null,"url":null,"abstract":"<div><div>Encryption technology ensures secure transmission for internet communications but poses significant challenges for effective encrypted traffic classification, which categorizes traffic into distinct groups, facilitating the process of monitoring network activities to uncover patterns and extract valuable information applicable in areas such as network management and anomaly detection. To this end, machine learning has emerged as a powerful technology for conducting encrypted traffic classification without compromising user data privacy. Machine learning-based classification demonstrates remarkable capabilities in processing vast amounts of data through sophisticated handcrafted features, with traffic path signature features representing the cutting edge of this field. This method shows stable performance improvements for common encrypted traffic types using only packet length information. However, it also yields a high dimensionality of path signature features, complicating the training of lightweight models and hindering further innovation due to a lack of model explainability. In this paper, we first propose leveraging feature selection to conduct feature dimensionality reduction, and then try to focus on the explanation of the model from both global and local perspectives. Performance comparisons indicate that our proposed method significantly reduces the number of path signature features while preserving classification performance, which enhances computational efficiency and meets the demand for lightweight models in various application scenarios. Furthermore, this significant reduction in the feature dimensionality allows for the interpretability of the model, which gives the user a clear understanding of the modeling decision-making process.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"150 ","pages":"Article 104283"},"PeriodicalIF":4.8000,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404824005893","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Encryption technology ensures secure transmission for internet communications but poses significant challenges for effective encrypted traffic classification, which categorizes traffic into distinct groups, facilitating the process of monitoring network activities to uncover patterns and extract valuable information applicable in areas such as network management and anomaly detection. To this end, machine learning has emerged as a powerful technology for conducting encrypted traffic classification without compromising user data privacy. Machine learning-based classification demonstrates remarkable capabilities in processing vast amounts of data through sophisticated handcrafted features, with traffic path signature features representing the cutting edge of this field. This method shows stable performance improvements for common encrypted traffic types using only packet length information. However, it also yields a high dimensionality of path signature features, complicating the training of lightweight models and hindering further innovation due to a lack of model explainability. In this paper, we first propose leveraging feature selection to conduct feature dimensionality reduction, and then try to focus on the explanation of the model from both global and local perspectives. Performance comparisons indicate that our proposed method significantly reduces the number of path signature features while preserving classification performance, which enhances computational efficiency and meets the demand for lightweight models in various application scenarios. Furthermore, this significant reduction in the feature dimensionality allows for the interpretability of the model, which gives the user a clear understanding of the modeling decision-making process.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

求助全文

约1分钟内获得全文去求助

来源期刊

Computers & Security 工程技术-计算机：信息系统

CiteScore

12.40

自引率

7.10%

发文量

365

审稿时长

10.7 months

期刊介绍： Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.