Shi-Jie Xu , Kai-Chuan Kong , Xiao-Bo Jin , Guang-Gang Geng
{"title":"Unveiling traffic paths: Explainable path signature feature-based encrypted traffic classification","authors":"Shi-Jie Xu , Kai-Chuan Kong , Xiao-Bo Jin , Guang-Gang Geng","doi":"10.1016/j.cose.2024.104283","DOIUrl":null,"url":null,"abstract":"<div><div>Encryption technology ensures secure transmission for internet communications but poses significant challenges for effective encrypted traffic classification, which categorizes traffic into distinct groups, facilitating the process of monitoring network activities to uncover patterns and extract valuable information applicable in areas such as network management and anomaly detection. To this end, machine learning has emerged as a powerful technology for conducting encrypted traffic classification without compromising user data privacy. Machine learning-based classification demonstrates remarkable capabilities in processing vast amounts of data through sophisticated handcrafted features, with traffic path signature features representing the cutting edge of this field. This method shows stable performance improvements for common encrypted traffic types using only packet length information. However, it also yields a high dimensionality of path signature features, complicating the training of lightweight models and hindering further innovation due to a lack of model explainability. In this paper, we first propose leveraging feature selection to conduct feature dimensionality reduction, and then try to focus on the explanation of the model from both global and local perspectives. Performance comparisons indicate that our proposed method significantly reduces the number of path signature features while preserving classification performance, which enhances computational efficiency and meets the demand for lightweight models in various application scenarios. Furthermore, this significant reduction in the feature dimensionality allows for the interpretability of the model, which gives the user a clear understanding of the modeling decision-making process.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"150 ","pages":"Article 104283"},"PeriodicalIF":4.8000,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404824005893","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Encryption technology ensures secure transmission for internet communications but poses significant challenges for effective encrypted traffic classification, which categorizes traffic into distinct groups, facilitating the process of monitoring network activities to uncover patterns and extract valuable information applicable in areas such as network management and anomaly detection. To this end, machine learning has emerged as a powerful technology for conducting encrypted traffic classification without compromising user data privacy. Machine learning-based classification demonstrates remarkable capabilities in processing vast amounts of data through sophisticated handcrafted features, with traffic path signature features representing the cutting edge of this field. This method shows stable performance improvements for common encrypted traffic types using only packet length information. However, it also yields a high dimensionality of path signature features, complicating the training of lightweight models and hindering further innovation due to a lack of model explainability. In this paper, we first propose leveraging feature selection to conduct feature dimensionality reduction, and then try to focus on the explanation of the model from both global and local perspectives. Performance comparisons indicate that our proposed method significantly reduces the number of path signature features while preserving classification performance, which enhances computational efficiency and meets the demand for lightweight models in various application scenarios. Furthermore, this significant reduction in the feature dimensionality allows for the interpretability of the model, which gives the user a clear understanding of the modeling decision-making process.
期刊介绍:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.