TLS Encrypted Application Classification Using Machine Learning with Flow Feature Engineering
Onur Barut, Rebecca S. Zhu, Yan Luo, Tong Zhang
Proceedings of the 2020 10th International Conference on Communication and Network Security, published 2020-11-27
DOI: 10.1145/3442520.3442529 (https://doi.org/10.1145/3442520.3442529)
Citations: 7
Abstract
Network traffic classification has become increasingly important as the number of devices connected to the Internet grows rapidly. The amount of encrypted traffic is increasing along with it, which makes payload-based classification methods obsolete. Consequently, machine learning approaches have become crucial where user privacy is concerned. For this purpose, we propose an accurate, fast, and privacy-preserving encrypted traffic classification approach that combines engineered flow feature extraction with appropriate feature selection. The proposed scheme achieves a macro-averaged F1 score of 0.92899 and a macro-averaged mAP score of 0.88313 when classifying encrypted traffic into the Audio, Email, Chat, and Video classes derived from the non-vpn2016 dataset. Further experiments are conducted on a mixed dataset of non-encrypted and encrypted flows using a data augmentation method, the Synthetic Minority Over-Sampling Technique (SMOTE), and the results are discussed for both TLS-encrypted and mixed flows.
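The abstract names two techniques that a short example can make concrete: the macro-averaged F1 score and the Synthetic Minority Over-Sampling Technique (SMOTE). The sketch below is a minimal pure-Python illustration of both and is not the authors' implementation; the abstract does not say which libraries or parameters were used. The function names `macro_f1` and `smote`, the neighbour count `k`, and the `seed` argument are all assumptions chosen for illustration.

```python
import random


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples for a minority class.

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples; k and seed are illustrative.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [x for x in minority if x is not base]
        # Rank the remaining minority samples by squared Euclidean distance.
        others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(base, x)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Because the macro average weights every class equally, a rare class such as Email counts as much as a frequent one, which is one reason oversampling the minority classes can change the reported score. In practice, maintained implementations such as `sklearn.metrics.f1_score(..., average="macro")` and `imblearn.over_sampling.SMOTE` would normally be preferred over a hand-written version.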