Hasan Karagöl, Oğuzhan Erdem, Barkin Akbas, Tuncay Soylu
Darknet Traffic Classification with Machine Learning Algorithms and SMOTE Method
2022 7th International Conference on Computer Science and Engineering (UBMK), published 2022-09-14
DOI: 10.1109/UBMK55850.2022.9919462 (https://doi.org/10.1109/UBMK55850.2022.9919462)
Citations: 2
Abstract
The Darknet is a network that can be accessed with certain privileges and runs a non-standard communication protocol. Darknet traffic, which consists of data from several known networks such as Tor and P2P, is often used for criminal activities because of its anonymity. It is therefore critical to classify Darknet traffic correctly and to differentiate the individual flows for security purposes. In this paper, we propose three machine learning (ML) based traffic classification approaches: binary classification of the Darknet and Benign traffic classes (Case 1); four-class classification of the Tor, NonTor, VPN, and NonVPN classes (Case 2); and classification of eight traffic sub-classes (Case 3). We further apply the SMOTE method to balance the class sizes in the traffic dataset, and feature selection (FS) algorithms to identify the most effective attributes, reducing the number of features from 63 in the original dataset to 8, 8, and 6 for Cases 1, 2, and 3, respectively. For all three cases, classification was performed with six different machine learning algorithms, with and without SMOTE, and the highest accuracy values were obtained with SMOTE. The best-performing algorithm was Random Forest, with accuracies of 97.22%, 97.16%, and 85.99% for Cases 1, 2, and 3, respectively.
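The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the general idea behind the class-balancing step: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest neighbours from the same class. The function names `smote` and `balance`, the toy two-feature data, and the choice to oversample every class up to the size of the largest one are assumptions made for illustration; they are not taken from the paper, and a practical pipeline would more likely rely on an established library implementation.

```python
import math
import random


def smote(samples, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between same-class samples.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if n_new <= 0:
        return []
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two samples in the class")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Indices of the k nearest same-class neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: math.dist(base, samples[j]),
        )[:k]
        other = samples[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class (assumed policy)."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_out.extend(extra)
        y_out.extend([label] * len(extra))
    return X_out, y_out


# Toy, made-up flows with two features each: six "Benign" and two "Darknet".
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.1], [0.2, 0.3],
     [5.0, 5.0], [5.5, 5.2]]
y = ["Benign"] * 6 + ["Darknet"] * 2

X_bal, y_bal = balance(X, y)
print({label: y_bal.count(label) for label in sorted(set(y_bal))})
```

In an evaluation like the one summarized above, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy; whether the authors did so is not stated in the abstract.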