A comparative study of deep learning and decision tree based ensemble learning algorithms for network traffic identification

Telfor Journal (JCR Q3, Engineering). Pub Date: 2022-01-01. DOI: 10.5937/telfor2202061n

Nedeljko Nikolić, S. Tomovic, I. Radusinović



Abstract

In this paper, we apply Deep Learning (DL) and decision-tree-based ensemble learning algorithms to classify network traffic by application. Several DL models for network traffic identification are presented, implemented, and compared, including a 1D convolutional network, a stacked autoencoder, a multi-layer perceptron, and a combination of these. The results of the DL models are then compared with those obtained with two popular ensemble learning models based on decision trees: Random Forest and XGBoost. To train and test the classification models, a dataset containing both encrypted and unencrypted traffic was collected in a real network under normal operating conditions and pre-processed in a way that ensures unbiased results. The classification uncertainties of the models were also quantified on the publicly available ISCX VPN-nonVPN dataset. The models are compared in terms of precision, recall, F1 score, and accuracy, for different levels of complexity and training dataset sizes. The evaluation results indicate that the decision-tree ensemble learning algorithms provide more accurate results and outperform the DL algorithms. The performance gap narrows with dataset complexity.
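As an illustration of the four evaluation metrics named in the abstract, the following minimal sketch computes accuracy and per-class precision, recall, and F1 score for a multi-class traffic classifier. It is not the authors' evaluation code: the function name `per_class_metrics`, the application labels, and the toy predictions are all hypothetical and chosen only to show how the metrics are defined.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (accuracy, per-class dict of precision, recall, and F1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = Counter()  # flows correctly assigned to a class
    fp = Counter()  # flows wrongly assigned to a class
    fn = Counter()  # flows of a class that were assigned elsewhere
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # flows predicted as this class
        actual = tp[label] + fn[label]     # flows that truly belong to it
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, per_class


# Hypothetical application labels and predictions (illustrative only).
y_true = ["web", "web", "video", "video", "voip", "voip"]
y_pred = ["web", "video", "video", "video", "voip", "web"]

accuracy, per_class = per_class_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.3f}")
for label, scores in per_class.items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Reporting the metrics per class, rather than accuracy alone, matters when application classes are imbalanced, since a model can reach high accuracy while missing most flows of a rare class.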


Source journal

Telfor Journal (Engineering, Media Technology)

CiteScore: 1.50

Self-citation rate: 0.00%

Review time: 23 weeks

About the journal: The TELFOR Journal is an open access international scientific journal that publishes improved and extended versions of selected best papers first reported at the annual TELFOR Conference (www.telfor.rs), papers invited by the Editorial Board, and papers submitted by the authors themselves. All papers are subject to review. The TELFOR Journal is published in English, in both electronic and printed versions. As an IEEE co-supported publication, it follows all IEEE rules and procedures. The TELFOR Journal covers all the essential branches of modern telecommunications and information technology: Telecommunications Policy and Services, Telecommunications Networks, Radio Communications, Communications Systems, Signal Processing, Optical Communications, Applied Electromagnetics, Applied Electronics, Multimedia, Software Tools and Applications, as well as other fields related to ICT. This broad spectrum of topics reflects the rapid convergence, through telecommunications, of the underlying technologies towards the information and knowledge society. The Journal provides a medium for exchanging research results and technological achievements of the scientific community in academia and industry.