Title: An effective scheme for classifying imbalanced traffic in SD-IoT, leveraging XGBoost and active learning
Authors: Chandroth Jisi, Byeong-hee Roh, Jehad Ali
Journal: Computer Networks, vol. 257, Article 110939 (JCR Q1, Computer Science, Hardware & Architecture; impact factor 4.6)
DOI: 10.1016/j.comnet.2024.110939
Published: 2025-02-01 (Epub 2024-11-26)
URL: https://www.sciencedirect.com/science/article/pii/S1389128624007710
Citations: 0
Abstract
The volume and diversity of Internet traffic are growing steadily owing to the simplicity of Internet of Things (IoT) technology, making machine-learning-based solutions increasingly essential for efficient network management. IoT applications require stringent but varied Quality of Service (QoS). Network traffic classification is a key step in allocating network resources and providing security according to these QoS requirements, and it remains a complex part of modern communication. Software-Defined Networking (SDN) can be combined with machine learning (ML) to automate traffic classification in IoT networks. However, the inherent characteristics of Software-Defined IoT (SD-IoT) networks produce an uneven class distribution, which can degrade classification performance, particularly for minority classes. To address class imbalance in SD-IoT environments, this study introduces a Cost-Sensitive XGBoost with Active Learning (AL-CSXGB) algorithm. The approach characterizes class distribution from a new point of view: it dynamically assigns weights to different applications and iteratively queries labels for new data points to improve accuracy. Experiments on the MOORE_SET and ISCX VPN-nonVPN datasets are used to evaluate the algorithm. The results show that AL-CSXGB outperforms other state-of-the-art methods in classification accuracy and computation time and alleviates the imbalance problem in SD-IoT networks. The proposed scheme achieves an accuracy of 98.4% on the MOORE_SET dataset and 98.89% on the ISCX VPN-nonVPN dataset, demonstrating its effectiveness and reliability in diverse scenarios.
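The abstract names two ingredients, per-class weighting and iterative label querying, without giving formulas. The sketch below illustrates one common way each could be realized; it is not the AL-CSXGB procedure itself. The inverse-frequency weight formula, the margin-based query rule, the function names, and the toy class labels are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence


def inverse_frequency_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Assumed weighting rule: n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors cost more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def margin_query(probabilities: Sequence[Sequence[float]],
                 batch_size: int) -> List[int]:
    """Assumed query rule: pick the samples whose top-two class
    probabilities are closest, i.e. where the model is least certain."""
    def margin(p: Sequence[float]) -> float:
        top = sorted(p, reverse=True)
        return top[0] - top[1]

    ranked = sorted(range(len(probabilities)),
                    key=lambda i: margin(probabilities[i]))
    return ranked[:batch_size]


# Toy labeled set: 8 flows of a majority class, 2 of a minority class
# (class names are hypothetical placeholders).
labels = ["majority"] * 8 + ["minority"] * 2
weights = inverse_frequency_weights(labels)

# Toy predicted probabilities for three unlabeled flows.
pool_probs = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]
query = margin_query(pool_probs, batch_size=2)

print(weights)  # per-class weights
print(query)    # indices of flows to send for labeling
```

In a full loop, weights of this kind would typically be applied as per-sample weights when the classifier is retrained, and the queried flows would be labeled and moved from the unlabeled pool into the training set before the next iteration.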
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.