An effective scheme for classifying imbalanced traffic in SD-IoT, leveraging XGBoost and active learning

IF 4.6; Region 2 (Computer Science); JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Computer Networks; Pub Date: 2025-02-01; Epub Date: 2024-11-26; DOI: 10.1016/j.comnet.2024.110939
Chandroth Jisi, Byeong-hee Roh, Jehad Ali
Computer Networks, volume 257, Article 110939. Journal Article. https://www.sciencedirect.com/science/article/pii/S1389128624007710
Citations: 0

Abstract

The volume and diversity of Internet traffic are growing constantly, driven by the simplicity of Internet of Things (IoT) technology, which makes machine learning-powered solutions increasingly essential for efficient network oversight. IoT applications demand stringent but varied Quality of Service (QoS). Network traffic classification is the primary means of allocating network resources and providing security according to these QoS requirements, and it is a complex part of modern communication. Software-Defined Networking (SDN) is combined with machine learning (ML) to automate traffic classification in IoT networks. However, the inherent features of Software-Defined IoT (SD-IoT) networks produce an uneven class distribution, which can hinder classification performance, particularly for minority classes. To address class imbalance in SD-IoT environments, this study introduces a Cost-Sensitive XGBoost with Active Learning (AL-CSXGB) algorithm, which characterizes class distribution from a new point of view. The proposed method dynamically assigns weights to different applications and iteratively queries labels for new data points to improve accuracy. Experiments on the MOORE_SET and ISCX VPN-nonVPN datasets are used to evaluate the algorithm. The results show that AL-CSXGB outperforms other state-of-the-art methods in classification accuracy and computation time and alleviates the imbalance problem in SD-IoT networks. The proposed scheme achieves an accuracy of 98.4% on the MOORE_SET dataset and 98.89% on the ISCX VPN-nonVPN dataset, demonstrating its effectiveness and reliability in diverse scenarios.
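The two ingredients named in the abstract, per-class weighting and iterative label querying, can be illustrated with a minimal sketch. This is not the authors' AL-CSXGB algorithm: the abstract does not give the weighting rule or the query strategy, so the sketch assumes inverse-frequency class weights (the same formula as scikit-learn's "balanced" heuristic) and least-confidence uncertainty sampling. The class names `web` and `p2p`, the function names, and the probability values are all hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors cost more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def least_confident_indices(probabilities, batch_size):
    """Assumed query strategy: pick the unlabeled samples whose highest
    predicted class probability is lowest (least-confidence sampling)."""
    confidence = [max(p) for p in probabilities]
    order = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return order[:batch_size]


# Hypothetical imbalanced labeled set: 8 majority flows, 2 minority flows.
labels = ["web"] * 8 + ["p2p"] * 2
weights = inverse_frequency_weights(labels)
sample_weight = [weights[y] for y in labels]  # one weight per training sample

# Hypothetical class probabilities predicted for four unlabeled flows.
pool_probs = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.51, 0.49]]
query = least_confident_indices(pool_probs, batch_size=2)

print(weights)  # per-class weights
print(query)    # indices of the flows to send for labeling
```

In a full loop, per-sample weights of this kind could be passed as `sample_weight` to the `fit` method of XGBoost's `XGBClassifier`, the queried flows would be labeled and moved into the training set, and the model would be retrained with recomputed weights on each round.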
Source journal
Computer Networks (Engineering & Technology: Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months
Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.
Latest articles in this journal
From simulation to deep learning: Survey on network performance modeling approaches
Eco-efficient task scheduling for MLLMs in edge-cloud continuum
TraceX: Early-stage advanced persistent threat detection framework using semantic network traffic analysis
Beyond flat identification: Exploiting site-page structure for hierarchical webpage fingerprinting
RFD-R: AI-driven dynamic repacking framework for cloud-native O-RAN functions