Lightweight Machine Learning Classifiers of IoT Traffic Flows

R. Bikmukhamedov, A. Nadeev
DOI: 10.1109/SYNCHROINFO.2019.8814156 (https://doi.org/10.1109/SYNCHROINFO.2019.8814156)
Published in: 2019 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO)
Publication date: 2019-07-01
Citations: 12

Abstract

IoT traffic flows have statistics that differ from those of traditional devices, and their classification has become an important task because of the exponentially growing number of smart devices. Conventional Deep Packet Inspection systems rely on inspecting open fields in TLS and DNS packets, and the trend toward encrypting those fields makes machine-learning-based systems the only viable option for future networks. Moreover, the computational complexity of models becomes crucial for large-scale operations. In this work, we investigated whether simple models, such as Logistic Regression, SVM with a linear kernel, and a Decision Tree, achieve multiclass classification performance on IoT traces that is suitable for real-world deployments, given thoughtful feature engineering. We introduced a new categorical flow feature that describes the set of TCP flag fields within a flow. In addition, removing correlated features and transforming the feature space via PCA proved useful for reducing prediction complexity. To account for the online classification mode, we limited the maximum number of packets within a flow to 10. Moreover, to estimate the upper-bound performance with the given features, we compared the simple algorithms with Random Forest, Gradient Boosting, and a feed-forward neural network. We performed 4-fold cross-validation of the models using the Accuracy and F1-measure metrics. The test results demonstrated that the introduced feature increases the F1-measure for logistic regression from 99.1% in the base case to 99.6%, thus closely approaching more computationally expensive models. Overall, the evaluation results demonstrated the feasibility of a lightweight model for the IoT flow classification task, with performance suitable for practical deployment.
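
The abstract does not spell out how the categorical TCP-flag feature is encoded, so the following is only a minimal sketch of one plausible reading: take the union of the TCP flags observed in the first 10 packets of a flow and use the resulting flag set as a single category. The function names `flag_set_category` and `one_hot`, the `"|"`-joined label format, and the one-hot encoding step are assumptions made for illustration, not the authors' implementation; only the flag bit values (from the TCP header definition) and the 10-packet limit come from established sources.

```python
from typing import Iterable, List

# Standard TCP header flag bits (low byte of the flags field).
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
    "ECE": 0x40,
    "CWR": 0x80,
}

# The abstract limits a flow to at most 10 packets for online classification.
MAX_PACKETS = 10


def flag_set_category(packet_flags: Iterable[int],
                      max_packets: int = MAX_PACKETS) -> str:
    """Return one categorical label naming every flag seen in the flow prefix.

    Hypothetical encoding: bitwise-OR the flag bytes of the first
    `max_packets` packets, then list the names of the bits that are set.
    """
    seen = 0
    for index, flags in enumerate(packet_flags):
        if index >= max_packets:
            break  # ignore packets beyond the online-classification window
        seen |= flags
    names = [name for name, bit in TCP_FLAGS.items() if seen & bit]
    return "|".join(names) if names else "NONE"


def one_hot(category: str, vocabulary: List[str]) -> List[int]:
    """Encode a category as a 0/1 vector so linear models can consume it."""
    return [1 if category == known else 0 for known in vocabulary]


# Usage: SYN, SYN+ACK, ACK, PSH+ACK, PSH+ACK, ACK, FIN+ACK
flow = [0x02, 0x12, 0x10, 0x18, 0x18, 0x10, 0x11]
label = flag_set_category(flow)
vector = one_hot(label, ["SYN|ACK", "FIN|SYN|PSH|ACK", "RST|ACK"])
```

A set-valued category like this stays cheap at prediction time: it costs one bitwise OR per packet, and after one-hot encoding it adds only as many columns as there are distinct flag sets in the training data, which fits the lightweight-model goal stated in the abstract.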