基于原始流量数据包的物联网网络行为鲁棒轻量级建模

Aleksandar Pasquini;Rajesh Vasa;Irini Logothetis;Hassan Habibi Gharakheili;Alexander Chambers;Minh Tran
{"title":"基于原始流量数据包的物联网网络行为鲁棒轻量级建模","authors":"Aleksandar Pasquini;Rajesh Vasa;Irini Logothetis;Hassan Habibi Gharakheili;Alexander Chambers;Minh Tran","doi":"10.1109/TMLCN.2024.3517613","DOIUrl":null,"url":null,"abstract":"Machine Learning (ML)-based techniques are increasingly used for network management tasks, such as intrusion detection, application identification, or asset management. Recent studies show that neural network-based traffic analysis can achieve performance comparable to human feature-engineered ML pipelines. However, neural networks provide this performance at a higher computational cost and complexity, due to high-throughput traffic conditions necessitating specialized hardware for real-time operations. This paper presents lightweight models for encoding characteristics of Internet-of-Things (IoT) network packets; 1) we present two strategies to encode packets (regardless of their size, encryption, and protocol) to integer vectors: a shallow lightweight neural network and compression. With a public dataset containing about 8 million packets emitted by 22 IoT device types, we show the encoded packets can form complete (up to 80%) and homogeneous (up to 89%) clusters; 2) we demonstrate the efficacy of our generated encodings in the downstream classification task and quantify their computing costs. We train three multi-class models to predict the IoT class given network packets and show our models can achieve the same levels of accuracy (94%) as deep neural network embeddings but with computing costs up to 10 times lower; 3) we examine how the amount of packet data (headers and payload) can affect the prediction quality. We demonstrate how the choice of Internet Protocol (IP) payloads strikes a balance between prediction accuracy (99%) and cost. Along with the cost-efficacy of models, this capability can result in rapid and accurate predictions, meeting the requirements of network operators.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"98-116"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10802939","citationCount":"0","resultStr":"{\"title\":\"Robust and Lightweight Modeling of IoT Network Behaviors From Raw Traffic Packets\",\"authors\":\"Aleksandar Pasquini;Rajesh Vasa;Irini Logothetis;Hassan Habibi Gharakheili;Alexander Chambers;Minh Tran\",\"doi\":\"10.1109/TMLCN.2024.3517613\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine Learning (ML)-based techniques are increasingly used for network management tasks, such as intrusion detection, application identification, or asset management. Recent studies show that neural network-based traffic analysis can achieve performance comparable to human feature-engineered ML pipelines. However, neural networks provide this performance at a higher computational cost and complexity, due to high-throughput traffic conditions necessitating specialized hardware for real-time operations. This paper presents lightweight models for encoding characteristics of Internet-of-Things (IoT) network packets; 1) we present two strategies to encode packets (regardless of their size, encryption, and protocol) to integer vectors: a shallow lightweight neural network and compression. With a public dataset containing about 8 million packets emitted by 22 IoT device types, we show the encoded packets can form complete (up to 80%) and homogeneous (up to 89%) clusters; 2) we demonstrate the efficacy of our generated encodings in the downstream classification task and quantify their computing costs. We train three multi-class models to predict the IoT class given network packets and show our models can achieve the same levels of accuracy (94%) as deep neural network embeddings but with computing costs up to 10 times lower; 3) we examine how the amount of packet data (headers and payload) can affect the prediction quality. We demonstrate how the choice of Internet Protocol (IP) payloads strikes a balance between prediction accuracy (99%) and cost. Along with the cost-efficacy of models, this capability can result in rapid and accurate predictions, meeting the requirements of network operators.\",\"PeriodicalId\":100641,\"journal\":{\"name\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"volume\":\"3 \",\"pages\":\"98-116\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10802939\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10802939/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10802939/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

基于机器学习(ML)的技术越来越多地用于网络管理任务,如入侵检测、应用程序识别或资产管理。最近的研究表明,基于神经网络的流量分析可以达到与人类特征工程ML管道相当的性能。然而,神经网络以更高的计算成本和复杂性提供这种性能,因为高吞吐量的流量条件需要专门的硬件来进行实时操作。本文提出了物联网(IoT)网络数据包编码特征的轻量级模型;1)我们提出了两种将数据包(无论其大小,加密和协议如何)编码为整数向量的策略:浅轻量级神经网络和压缩。使用包含22种物联网设备类型发出的约800万个数据包的公共数据集,我们显示编码的数据包可以形成完整(高达80%)和均匀(高达89%)的集群;2)我们证明了我们生成的编码在下游分类任务中的有效性,并量化了它们的计算成本。我们训练了三个多类模型来预测给定网络数据包的物联网类,并表明我们的模型可以达到与深度神经网络嵌入相同的精度水平(94%),但计算成本降低了10倍;3)我们检查数据包数据(报头和有效载荷)的数量如何影响预测质量。我们演示了互联网协议(IP)有效载荷的选择如何在预测精度(99%)和成本之间取得平衡。随着模型的成本效益,这种能力可以导致快速和准确的预测,满足网络运营商的要求。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Robust and Lightweight Modeling of IoT Network Behaviors From Raw Traffic Packets
Machine Learning (ML)-based techniques are increasingly used for network management tasks, such as intrusion detection, application identification, or asset management. Recent studies show that neural network-based traffic analysis can achieve performance comparable to human feature-engineered ML pipelines. However, neural networks provide this performance at a higher computational cost and complexity, due to high-throughput traffic conditions necessitating specialized hardware for real-time operations. This paper presents lightweight models for encoding characteristics of Internet-of-Things (IoT) network packets; 1) we present two strategies to encode packets (regardless of their size, encryption, and protocol) to integer vectors: a shallow lightweight neural network and compression. With a public dataset containing about 8 million packets emitted by 22 IoT device types, we show the encoded packets can form complete (up to 80%) and homogeneous (up to 89%) clusters; 2) we demonstrate the efficacy of our generated encodings in the downstream classification task and quantify their computing costs. We train three multi-class models to predict the IoT class given network packets and show our models can achieve the same levels of accuracy (94%) as deep neural network embeddings but with computing costs up to 10 times lower; 3) we examine how the amount of packet data (headers and payload) can affect the prediction quality. We demonstrate how the choice of Internet Protocol (IP) payloads strikes a balance between prediction accuracy (99%) and cost. Along with the cost-efficacy of models, this capability can result in rapid and accurate predictions, meeting the requirements of network operators.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Asynchronous Real-Time Federated Learning for Anomaly Detection in Microservice Cloud Applications Conditional Denoising Diffusion Probabilistic Models for Data Reconstruction Enhancement in Wireless Communications Multi-Agent Reinforcement Learning With Action Masking for UAV-Enabled Mobile Communications Online Learning for Intelligent Thermal Management of Interference-Coupled and Passively Cooled Base Stations Robust and Lightweight Modeling of IoT Network Behaviors From Raw Traffic Packets
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1