Efficient IoT Traffic Inference: from Multi-View Classification to Progressive Monitoring

ACM Transactions on the Internet of Things. Pub Date: 2023-09-24. DOI: 10.1145/3625306

Arman Pashamokhtari, Gustavo Batista, Hassan Habibi Gharakheili


Citations: 0



Abstract

Machine learning-based techniques have proven to be effective in IoT network behavioral inference. Existing works developed data-driven models based on features from network packets and/or flows, but mainly in a static and ad-hoc manner, without adequately quantifying their gains versus costs. In this paper, we develop a generic architecture that comprises two distinct inference modules in tandem, which begins with IoT network behavior classification followed by continuous monitoring. In contrast to prior relevant works, our generic architecture flexibly accounts for various traffic features, modeling algorithms, and inference strategies. We argue quantitative metrics are required to systematically compare and efficiently select various traffic features for IoT traffic inference. This paper makes three contributions.

(1) For IoT behavior classification, we identify four metrics, namely cost, accuracy, availability, and frequency, that allow us to characterize and quantify the efficacy of seven sets of packet-based and flow-based traffic features, each resulting in a specialized model. By experimenting with traffic traces of 25 IoT devices collected from our testbed, we demonstrate that specialized-view models can be superior to a single combined-view model trained on a plurality of features by accuracy and cost. We also develop an optimization problem that selects the best set of specialized models for a multi-view classification.

(2) For monitoring the expected IoT behaviors, we develop a progressive system consisting of one-class clustering models (per IoT class) at three levels of granularity. We develop an outlier detection technique on top of the convex hull algorithm to form custom-shape boundaries for the one-class models. We show how progression helps with computing costs and the explainability of detecting anomalies.

(3) We evaluate the efficacy of our optimally-selected classifiers versus the superset of specialized classifiers by applying them to our IoT traffic traces. We demonstrate how the optimal set can reduce the processing cost by a factor of six with insignificant impacts on the classification accuracy. Also, we apply our monitoring models to a public IoT dataset of benign and attack traces and show they yield an average true positive rate of 94% and a false positive rate of 5%.

Finally, we publicly release our data (training and testing instances of classification and monitoring tasks) and code for convex hull-based one-class models.
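To make the idea of a convex hull-based one-class model concrete, the following is a minimal sketch and not the authors' released code. It assumes two-dimensional feature vectors, builds the hull of benign training points with Andrew's monotone chain, and flags any point outside the hull as an outlier. The feature values and the names `convex_hull` and `is_inlier` are hypothetical, and the outlier detection step that the abstract describes as shaping custom boundaries on top of the hull is omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def is_inlier(hull: List[Point], p: Point, eps: float = 1e-9) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        raise ValueError("need a non-degenerate hull with at least 3 vertices")
    return all(cross(hull[i], hull[(i + 1) % n], p) >= -eps for i in range(n))


# Hypothetical 2-D benign training instances for one IoT class (illustrative only).
benign = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.5), (1.0, 1.0)]
hull = convex_hull(benign)

inside = is_inlier(hull, (2.0, 2.0))   # within the benign boundary
outside = is_inlier(hull, (5.0, 1.0))  # beyond the boundary, treated as anomalous
```

In this sketch the hull of the raw training points is the decision boundary, so a single extreme training instance would stretch it. A real deployment would presumably trim such instances before building the hull, and would need a higher-dimensional hull routine for more than two features.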

