Machine Learning with Data Balancing Technique for IoT Attack and Anomalies Detection

Muhammad Asad Arshed, M. A. Jabbar, Farrukh Liaquat, Usman Mohy-ud-Din Chaudhary, Danial Karim, Hina Alam, Shahzad Mumtaz
DOI: 10.33411/ijist/2022040218 (https://doi.org/10.33411/ijist/2022040218)
Published: 2022-05-29, Vol 4 Issue 2 (Journal Article)
Source: Semantic Scholar
Citations: 1

Abstract

Detecting anomalies and attacks on IoT devices is a major concern in IoT infrastructure, and the number of attacks is growing as the technology advances. Attacks such as Data Type Probing, Denial of Service, Malicious Operation, Malicious Control, Spying, Scan, and Wrong Setup can cause an IoT-based system to fail. In this paper, we compare the performance of several machine learning models at predicting attacks and anomalies. The models are compared using accuracy as the evaluation metric, and a confusion matrix is reported for the final, most effective model. Most recent studies ran their experiments on an unbalanced dataset, on which a model is likely to be biased, so we ran the experiments in two forms: on unbalanced and on balanced data samples. On the unbalanced dataset, the highest accuracy was 98.0%, achieved by both the Generalized Linear Model and Random Forest. On the data balanced with the Random Under Sampling technique, the highest accuracy was 94.3%, achieved by the Generalized Linear Model. The confusion matrix in this study also supports the performance of the Generalized Linear Model.
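The balancing step and the two evaluation tools named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state their dataset or tooling, so the toy class counts, the helper names (`random_under_sample`, `accuracy`, `confusion_matrix`), and the majority-class baseline are all assumptions made for illustration. Only the class names are borrowed from the abstract. The sketch shows why accuracy alone can flatter a biased model on unbalanced data.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    # Size of the rarest class: every class is cut down to this many rows.
    target = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [row for row, _ in balanced], [label for _, label in balanced]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical, heavily unbalanced toy data (counts are invented).
classes = ["Normal", "Scan", "Spying"]
labels = ["Normal"] * 95 + ["Scan"] * 3 + ["Spying"] * 2
rows = [[i] for i in range(len(labels))]  # placeholder feature vectors

# A "model" that always predicts the majority class.
majority = Counter(labels).most_common(1)[0][0]
baseline_pred = [majority] * len(labels)
baseline_acc = accuracy(labels, baseline_pred)
baseline_cm = confusion_matrix(labels, baseline_pred, classes)

# After under-sampling, the same trivial model no longer looks good.
bal_rows, bal_labels = random_under_sample(rows, labels, seed=42)
balanced_counts = Counter(bal_labels)
balanced_baseline_acc = accuracy(bal_labels, [majority] * len(bal_labels))

print(baseline_acc, baseline_cm)
print(dict(balanced_counts), balanced_baseline_acc)
```

On the unbalanced toy data the trivial majority-class predictor scores high accuracy while its confusion matrix shows that it never detects an attack; on the under-sampled data the same predictor drops to chance level. That is consistent with the abstract's point that a lower accuracy on balanced data can be the more trustworthy figure.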