Enhancing IoT security: A comparative study of feature reduction techniques for intrusion detection system

Intelligent Systems with Applications, Pub Date: 2024-06-14, DOI: 10.1016/j.iswa.2024.200407

Jing Li, Hewan Chen, Mohd Othman Shahizan, Lizawati Mi Yusuf

Volume 23, Article 200407. Full text: https://www.sciencedirect.com/science/article/pii/S2667305324000814

Abstract

Internet of Things (IoT) devices are widely deployed but vulnerable to cyberattacks, which poses significant security challenges. To mitigate these threats, machine learning techniques have been applied to network intrusion detection in IoT environments. These techniques commonly apply feature reduction before data are fed to the models, improving detection efficiency so that real-time requirements can be met. This study provides a comprehensive comparison of feature selection (FS) and feature extraction (FE) techniques for network intrusion detection systems (NIDS) in IoT environments, using the TON-IoT and BoT-IoT datasets for both binary and multi-class classification tasks. We evaluated FS methods, including Pearson correlation and Chi-square, and FE methods, such as Principal Component Analysis (PCA) and Autoencoders (AE), across five classic machine learning models: Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), k-Nearest Neighbors (kNN), and Multi-Layer Perceptron (MLP). Our analysis revealed that FE techniques generally achieve higher accuracy and robustness than FS methods, with RF paired with AE delivering superior performance despite higher computational demands. DTs are most effective with smaller feature sets, while MLPs excel with larger ones. Chi-square is identified as the most efficient FS method, balancing performance and computational cost, whereas PCA outperforms AE in runtime efficiency. The study also shows that FE methods are more effective on complex datasets and less sensitive to feature set size, whereas FS methods improve significantly when given more informative features. Despite their higher computational cost, FE methods demonstrate a greater capability to detect diverse attack types, making them particularly suitable for complex IoT environments. These findings are relevant to both academic research and industry applications, providing insights into optimizing detection performance and computational efficiency in NIDS for IoT networks.
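To make the feature selection side of the comparison concrete, the following is a minimal sketch of a Pearson-correlation filter: each feature is scored by the absolute value of its correlation with the label, and only the top-k columns are kept. It is not the authors' code. The synthetic "flows", the helper names (`pearson`, `select_top_k`, `make_toy_flows`), and the choice of k are all assumptions made for illustration, and no TON-IoT or BoT-IoT data is used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Return the (sorted) indices of the k features with the largest |r| against the label."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


def make_toy_flows(n=400, seed=0):
    """Hypothetical synthetic data: features 0 and 2 depend on the label, 1 and 3 are noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)  # 0 = benign, 1 = attack (illustrative labels)
        rows.append([
            y * 3.0 + rng.gauss(0, 1),   # informative, positively correlated
            rng.gauss(0, 1),             # noise
            -y * 2.0 + rng.gauss(0, 1),  # informative, negatively correlated
            rng.uniform(0, 1),           # noise
        ])
        labels.append(y)
    return rows, labels


rows, labels = make_toy_flows()
selected = select_top_k(rows, labels, k=2)
# Keep only the selected columns; this reduced matrix is what a classifier would see.
reduced = [[row[j] for j in selected] for row in rows]
print("selected feature indices:", selected)
```

A filter like this keeps original columns unchanged, which is what distinguishes FS from FE methods such as PCA or autoencoders, where the model receives new features computed from all inputs. In practice a library implementation would normally be preferred over hand-rolled scoring.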
