Efficient Intrusion Detection System Data Preprocessing Using Deep Sparse Autoencoder with Differential Evolution

Impact Factor: 2.6; Zone 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS); IET Information Security; Pub Date: 2024-08-12; DOI: 10.1049/2024/9937803

Saranya N., Anandakumar Haldorai



Abstract

Rapid technological progress in the Internet and communications generates large volumes of data and expands the size of networks. These technologies can also give rise to novel network attacks that pose security risks. Such intrusions launch many attacks on the communication network, which therefore needs to be monitored. An intrusion detection system (IDS) is a tool that guards against intrusions by inspecting network traffic, helping to ensure network integrity, confidentiality, availability, and robustness. Many researchers have applied machine learning and deep learning approaches to IDS to detect intruders. Yet IDS still face challenges in detecting intruders accurately at a low false alarm rate, as well as in feature selection and detection. High-dimensional data reduce the effectiveness and efficiency of feature selection methods. Data are therefore preprocessed into a balanced, normalized, and transformed dataset before feature selection and classification. Efficient data preprocessing supports overall IDS performance, improving the detection rate (DR) and reducing the false alarm rate (FAR). Since datasets are required for the various feature dimensions, this article proposes an efficient data preprocessing method that applies a series of techniques before feature selection and classification: data balancing using SMOTE, data normalization with power transformation, data encoding using one-hot and ordinal encoding, and feature reduction using a proposed deep sparse autoencoder (DSAE) with differential evolution (DE). The efficiency of the transformation methods is evaluated with recursive Pearson correlation-based feature selection and graphical convolution neural network (G-CNN) methods.
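The data-balancing step named in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation, which the abstract does not describe. The function names `smote` and `balance`, the default `k=3`, the fixed seed, and the assumption of binary labels are all hypothetical choices made for illustration.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: brute-force neighbour search, numeric features assumed.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=3, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes binary labels (a simplification for this sketch).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k, seed)
    return X + new_points, y + [minority_label] * len(new_points)


if __name__ == "__main__":
    # Toy data: four "normal" records (0) and two "attack" records (1).
    X = [[0, 0], [1, 1], [0, 1], [1, 0], [5, 5], [6, 5]]
    y = [0, 0, 0, 0, 1, 1]
    X_bal, y_bal = balance(X, y, minority_label=1)
    print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In a pipeline like the one the abstract outlines, oversampling of this kind would normally be applied to the training split only, so that synthetic records do not leak into evaluation data.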

Source journal

IET Information Security (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore

3.80

Self-citation rate

7.10%

Articles published

Review time

8.6 months

About the journal: IET Information Security publishes original research papers in the following areas of information security and cryptography. Submitting authors should specify clearly in their covering statement the area into which their paper falls. Scope: Access Control and Database Security; Ad-Hoc Network Aspects; Anonymity and E-Voting; Authentication; Block Ciphers and Hash Functions; Blockchain, Bitcoin (Technical aspects only); Broadcast Encryption and Traitor Tracing; Combinatorial Aspects; Covert Channels and Information Flow; Critical Infrastructures; Cryptanalysis; Dependability; Digital Rights Management; Digital Signature Schemes; Digital Steganography; Economic Aspects of Information Security; Elliptic Curve Cryptography and Number Theory; Embedded Systems Aspects; Embedded Systems Security and Forensics; Financial Cryptography; Firewall Security; Formal Methods and Security Verification; Human Aspects; Information Warfare and Survivability; Intrusion Detection; Java and XML Security; Key Distribution; Key Management; Malware; Multi-Party Computation and Threshold Cryptography; Peer-to-peer Security; PKIs; Public-Key and Hybrid Encryption; Quantum Cryptography; Risks of using Computers; Robust Networks; Secret Sharing; Secure Electronic Commerce; Software Obfuscation; Stream Ciphers; Trust Models; Watermarking and Fingerprinting. Special Issues. Current Call for Papers: Security on Mobile and IoT devices - https://digital-library.theiet.org/files/IET_IFS_SMID_CFP.pdf