Efficacy of Heterogeneous Ensemble Assisted Machine Learning Model for Binary and Multi-Class Network Intrusion Detection
Toya Acharya, Ishan Khatri, A. Annamalai, M. Chouikha
2021 IEEE International Conference on Automatic Control & Intelligent Systems (I2CACIS), published 2021-06-26
DOI: 10.1109/I2CACIS52118.2021.9495864 (https://doi.org/10.1109/I2CACIS52118.2021.9495864)
Citations: 8
Abstract
The exponential rise in internet technologies and allied applications, which encompass a very large number of networked devices, has pushed academia and industry to seek more effective and robust security solutions. Digitization has undeniably driven change worldwide; however, security threats, breaches, and the resulting losses indicate the need for a robust cybersecurity solution. Unlike classical intrusion detection systems (IDS), network IDS (NIDS) have become increasingly challenging to build because attack patterns and anomalous behavior change continuously. As a solution, data-driven machine learning methods have performed better by learning from network traffic information and detecting anomalies; however, their generalization over a network with both known and unknown patterns remains questionable. Moreover, most classical approaches fail to address the key issues of class imbalance, significance-level-centric feature selection, normalization, and over-fitting, which leads to inconsistent performance across machine learning models. In this paper, a novel and robust heterogeneous ensemble machine learning model is developed to detect anomalies in NIDS. The proposed model first applies sub-sampling to alleviate the class-imbalance problem of NIDS datasets. It then normalizes the input data with the Min-Max algorithm, mapping it to the range 0 to 1 and thereby alleviating overfitting and convergence problems. Feature reduction then retains the most suitable features without imposing the computational overhead that is common in meta-heuristic-based approaches. Finally, the proposed NIDS solution uses a heterogeneous ensemble learning model with J48, k-NN, SVM, Bagging, AdaBoost, and RF algorithms as base classifiers to perform two-class as well as multi-class classification over the feature-selected NSL-KDD, KDD99, and UNSW-NB-15 datasets. Performance assessment in terms of true-positive rate, false-positive rate, and AUC revealed that the proposed NIDS model performed better than the standalone classifiers and was superior to other existing anomaly detection methods.
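The Min-Max normalization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard per-feature formula x' = (x - min) / (max - min); the function name `min_max_scale`, the toy data, and the handling of constant columns are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Rescale every feature column to [0, 1] via (x - min) / (max - min)."""
    if not rows:
        return []
    # Transpose so that each tuple holds all values of one feature.
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]

    scaled = []
    for row in rows:
        scaled_row = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # Assumption: a constant column is mapped to 0.0 to avoid
            # dividing by zero; the paper does not state how it treats this case.
            scaled_row.append((value - low) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


if __name__ == "__main__":
    # Hypothetical toy records with two numeric features on different scales.
    toy = [[0.0, 200.0], [5.0, 400.0], [10.0, 300.0]]
    print(min_max_scale(toy))
```

In practice the minimum and maximum would be computed on the training split only and reused for the test split, so that no information from the test data leaks into the scaling.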
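For the two-class case, the true-positive rate and false-positive rate named in the abstract follow from the confusion-matrix counts: TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below is one hedged way to compute them; the function name `tpr_fpr`, the label encoding (1 = attack, 0 = normal), and the example labels are assumptions for illustration and are not results from the paper.

```python
from typing import Sequence, Tuple


def tpr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true-positive rate, false-positive rate) for binary labels.

    Assumed encoding: 1 = attack (positive class), 0 = normal traffic.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Report 0.0 when a class is absent instead of dividing by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


if __name__ == "__main__":
    # Hypothetical labels: four attack records and four normal records.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 0, 1, 0, 1, 0, 0, 1]
    print(tpr_fpr(truth, preds))  # (0.75, 0.25)
```

A detector is preferable when it raises the TPR while keeping the FPR low; AUC summarizes that trade-off across all decision thresholds.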