Toya Acharya, Ishan Khatri, A. Annamalai, M. Chouikha
Efficacy of Machine Learning-Based Classifiers for Binary and Multi-Class Network Intrusion Detection
2021 IEEE International Conference on Automatic Control & Intelligent Systems (I2CACIS), published 2021-06-26
DOI: 10.1109/I2CACIS52118.2021.9495877 (https://doi.org/10.1109/I2CACIS52118.2021.9495877)
Citations: 3
Abstract
Internet-based services have undoubtedly led a worldwide revolution with exponential growth, but security breaches result in losses of personal digital assets, which creates the need for a comprehensive cybersecurity solution. Traditionally, signature-based network intrusion detection is employed to capture the attributes of normal and abnormal traffic in a network, but it fails to detect zero-day attacks. Among the known network intrusion detection system (NIDS) methods, the machine learning-based approach is attractive for overcoming this shortcoming because it can efficiently analyze large volumes of network traffic data and detect zero-day attacks. Imbalanced NIDS datasets do not yield good performance in practical implementation scenarios. Merging several target classes into a new target class creates a more balanced NIDS dataset and improves classifier performance. In this paper, we present the efficacy of several machine learning algorithms, including Random Forest (RF), J48, Naïve Bayes, Bayesian Network, Bagging, AdaBoost, and Support Vector Machine (SVM), applied in WEKA to network traffic logs (KDD99, UNSW-NB15, and CIC-IDS2017). We examine the impact of changing the number of output classes of these publicly available network intrusion datasets on sensitivity (True Positive Rate, TPR), False Positive Rate (FPR), area under the ROC curve (AUC), and the percentage of incorrectly classified instances. Interestingly, the performance of these classifiers increased when features strongly correlated with the target classes were added. The experimental results reveal that classifier performance improved when the number of target classes decreased. Adding a feature that is highly correlated with the output class also increased classifier performance.
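The idea of reducing the number of target classes and then scoring a classifier by TPR and FPR can be illustrated with a minimal Python sketch. This is not the authors' WEKA workflow: the function names, the `"normal"` and `"attack"` labels, and the toy label lists are all hypothetical, chosen only to show how a multi-class labelling collapses to a binary one and how the two rates are computed from it.

```python
def collapse_to_binary(labels, normal_label="normal"):
    """Map every label other than `normal_label` to a single 'attack' class."""
    return [normal_label if y == normal_label else "attack" for y in labels]


def tpr_fpr(y_true, y_pred, positive="attack"):
    """Return (TPR, FPR), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical toy labels (not taken from KDD99, UNSW-NB15, or CIC-IDS2017).
y_true_multi = ["normal", "dos", "probe", "normal", "r2l", "dos"]
y_pred_multi = ["normal", "probe", "probe", "dos", "normal", "dos"]

# Collapse both ground truth and predictions to normal vs. attack.
y_true = collapse_to_binary(y_true_multi)
y_pred = collapse_to_binary(y_pred_multi)

tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```

In this toy data, a prediction of one attack type for a different attack type counts as an error in the multi-class setting but as a correct detection after collapsing, which is one simple reason error rates can fall when the number of target classes is reduced.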