Enhancing Anomaly-Based Intrusion Detection Systems: A Hybrid Approach Integrating Feature Selection and Bayesian Hyperparameter Optimization

Q3 Computer Science | Ingenierie des Systemes d'Information | Pub Date: 2023-10-31 | DOI: 10.18280/isi.280506

Naoual Berbiche, Jamila El Alami



Abstract

As the cybersecurity landscape evolves, safeguarding IT infrastructures has become imperative to counter the escalation of cyber-attacks. Anomaly-based Intrusion Detection Systems (IDS) play a pivotal role in identifying aberrant behaviours that elude conventional detection mechanisms. These systems nonetheless suffer from elevated false alarm rates and reduced efficacy in detecting sophisticated attacks. To address these shortcomings, a hybrid approach based on Machine Learning (ML) techniques was employed to improve the performance of anomaly-based IDS in terms of detection accuracy, False Positive (FP) rate, and detection time. The approach comprised a two-stage optimization strategy: feature selection based on feature importance derived from the XGBoost classifier, followed by Bayesian optimization (BO) for hyperparameter tuning. The optimization was conducted with respect to two objective functions, the ROC-AUC score and the Average Precision score, each used to identify the hyperparameters that maximize it. Classifiers, including Extreme Gradient Boosting (XGBoost), Random Forest (RF), and Stochastic Gradient Descent (SGD), were trained both with the hyperparameters obtained from BO and with the default hyperparameters, the latter serving as reference models. Evaluation across multiple metrics showed that the optimized models outperformed their reference counterparts, with the optimized XGBoost models performing best. This approach offers a promising avenue for improving detection precision and reducing false alarms, thereby fortifying the security of computer
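
To make the two optimization stages more concrete, the following is a minimal sketch in plain Python of the quantities the abstract names: an importance-based feature filter and the two objective scores, ROC-AUC and Average Precision. It is an illustration under stated assumptions, not the authors' implementation. The function names, the feature names, the importance values, and the `threshold` cut-off are all hypothetical; the abstract does not say how the selection cut-off was chosen, and the Average Precision shown here is the simple rank-based form, which assumes no tied scores.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose importance is at least `threshold`, most important first.

    `threshold` is a hypothetical cut-off; the source does not specify one.
    """
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, importance in ranked if importance >= threshold]


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count pairwise wins of positives over negatives; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average Precision as the mean precision at the rank of each positive.

    Assumes no tied scores; tie handling differs between implementations.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` items
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return precision_sum / hits


# Toy data (hypothetical): 1 = attack, 0 = normal traffic.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted))            # 0.75
print(average_precision(labels, predicted))  # 0.8333...
print(select_features({"duration": 0.5, "flag": 0.01, "src_bytes": 0.2}, 0.1))
```

In a Bayesian optimization loop, either score would serve as the value returned to the optimizer for a candidate hyperparameter set, which is why tuning for ROC-AUC and tuning for Average Precision can yield different optimized models for the same classifier.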
