Evaluation of Machine Learning Model for Network Anomaly Detection: Support Vector Machine

Journal of Engineering Research and Reports. Pub Date: 2024-08-10. DOI: 10.9734/jerr/2024/v26i81248

Andikan E. Okoro, E. Ubom, Ubong Ukommi



Abstract

Effective network anomaly detection plays a pivotal role in safeguarding digital assets against evolving cyber threats. In this study, the NSL-KDD dataset was used to investigate anomaly detection using Support Vector Machines (SVM) with four kernels: linear, polynomial, radial basis function (RBF), and sigmoid. The linear kernel SVM achieved an accuracy of 99.47% and an F-score of 99.47%. Despite its strong overall performance, indicated by a weighted average F-score of 0.99, the macro average F-score of 0.79 pointed to uneven performance across classes. Several classes, such as 0, 11, 12, 13, and 20, achieved perfect precision and recall, while classes 1, 7, 8, 16, and 19 had zero recall and zero F-scores. The polynomial kernel SVM achieved an accuracy of 99.55% and an F-score of 99.53%. It also showed high precision and recall for many classes, with a weighted average F-score of 1.00. However, its macro average F-score of 0.72 indicated notable variation, with poor performance in classes 1, 7, 8, 16, 19, and 22. The RBF kernel SVM also recorded an accuracy of 99.55% and an F-score of 99.53%, with macro and weighted average F-scores of 0.48 and 0.92, respectively. While several classes achieved perfect scores, significant performance drops were observed in classes 1, 7, 8, 16, 19, and 22. The sigmoid kernel SVM was less effective overall, with an accuracy of 92.11% and an F-score of 91.80%. Its macro and weighted average F-scores were 0.79 and 0.99, respectively, and its results were considerably inconsistent: some classes achieved high precision and recall, while classes 1, 8, 12, 13, 16, 19, and 22 performed poorly. While the linear and polynomial kernels showed strong overall performance, the RBF and sigmoid kernels exhibited greater variability across classes, with the sigmoid kernel being the least effective for anomaly detection on this dataset.
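The gap the abstract reports between weighted and macro average F-scores is typical of imbalanced multi-class data: the weighted average is dominated by large classes, while the macro average counts every class equally, so a few rare classes with zero recall pull it down sharply. The following is a minimal sketch of that effect in plain Python. The toy labels and the helper names (`per_class_f1`, `macro_f1`, `weighted_f1`) are assumptions made for illustration; they are not the authors' code or NSL-KDD data, and the sketch only scores labels that appear in the ground truth.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label present in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (f1, tp + fn)  # support = number of true samples
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of per-class F1 weighted by class support."""
    total = sum(support for _, support in scores.values())
    return sum(f1 * support for f1, support in scores.values()) / total


# Hypothetical toy data (NOT from NSL-KDD): two large classes and one rare
# class whose five samples are all misclassified as class 1.
y_true = [0] * 500 + [1] * 495 + [2] * 5
y_pred = [0] * 500 + [1] * 495 + [1] * 5

scores = per_class_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"accuracy    = {accuracy:.3f}")
print(f"weighted F1 = {weighted_f1(scores):.3f}")
print(f"macro F1    = {macro_f1(scores):.3f}")
```

In this toy case the accuracy and weighted F-score both stay above 0.99 while the macro F-score falls to roughly 0.665, because the rare class contributes an F-score of zero. This is the same qualitative pattern the abstract describes for classes with zero recall.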

