A Novel Deep Clustering Variational Auto-Encoder for Anomaly-based Network Intrusion Detection

Van Quan Nguyen, V. H. Nguyen, T. Hoang, Nathan Shone
Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE), 2022-10-19
DOI: 10.1109/KSE56063.2022.9953763 (https://doi.org/10.1109/KSE56063.2022.9953763)
Citation count: 4

Abstract

Semi-supervised network intrusion detection systems are becoming increasingly important in the ever-changing digital landscape. Despite the boom in commercial and research interest, many concerns over accuracy have yet to be addressed. Two major limitations contribute to these concerns: reliably learning the underlying probability distribution of normal network data, and identifying the boundary between the normal and anomalous data regions in the latent space. Recent research has proposed many ways to learn the latent representation of normal data in a semi-supervised manner, such as the Clustering-based Autoencoder (CAE) and hybrid approaches that combine Principal Component Analysis (PCA) with CAE. However, such approaches are still affected by these limitations, predominantly because of an overreliance on feature engineering or an inability to handle high-dimensional data. In this paper, we propose a novel Cluster Variational Autoencoder (CVAE) deep learning model to overcome these limitations and increase the efficiency of network intrusion detection. The model enables a more concise and dominant representation of the latent space to be learnt, and the probability distribution learning capabilities of the VAE are fully exploited to learn the underlying probability distribution of the normal network data. This combination enables us to address the limitations discussed. The performance of the proposed model is evaluated using eight benchmark network intrusion datasets: NSL-KDD, UNSW-NB15, CICIDS2017 and five scenarios from CTU13 (CTU13-08, CTU13-09, CTU13-10, CTU13-12 and CTU13-13). The experimental results demonstrate that the proposed method outperforms semi-supervised approaches from existing works.
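The abstract does not state the CVAE loss or its decision rule, so the following is only a minimal sketch of the general idea it describes: a VAE-style latent space learnt from normal traffic, with a boundary separating normal from anomalous regions. The closed-form KL term and the reparameterization step are standard VAE components. The single cluster center, the squared-distance score, the 95% quantile threshold, and the synthetic encoder outputs are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)


def reparameterize(mu, log_var, rng):
    """Standard VAE sampling step: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def anomaly_scores(z, center):
    """Assumed scoring rule: squared Euclidean distance to one latent cluster center."""
    return np.sum((z - center) ** 2, axis=1)


def fit_threshold(normal_scores, quantile=0.95):
    """Assumed boundary: a quantile of the scores observed on normal data only."""
    return float(np.quantile(normal_scores, quantile))


rng = np.random.default_rng(0)

# Stand-in for encoder outputs on normal training traffic (no real encoder here).
mu_normal = rng.normal(0.0, 0.1, size=(200, 2))
log_var_normal = np.full((200, 2), -4.0)
z_normal = reparameterize(mu_normal, log_var_normal, rng)

# Semi-supervised setup: the center and threshold come from normal data alone.
center = z_normal.mean(axis=0)
threshold = fit_threshold(anomaly_scores(z_normal, center))

# A hypothetical latent point far from the normal cluster.
z_suspect = np.array([[3.0, 3.0]])
is_anomaly = anomaly_scores(z_suspect, center) > threshold
```

In a real model the encoder would be a trained neural network and the score might also include reconstruction error; the sketch only shows how a boundary can be derived from normal data without any attack labels.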