Low-sample classification in NIDS using the EC-GAN method

Marko Zekan, Igor Tomičić, M. Schatten
Journal: J. Univers. Comput. Sci., pp. 1330-1346
Published: 2022-12-28
DOI: 10.3897/jucs.85703 (https://doi.org/10.3897/jucs.85703)

Abstract

Numerous advanced methods have been applied over the years in Network Intrusion Detection Systems (NIDS). Among these are various Deep Learning models, which have shown great success in attack classification. Nevertheless, the false positive rate and detection rate of these systems remain a concern. This is mostly because of the low-sample, imbalanced nature of realistic datasets, which makes models challenging to train.

Considering this, we applied a novel semi-supervised EC-GAN method to network flow classification on the CIC-IDS-2017 dataset. EC-GAN uses synthetic data to aid the training of a supervised classifier on low-sample data. To achieve this, we modified the original EC-GAN to work with tabular data. In our approach, WCGAN-GP is used for synthetic tabular data generation, while a simple deep neural network is used for classification. The conditional nature of WCGAN-GP diminishes the class imbalance problem, while the GAN itself solves the low-sample problem. This approach was successful in generating believable synthetic data, which was consequently used for training and testing the EC-GAN.

To obtain our results, we trained a classifier on progressively smaller versions of the CIC-IDS-2017 dataset, first via the novel EC-GAN method and then in the conventional way, without the help of synthetic data. We then compared these two sets of results with another author's results using accuracy, false positive rate, detection rate and macro F1 score as metrics. Our results showed that a supervised classifier trained with EC-GAN can achieve significant results even when trained on as little as 25% of the original imbalanced dataset.
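The four metrics named in the abstract can be illustrated with a small, dependency-free sketch. The abstract does not say how the per-class values are aggregated, so the one-vs-rest counting and the unweighted (macro) averaging of detection rate and false positive rate below are assumptions, and the function names `per_class_counts` and `evaluate` as well as the toy labels are hypothetical. Detection rate is taken here as recall, TP / (TP + FN), and false positive rate as FP / (FP + TN).

```python
from collections import Counter


def per_class_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn, tn)} using one-vs-rest counting."""
    pairs = Counter(zip(y_true, y_pred))
    n = len(y_true)
    counts = {}
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(v for (t, p), v in pairs.items() if p == c and t != c)
        fn = sum(v for (t, p), v in pairs.items() if t == c and p != c)
        counts[c] = (tp, fp, fn, n - tp - fp - fn)
    return counts


def _ratio(num, den):
    # Guard against empty denominators (e.g. a class that is never predicted).
    return num / den if den else 0.0


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged detection rate, FPR and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = per_class_counts(y_true, y_pred, labels)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    dr, fpr, f1 = [], [], []
    for tp, fp, fn, tn in counts.values():
        recall = _ratio(tp, tp + fn)        # detection rate of this class
        precision = _ratio(tp, tp + fp)
        dr.append(recall)
        fpr.append(_ratio(fp, fp + tn))
        f1.append(_ratio(2 * precision * recall, precision + recall))
    k = len(labels)
    return {
        "accuracy": _ratio(correct, len(y_true)),
        "detection_rate": sum(dr) / k,
        "false_positive_rate": sum(fpr) / k,
        "macro_f1": sum(f1) / k,
    }


# Toy, imbalanced example with made-up labels (not data from the paper).
y_true = ["benign"] * 6 + ["dos"] * 3 + ["scan"]
y_pred = ["benign"] * 5 + ["dos"] + ["dos", "dos", "benign"] + ["scan"]
metrics = evaluate(y_true, y_pred)
```

Macro averaging gives every class the same weight regardless of its size, which is why macro F1 is commonly reported on imbalanced data: a rare attack class that is classified poorly lowers the score as much as a poorly classified majority class would.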