{"title":"使用EC-GAN方法的NIDS低样本分类","authors":"Marko Zekan, Igor Tomičić, M. Schatten","doi":"10.3897/jucs.85703","DOIUrl":null,"url":null,"abstract":"Numerous advanced methods have been applied throughout the years for the use in Network Intrusion Detection Systems (NIDS). Among these are various Deep Learning models, which have shown great success for attack classification. Nevertheless, false positive rate and detection rate of these systems remains a concern. This is mostly because of the low-sample, imbalanced nature of realistic datasets, which make models challenging to train.\n Considering this, we applied a novel semi-supervised EC-GAN method for network flow classifi- cation of CIC-IDS-2017 dataset. EC-GAN uses synthetic data to aid the training of a supervised classifier on low-sample data. To achieve this, we modified the original EC-GAN to work with tabular data. In our approach, WCGAN-GP is used for synthetic tabular data generation, while a simple deep neural network is used for classification. The conditional nature of WCGAN-GP diminishes the class imbalance problem, while GAN itself solves the low-sample problem. This approach was successful in generating believable synthetic data, which was consequently used for training and testing the EC-GAN.\n To obtain our results, we trained a classifier on progressively smaller versions of the CIC-DIS-2017 dataset, first via a novel EC-GAN method and then in the conventional way, without the help of synthetic data. We then compared these two sets of results with another author’s results using accuracy, false positive rate, detection rate and macro F1 score as metrics. Our results showed that supervised classifier trained with EC-GAN can achieve significant results even when trained on as little as 25% of the original imbalanced dataset.","PeriodicalId":14652,"journal":{"name":"J. Univers. Comput. Sci.","volume":"2 1","pages":"1330-1346"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Low-sample classification in NIDS using the EC-GAN method\",\"authors\":\"Marko Zekan, Igor Tomičić, M. Schatten\",\"doi\":\"10.3897/jucs.85703\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Numerous advanced methods have been applied throughout the years for the use in Network Intrusion Detection Systems (NIDS). Among these are various Deep Learning models, which have shown great success for attack classification. Nevertheless, false positive rate and detection rate of these systems remains a concern. This is mostly because of the low-sample, imbalanced nature of realistic datasets, which make models challenging to train.\\n Considering this, we applied a novel semi-supervised EC-GAN method for network flow classifi- cation of CIC-IDS-2017 dataset. EC-GAN uses synthetic data to aid the training of a supervised classifier on low-sample data. To achieve this, we modified the original EC-GAN to work with tabular data. In our approach, WCGAN-GP is used for synthetic tabular data generation, while a simple deep neural network is used for classification. The conditional nature of WCGAN-GP diminishes the class imbalance problem, while GAN itself solves the low-sample problem. This approach was successful in generating believable synthetic data, which was consequently used for training and testing the EC-GAN.\\n To obtain our results, we trained a classifier on progressively smaller versions of the CIC-DIS-2017 dataset, first via a novel EC-GAN method and then in the conventional way, without the help of synthetic data. We then compared these two sets of results with another author’s results using accuracy, false positive rate, detection rate and macro F1 score as metrics. Our results showed that supervised classifier trained with EC-GAN can achieve significant results even when trained on as little as 25% of the original imbalanced dataset.\",\"PeriodicalId\":14652,\"journal\":{\"name\":\"J. Univers. Comput. Sci.\",\"volume\":\"2 1\",\"pages\":\"1330-1346\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"J. Univers. Comput. Sci.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3897/jucs.85703\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"J. Univers. Comput. Sci.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3897/jucs.85703","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Low-sample classification in NIDS using the EC-GAN method
Numerous advanced methods have been applied throughout the years for the use in Network Intrusion Detection Systems (NIDS). Among these are various Deep Learning models, which have shown great success for attack classification. Nevertheless, false positive rate and detection rate of these systems remains a concern. This is mostly because of the low-sample, imbalanced nature of realistic datasets, which make models challenging to train.
Considering this, we applied a novel semi-supervised EC-GAN method for network flow classifi- cation of CIC-IDS-2017 dataset. EC-GAN uses synthetic data to aid the training of a supervised classifier on low-sample data. To achieve this, we modified the original EC-GAN to work with tabular data. In our approach, WCGAN-GP is used for synthetic tabular data generation, while a simple deep neural network is used for classification. The conditional nature of WCGAN-GP diminishes the class imbalance problem, while GAN itself solves the low-sample problem. This approach was successful in generating believable synthetic data, which was consequently used for training and testing the EC-GAN.
To obtain our results, we trained a classifier on progressively smaller versions of the CIC-DIS-2017 dataset, first via a novel EC-GAN method and then in the conventional way, without the help of synthetic data. We then compared these two sets of results with another author’s results using accuracy, false positive rate, detection rate and macro F1 score as metrics. Our results showed that supervised classifier trained with EC-GAN can achieve significant results even when trained on as little as 25% of the original imbalanced dataset.