Author: Jean-Pierre Nziga
Venue: 2011 Sixth International Conference on Digital Information Management
Published: 2011-12-01
DOI: 10.1109/ICDIM.2011.6093368 (https://doi.org/10.1109/ICDIM.2011.6093368)
Citations: 12
Minimal dataset for Network Intrusion Detection Systems via dimensionality reduction
Network Intrusion Detection Systems (NIDS) monitor internet traffic to detect malicious activities, including but not limited to denial-of-service attacks, network access by unauthorized users, attempts to gain additional privileges, and port scans. The volume of data that a NIDS must analyze is too large to process efficiently. Prior studies developed feature selection and feature extraction techniques to reduce the size of the data, but none has focused on determining exactly how far the dataset should be reduced. Dimensionality reduction is a field of machine learning that consists of mapping high-dimensional data into a lower-dimensional space while preserving the important features of the original dataset. Dimensionality reduction techniques have been used to reduce the amount of data in applications such as speech signals, digital photographs, fMRI scans, DNA microarrays, and hyperspectral data. The purpose of this paper is to find the minimum amount of data required for successful intrusion detection. This evaluation is necessary to improve the efficiency of NIDS in identifying existing attack patterns and recognizing new intrusions in real time. Two dimensionality reduction techniques are used: one linear (Principal Component Analysis) and one non-linear (Multidimensional Scaling). The reduced data is then submitted to two classification algorithms, J48 (C4.5) and Naïve Bayes. The study was conducted using the KDD Cup 99 data. Experimental results show optimal performance with reduced datasets of 4 dimensions for J48 and 12 dimensions for Naïve Bayes.
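The experiment described in the abstract (reduce the data to k dimensions, classify, and compare accuracy across values of k) can be illustrated with a minimal sketch. This is not the paper's implementation, and several things here are assumptions: the data is synthetic (41 columns, chosen to match the feature count of KDD Cup 99), only PCA and a Gaussian Naïve Bayes classifier are shown (MDS and J48 are omitted), and all function names (`make_synthetic`, `pca_project`, `gaussian_nb_fit`, `gaussian_nb_predict`, `sweep_dimensions`) are hypothetical. The accuracies it prints say nothing about the paper's reported results.

```python
import numpy as np


def make_synthetic(n_per_class=300, n_features=41, seed=0):
    """Build a toy two-class dataset (0 = normal, 1 = attack). NOT KDD Cup 99."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(size=(2 * n_per_class, n_features))
    X[:, :3] *= 3.0        # three high-variance features
    X[y == 1, :3] += 6.0   # "attack" rows are shifted along those features
    order = rng.permutation(len(y))
    X, y = X[order], y[order]
    split = int(0.7 * len(y))  # 70% train, 30% test
    return X[:split], y[:split], X[split:], y[split:]


def pca_project(X_train, X_test, k):
    """Project both sets onto the top-k principal components of X_train."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    W = Vt[:k].T
    return (X_train - mean) @ W, (X_test - mean) @ W


def gaussian_nb_fit(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances


def gaussian_nb_predict(model, X):
    """Pick the class with the highest log-posterior for each row of X."""
    classes, priors, means, variances = model
    # Broadcast to shape (n_samples, n_classes, n_features).
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.log(2 * np.pi * variances)[None, :, :]
    log_lik = -0.5 * (log_norm + diff**2 / variances[None, :, :]).sum(axis=2)
    return classes[np.argmax(log_lik + np.log(priors)[None, :], axis=1)]


def sweep_dimensions(X_train, y_train, X_test, y_test, dims):
    """Return {k: test accuracy} for each target dimensionality k."""
    results = {}
    for k in dims:
        Z_train, Z_test = pca_project(X_train, X_test, k)
        model = gaussian_nb_fit(Z_train, y_train)
        predictions = gaussian_nb_predict(model, Z_test)
        results[k] = float(np.mean(predictions == y_test))
    return results


if __name__ == "__main__":
    X_train, y_train, X_test, y_test = make_synthetic()
    scores = sweep_dimensions(X_train, y_train, X_test, y_test, [1, 2, 4, 8, 12, 41])
    for k, accuracy in scores.items():
        print(f"k={k:2d}  accuracy={accuracy:.3f}")
```

In this sketch the PCA projection is fitted on the training split only and then applied to the test split, so that the reported accuracy is not inflated by information from the test data. Whether the paper followed the same protocol is not stated in the abstract.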