{"title":"Constructing important features from massive network traffic for lightweight intrusion detection","authors":"Wei Wang, Yongzhong He, Jiqiang Liu, Sylvain Gombault","doi":"10.1049/iet-ifs.2014.0353","DOIUrl":null,"url":null,"abstract":"Efficiently processing massive data is a big issue in high-speed network intrusion detection, as network traffic has become increasingly large and complex. In this work, instead of constructing a large number of features from massive network traffic, the authors aim to select the most important features and use them to detect intrusions in a fast and effective manner. The authors first employed several techniques, that is, information gain (IG), wrapper with Bayesian networks (BN) and Decision trees (C4.5), to select important subsets of features for network intrusion detection based on KDD'99 data. The authors then validate the feature selection schemes in a real network test bed to detect distributed denial-of-service attacks. The feature selection schemes are extensively evaluated based on the two data sets. The empirical results demonstrate that with only the most important 10 features selected from all the original 41 features, the attack detection accuracy almost remains the same or even becomes better based on both BN and C4.5 classifiers. Constructing fewer features can also improve the efficiency of network intrusion detection.","PeriodicalId":13305,"journal":{"name":"IET Inf. Secur.","volume":"61 1","pages":"374-379"},"PeriodicalIF":0.0000,"publicationDate":"2015-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Inf. Secur.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1049/iet-ifs.2014.0353","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 46
Abstract
Efficiently processing massive data is a big issue in high-speed network intrusion detection, as network traffic has become increasingly large and complex. In this work, instead of constructing a large number of features from massive network traffic, the authors aim to select the most important features and use them to detect intrusions in a fast and effective manner. The authors first employed several techniques, that is, information gain (IG), wrapper with Bayesian networks (BN) and Decision trees (C4.5), to select important subsets of features for network intrusion detection based on KDD'99 data. The authors then validate the feature selection schemes in a real network test bed to detect distributed denial-of-service attacks. The feature selection schemes are extensively evaluated based on the two data sets. The empirical results demonstrate that with only the most important 10 features selected from all the original 41 features, the attack detection accuracy almost remains the same or even becomes better based on both BN and C4.5 classifiers. Constructing fewer features can also improve the efficiency of network intrusion detection.