Z. Cataltepe, Ümit Ekmekçi, T. Cataltepe, Ismail Kelebek
Online feature selected semi-supervised decision trees for network intrusion detection
NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, 2016-04-25
DOI: 10.1109/NOMS.2016.7502965 (https://doi.org/10.1109/NOMS.2016.7502965)
Citations: 8
Abstract
Network intrusion detection systems need to detect abnormal behaviour in network data as quickly as possible and with as little user intervention as possible. In this paper, we describe a semi-supervised network anomaly detection system. Our system uses online clustering to summarize the available network data. Clusters are represented using extended cluster features, which comprise not only features derived from the original data features but also features that describe the relationships between clusters. The user labels each cluster as anomalous or normal, and a decision tree is trained on these labeled clusters. Incoming data is then labeled according to the output of the decision tree. We show that this system performs much better than an unsupervised anomaly detection system. We also show that applying online feature selection to the cluster features reduces decision tree complexity without reducing accuracy.
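The pipeline described in the abstract (online clustering, extended cluster features, user labels per cluster, decision tree) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the single-pass threshold ("leader") clustering, the particular features (centroid, radius, share of data, gap to the nearest cluster), the tiny Gini-based tree, the way new data is labeled (cluster the new batch, then classify each new cluster), and the toy data. The online feature selection step is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    """Running summary of absorbed points: count, linear sum, squared sum."""
    n: int
    ls: list
    ss: list

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root of the summed per-dimension variance; clamp rounding negatives.
        var = sum(q / self.n - (s / self.n) ** 2 for s, q in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))

    def add(self, x):
        self.n += 1
        self.ls = [s + v for s, v in zip(self.ls, x)]
        self.ss = [q + v * v for q, v in zip(self.ss, x)]


def online_cluster(stream, threshold):
    """Single pass: join the nearest cluster if close enough, else open a new one."""
    clusters = []
    for x in stream:
        best = min(clusters, key=lambda c: math.dist(c.centroid(), x), default=None)
        if best is not None and math.dist(best.centroid(), x) <= threshold:
            best.add(x)
        else:
            clusters.append(Cluster(1, list(x), [v * v for v in x]))
    return clusters


def extended_features(clusters):
    """Per-cluster features plus relational ones (data share, gap to nearest cluster)."""
    total = sum(c.n for c in clusters)
    rows = []
    for c in clusters:
        gaps = [math.dist(c.centroid(), o.centroid()) for o in clusters if o is not c]
        rows.append(c.centroid() + [c.radius(), c.n / total, min(gaps, default=0.0)])
    return rows


def gini(labels):
    return 1.0 - sum((labels.count(v) / len(labels)) ** 2 for v in set(labels))


def build_tree(rows, labels, depth=0, max_depth=3):
    """Returns a label (leaf) or a tuple (feature, threshold, left, right)."""
    majority = max(sorted(set(labels)), key=labels.count)
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [labels[i] for i in li], depth + 1, max_depth),
            build_tree([rows[i] for i in ri], [labels[i] for i in ri], depth + 1, max_depth))


def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Toy 2-D "traffic" stream (hypothetical data): two dense groups and one outlier.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
         (3.0, 0.0), (3.1, 0.0), (3.0, 0.1), (8.0, 8.0)]
clusters = online_cluster(train, threshold=1.0)
user_labels = ["normal", "normal", "anomaly"]  # one label per cluster, given by the user
tree = build_tree(extended_features(clusters), user_labels)

# New data: summarize it the same way, then let the tree label each new cluster.
new_clusters = online_cluster([(0.2, 0.1), (0.3, 0.1), (7.5, 8.5)], threshold=1.0)
predictions = [predict(tree, row) for row in extended_features(new_clusters)]
print(predictions)
```

The point of the sketch is the division of labour: the user labels a handful of cluster summaries rather than individual records, and the tree generalizes those labels to clusters formed from later traffic.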