Network Intrusion Detection using Supervised and Unsupervised Machine Learning
Lala Shahbandayeva, Ulviyya Mammadzada, Ilaha Manafova, Sevinj Jafarli, A. Adamov
2022 IEEE 16th International Conference on Application of Information and Communication Technologies (AICT), published 2022-10-12
DOI: 10.1109/AICT55583.2022.10013594 (https://doi.org/10.1109/AICT55583.2022.10013594)
Citations: 0
Abstract
Traditional intrusion detection systems can effectively detect known attacks and intrusions that match predefined signatures. This requires training the systems to detect multiple variants of the same attack patterns and constantly maintaining up-to-date databases of known attack signatures. However, as the skills of security researchers and practitioners grow, so do those of attackers. Detecting attack types that are unknown, undefined, or designed to bypass signature- and pattern-based intrusion detection systems calls for more intelligent systems, and machine learning is widely used for this purpose. While researchers and security professionals have approached this problem with various types of machine learning, our hybrid approach attempts to provide a novel way to detect attacks effectively: a set of supervised learning algorithms detects known attacks, and unsupervised learning detects unknown and zero-day attacks. Using the CSE-CIC-IDS 2018 dataset, we trained our classifiers to detect benign traffic and 14 known attacks with a selection of 23 features. Network traffic flows that are not classified with a specified level of certainty are sent to the clustering phase, where they are identified as benign or malicious. Our results indicate that the three classification algorithms used (K-Nearest Neighbors, Random Forest, and Artificial Neural Networks) classify the known attacks with F1-scores between 0.93 and 0.969, and that the clustering algorithm HDBSCAN clusters the unclassified benign and malicious traffic with unknown labels with F1-scores between 0.85 and 0.957.
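The routing step described in the abstract, where flows the classifier is not sufficiently certain about are handed to the clustering phase, can be illustrated with a minimal sketch. This is not the authors' implementation. It uses a tiny pure-Python k-nearest-neighbours vote on made-up two-dimensional points rather than the 23 features of CSE-CIC-IDS 2018, and the names `knn_predict`, `route_flows`, `TRAIN`, `FLOWS`, and the threshold of 0.9 are all assumptions chosen for illustration. The abstract does not state the certainty level that was used.

```python
import math
from collections import Counter

# Hypothetical labelled flows: (feature vector, label). Toy 2-D data only.
TRAIN = [
    ((0.0, 0.0), "benign"), ((1.0, 0.0), "benign"), ((0.0, 1.0), "benign"),
    ((10.0, 10.0), "dos"), ((9.0, 10.0), "dos"), ((10.0, 9.0), "dos"),
    # A mixed region where the classes overlap.
    ((5.0, 5.0), "benign"), ((5.0, 6.0), "dos"), ((6.0, 5.0), "dos"),
]

# Hypothetical unlabelled flows to be routed.
FLOWS = [(0.2, 0.2), (9.8, 9.8), (5.2, 5.2)]


def knn_predict(train, query, k=3):
    """Return (label, confidence) for one flow.

    Confidence is the share of the k nearest neighbours that voted
    for the winning label, so it lies in (0, 1].
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / k


def route_flows(train, flows, threshold=0.9, k=3):
    """Split flows into confidently classified ones and uncertain ones.

    Flows whose confidence falls below the (assumed) threshold are not
    labelled here; they are returned separately so that a clustering
    stage can process them.
    """
    classified, uncertain = [], []
    for flow in flows:
        label, confidence = knn_predict(train, flow, k)
        if confidence >= threshold:
            classified.append((flow, label))
        else:
            uncertain.append(flow)
    return classified, uncertain


classified, uncertain = route_flows(TRAIN, FLOWS)
print("classified:", classified)
print("sent to clustering:", uncertain)
```

In this toy data the flow at (5.2, 5.2) gets only two of three neighbour votes for its top label, so it falls below the assumed threshold and lands in `uncertain`. In the pipeline the abstract describes, such flows would then be passed to HDBSCAN; that stage is omitted here to keep the sketch dependency-free.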