Improving Network Security: An Intrusion Detection System (IDS) Dataset from Higher Learning Institutions, Mbeya University of Science and Technology (MUST), Tanzania
Daud M. Sindika, Mrindoko R. Nicholaus, Nabahani B. Hamadi
East African Journal of Information Technology. Published 2023-12-15. DOI: 10.37284/eajit.6.1.1627 (https://doi.org/10.37284/eajit.6.1.1627)
Abstract
In today's Internet-driven culture, securing computer networks in Higher Learning Institutions (HLIs) has become a major responsibility. Intrusion Detection Systems (IDS) are crucial for protecting networks from unauthorized activity and cyber threats. This paper examines the process of improving network security by creating a comprehensive IDS dataset from real HLI traffic, highlighting the importance of accurate and representative data in improving a system's ability to identify and mitigate future cyber-attacks. IDS models were built using a variety of machine learning (ML) techniques, and each model was assessed using accuracy, precision, recall, and F1-score. The dataset used for training and testing was real-world network traffic data obtained from the institution's computer network. The results showed that the developed IDS achieved high accuracy, with the Random Forest, Gradient Boosting, and XGBoost models all reaching around 93%. Precision and recall were likewise high across all algorithms. Furthermore, the study found that data quality has a substantial impact on IDS performance: proper data preparation, feature engineering, and noise removal helped improve model accuracy and reduce false positives. Although the IDS models performed well throughout validation and testing, deploying such systems in a production setting requires careful thought. The paper therefore also examines the procedures for testing and deploying the IDS models in a real-world scenario, and underlines the importance of ongoing monitoring and maintenance to keep the models effective at identifying intrusions. The research contributes to the progress of network security in HLIs. By understanding the impact of data quality on IDS performance and implementing effective deployment techniques, educational institutions can better protect their valuable assets and sensitive information from cyberattacks.
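For readers less familiar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are computed from true and predicted labels. It is not the authors' code: the function name `binary_metrics`, the label encoding (1 = intrusion, 0 = normal traffic), and the sample labels are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class.

    Assumption (not from the paper): labels are encoded so that
    `positive` marks an intrusion and any other value marks normal traffic.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # intrusion correctly flagged
        elif pred == positive:
            fp += 1  # normal traffic flagged as intrusion (false alarm)
        elif truth == positive:
            fn += 1  # intrusion that was missed
        else:
            tn += 1  # normal traffic correctly passed

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Hypothetical labels for eight flows: three intrusions, five normal.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

On these hypothetical labels there are two true positives, one false positive, one false negative, and four true negatives, so accuracy is 0.75 while precision, recall, and F1 are each about 0.667. The gap shows why accuracy alone can flatter an IDS when normal traffic dominates the data.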