{"title":"评估网络异常检测中机器学习模型的性能和挑战","authors":"Sakshi Bakhare, Dr. Sudhir W. Mohod","doi":"10.32628/ijsrset5241134","DOIUrl":null,"url":null,"abstract":"The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.","PeriodicalId":14228,"journal":{"name":"International Journal of Scientific Research in Science, Engineering and Technology","volume":"103 49","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating the Performance and Challenges of Machine Learning Models in Network Anomaly Detection\",\"authors\":\"Sakshi Bakhare, Dr. Sudhir W. Mohod\",\"doi\":\"10.32628/ijsrset5241134\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.\",\"PeriodicalId\":14228,\"journal\":{\"name\":\"International Journal of Scientific Research in Science, Engineering and Technology\",\"volume\":\"103 49\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Scientific Research in Science, Engineering and Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32628/ijsrset5241134\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Scientific Research in Science, Engineering and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32628/ijsrset5241134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Evaluating the Performance and Challenges of Machine Learning Models in Network Anomaly Detection
The application of machine learning algorithms for anomaly detection in network traffic data is examined in this study. Using a collection of network flow records that includes attributes such as IP addresses, ports, protocols, and timestamps, the study makes use of correlation heatmaps, box plots, and data visualization to identify trends in numerical characteristics. After preprocessing, which includes timestamp conversion to Unix format, three machine learning models Support Vector Machine (SVM), Gaussian Naive Bayes, and Random Forest are used for anomaly identification. The Random Forest Classifier outperforms SVM and Naive Bayes classifiers with better precision and recall for anomaly diagnosis, achieving an accuracy of 87%. Confusion matrices and classification reports are used to evaluate the models, and they show that the Random Forest Classifier performs better than the other models in identifying abnormalities in network traffic. These results provide significant value to the field of cybersecurity by highlighting the effectiveness of machine learning models specifically, the Random Forest Classifier in boosting anomaly detection capacities for network environment security.